Creating a Pixelated 3D Effect with Metal Shaders
50 min read
Lately, I've been dedicating a lot of time to studying computer graphics and working with 3D basics. This article serves as a way to consolidate and apply what I've learned. Chances are low that you'd ever need to implement something like this in a real iOS app; there are many libraries (Spline, Rive, etc.) that offer this and more. Still, I think it's a great exercise for understanding the fundamentals.
This article is filled with interactive visualizations to make it easier to understand and see what’s happening. We will be focusing more on the 3D rendering and less on the MetalKit specifics (like command queues, buffers, etc.).
Project Structure
We are going to divide our journey into several steps.
Step 1: Basic 3D Rendering
Here we are going to do a basic 3D rendering of a flower model (or basically any 3D model).
Step 2: Offscreen Rendering
This step will introduce the concept of offscreen rendering so the architecture of our project will be more clear and scalable.
Step 3: Post-Processing Pipeline
This is where we will apply the pixelation effect to the offscreen texture. Each step produces a working solution that can be run and tested independently.
At the end we will have this working animation.
Step 1: Basic 3D Rendering
UI Integration
As usual, we are going to use UIViewRepresentable to bridge between SwiftUI and Metal, providing a clean separation between the declarative UI layer and the imperative Metal rendering code.
import MetalKit
import SwiftUI
struct MetalViewRepresentable: UIViewRepresentable {
final class Coordinator: NSObject {
let renderer = Renderer()
weak var mtkView: MTKView?
}
func makeCoordinator() -> Coordinator {
Coordinator()
}
func makeUIView(context: Context) -> MTKView {
let device = context.coordinator.renderer.device
let view = MTKView(frame: .zero, device: device)
view.clearColor = MTLClearColorMake(0, 0, 0, 1)
view.colorPixelFormat = .bgra8Unorm
view.depthStencilPixelFormat = .depth32Float
view.preferredFramesPerSecond = 60
view.isPaused = false
view.enableSetNeedsDisplay = false
view.framebufferOnly = true
view.delegate = context.coordinator.renderer
context.coordinator.mtkView = view
return view
}
func updateUIView(
_ uiView: MTKView,
context: Context
) {}
}
Building a Convenient Toolbelt
Before continuing with the Renderer implementation, let's introduce a couple of conveniences that will help us with shader and asset management.
Shader Management with ShaderLibrary
ShaderLibrary is a wrapper around MTLLibrary that provides convenient shader function access using Swift's @dynamicMemberLookup feature. This eliminates the need for verbose function-lookup boilerplate at the call site.
import MetalKit
/// A wrapper around MTLLibrary that provides convenient shader function access
/// using Swift's @dynamicMemberLookup feature.
@dynamicMemberLookup
public struct ShaderLibrary {
/// The underlying Metal library
let library: MTLLibrary
public init(
library: MTLLibrary
) {
self.library = library
}
/// Retrieves a shader function by name
/// - Parameter name: The name of the shader function
/// - Returns: The Metal function
/// - Throws: An error if the function cannot be found
public func function(named name: String) throws -> MTLFunction {
let function = try library.makeFunction(
name: name,
constantValues: .init()
)
return function
}
public subscript(
dynamicMember member: String
) -> MTLFunction {
get throws {
try function(named: member)
}
}
}
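As a quick illustration of the call-site difference, here is a hypothetical helper (the loadModelVertexFunction name is mine, and it assumes the default library contains a function called modelVertex, as ours will later). Both lookups return the same function; the dynamic member version just reads like a property access.
import MetalKit

func loadModelVertexFunction() throws -> MTLFunction {
    let device = MTLCreateSystemDefaultDevice()!
    let library = ShaderLibrary(
        library: try device.makeDefaultLibrary(bundle: .main)
    )

    // Explicit lookup through the wrapper...
    let explicit = try library.function(named: "modelVertex")
    _ = explicit

    // ...or the same lookup via @dynamicMemberLookup
    return try library.modelVertex
}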
Asset Management with ObjectParser
ObjectParser is a helper type that loads an .obj file and its internal resources. It provides the mesh and submesh buffers (positions, normals, UVs), along with the vertex descriptor that matches the shaders.
For the sake of simplicity we make some assumptions about the .obj file, including that it has a single mesh and a single texture.
An .obj file contains data about vertices, normals, and UVs, and builds faces (triangles) from them. If you have ever tried to do a "Hello World" in Metal and display a triangle, then you are familiar with defining vertices and triangles.
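For a feel of the format, here is a tiny hypothetical .obj fragment (not taken from the flower model) that defines a single textured triangle:
# positions
v 0.0 0.0 0.0
v 1.0 0.0 0.0
v 0.0 1.0 0.0
# texture coordinates
vt 0.0 0.0
vt 1.0 0.0
vt 0.0 1.0
# normal
vn 0.0 0.0 1.0
# one face; each corner is position/uv/normal (1-based indices)
f 1/1/1 2/2/1 3/3/1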
MDLVertexDescriptor describes how the vertex data read from the .obj file should be laid out so it can be used by Metal. In our case we expect this data to have 3 attributes: position, normal, and UV.
MDLAsset is what we use to actually work with the .obj file. It provides a way to load the file and get the mesh and submesh buffers. The mesh buffer contains all the vertices defined in the file, without any ordering. The submesh buffer takes those vertices and, using indices, places them in order. This means there might be multiple submeshes defining different parts of a model. To get more familiar with this concept I'd recommend watching this tutorial.
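As a rough illustration (hypothetical data, not the flower model), a quad can be described by a mesh of four unique vertices and a submesh that orders them into two triangles via indices:
import simd

// The mesh stores each unique vertex once...
let positions: [SIMD3<Float>] = [
    [0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]
]

// ...and a submesh turns them into triangles via an index buffer,
// so shared vertices don't have to be duplicated.
let indices: [UInt16] = [
    0, 1, 2,   // first triangle
    2, 1, 3    // second triangle
]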
Here is the complete implementation of ObjectParser:
import ModelIO
import MetalKit
public struct ObjectParser {
// mesh contains all the vertices, unordered
public let mesh: MTKMesh
// submeshes take the vertices and, using indices, place them in order
public let submeshes: [MTKSubmesh]
public var textures: [MTLTexture]
public let mdlVertexDescriptor: MDLVertexDescriptor = {
// Position (0), Normal (1), Texcoord (2)
let mdlVertexDescriptor = MDLVertexDescriptor()
mdlVertexDescriptor.attributes[0] = MDLVertexAttribute(
name: MDLVertexAttributePosition,
format: .float3,
offset: 0,
bufferIndex: 0
)
mdlVertexDescriptor.attributes[1] = MDLVertexAttribute(
name: MDLVertexAttributeNormal,
format: .float3,
offset: 12,
bufferIndex: 0
)
mdlVertexDescriptor.attributes[2] = MDLVertexAttribute(
name: MDLVertexAttributeTextureCoordinate,
format: .float2,
offset: 24,
bufferIndex: 0
)
mdlVertexDescriptor.layouts[0] = MDLVertexBufferLayout(stride: 32)
return mdlVertexDescriptor
}()
// Loads the asset, builds the MTKMesh, and resolves base-color textures from the material
public init(
modelURL: URL,
device: MTLDevice
) {
let allocator = MTKMeshBufferAllocator(device: device)
let asset = MDLAsset(
url: modelURL,
vertexDescriptor: mdlVertexDescriptor,
bufferAllocator: allocator
)
// Grab the first mesh
let mdlMesh = asset.childObjects(of: MDLMesh.self).first as! MDLMesh
// Generate normals if the model doesn't include them
if mdlMesh.vertexAttributeData(forAttributeNamed: MDLVertexAttributeNormal, as: .float3) == nil {
mdlMesh.addNormals(withAttributeNamed: MDLVertexAttributeNormal, creaseThreshold: 0.0)
}
// Build MTKMesh
let mesh = try! MTKMesh(mesh: mdlMesh, device: device)
self.mesh = mesh
self.submeshes = mesh.submeshes
let keys: [MDLMaterialSemantic] = [.baseColor]
let textureLoader = MTKTextureLoader(device: device)
var _textures: [any MTLTexture] = []
mdlMesh.submeshes?.forEach { submesh in
if
let mdlSubmesh = submesh as? MDLSubmesh,
let material = mdlSubmesh.material
{
for key in keys {
if let prop = material.property(with: key) {
// If it’s a texture sampler, use its URL
if
prop.type == .string,
let name = prop.stringValue
{
// Resolve relative to the OBJ’s folder
let texURL = modelURL.deletingLastPathComponent().appendingPathComponent(name)
if let tex = try? textureLoader.newTexture(URL: texURL, options: [
.SRGB: false as NSNumber,
.origin: MTKTextureLoader.Origin.bottomLeft
]) {
_textures.append(tex)
break
}
} else if
prop.type == .URL,
let url = prop.urlValue
{
// If MTL references a full URL
if let tex = try? textureLoader.newTexture(URL: url, options: [
.SRGB: false as NSNumber,
.origin: MTKTextureLoader.Origin.bottomLeft
]) {
_textures.append(tex)
break
}
} else if
prop.type == .texture,
let mdlTex = prop.textureSamplerValue?.texture
{
// Embedded MDLTexture
if let tex = try? textureLoader.newTexture(texture: mdlTex, options: [
.SRGB: false as NSNumber,
.origin: MTKTextureLoader.Origin.bottomLeft
]) {
_textures.append(tex)
break
}
}
}
}
}
}
self.textures = _textures
}
}
Affine Transformations with AffineTransform
This type encapsulates the essential 3D transformations: translation, rotation, and scaling. Here I’ve prepared a little visualization to play with the transformations interactively.
We handle rotation using quaternions. The API lets us set the axis and angle, and it handles the rest. Quaternions are a powerful tool widely used in 3D graphics. For more information, I suggest visiting this tutorial, which explains the concept and compares it with another approach, Euler angles. This video from 3Blue1Brown explaining quaternions is also a great resource.
The final matrix is computed as Translation * Rotation * Scale. This order ensures that scaling happens first, then rotation, then translation. Experiment with other orders to see how it affects the result.
import simd
/// A struct that encapsulates the essential 3D transformations: translation, rotation, and scaling.
struct AffineTransform {
/// Translation vector in 3D space
var translation: SIMD3<Float> = .zero
/// Rotation quaternion (angle: 0.0, axis: zero vector by default)
var rotation: simd_quatf = simd_quatf(angle: 0.0, axis: SIMD3<Float>(0,0,0))
/// Scale factors for each axis (uniform scale of 1.0 by default)
var scale: SIMD3<Float> = SIMD3<Float>(repeating: 1)
/// Computed 4x4 model matrix combining all transformations
///
/// The matrix is built by multiplying translation, rotation, and scale matrices
/// in the order: Translation * Rotation * Scale
var modelMatrix: float4x4 {
Self.makeTranslate(translation) *
float4x4(rotation) *
Self.makeScale(scale)
}
/// Creates a 4x4 translation matrix from a 3D vector
/// - Parameter vector: The translation vector
/// - Returns: A 4x4 translation matrix
static func makeTranslate(_ vector: SIMD3<Float>) -> float4x4 {
let baseX: SIMD4<Float> = [1, 0, 0, 0]
let baseY: SIMD4<Float> = [0, 1, 0, 0]
let baseZ: SIMD4<Float> = [0, 0, 1, 0]
let baseW: SIMD4<Float> = [vector.x, vector.y, vector.z, 1]
return float4x4(baseX, baseY, baseZ, baseW)
}
/// Creates a 4x4 scale matrix from a 3D vector
/// - Parameter vector: The scale factors for each axis
/// - Returns: A 4x4 scale matrix
static func makeScale(_ vector: SIMD3<Float>) -> float4x4 {
float4x4(diagonal: SIMD4<Float>(vector.x, vector.y, vector.z, 1.0))
}
}
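To see why the order matters, here is a quick sanity check using the helpers above (a sketch; the numbers are arbitrary):
import simd

let translate = AffineTransform.makeTranslate(SIMD3<Float>(10, 0, 0))
let scale = AffineTransform.makeScale(SIMD3<Float>(repeating: 2))
let point = SIMD4<Float>(1, 0, 0, 1)

// Translation * Scale: the point is scaled first, then shifted by 10
let scaledThenMoved = translate * scale * point   // (12, 0, 0, 1)

// Scale * Translation: the translation itself gets scaled too
let movedThenScaled = scale * translate * point   // (22, 0, 0, 1)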
Renderer Implementation
Now we are ready to display the 3D object.
For the Renderer we will start from the basics. We need to initialize the device, command queue, and object parser.
Keeping everything inside the initializer might not be the best approach, since each of these components has relatively heavy initialization logic of its own. But for the sake of simplicity we will keep it this way.
Notice that we’re using a predefined resource URL for the model. The specific model I used is available here.
final class Renderer: NSObject {
let library: ShaderLibrary
let device: any MTLDevice = MTLCreateSystemDefaultDevice()!
let commandQueue: any MTLCommandQueue
let mdlObject: ObjectParser
private var instanceTransforms: [AffineTransform] = [
AffineTransform(
translation: SIMD3<Float>(0.0, -10.0, 0.0),
scale: SIMD3<Float>(repeating: 0.6)
)
]
private let instanceBuffer: MTLBuffer
init(
modelURL: URL = Bundle.main.url(
forResource: "12973_anemone_flower_v1_l2",
withExtension: "obj"
)!
) {
library = .init(
library: try! device.makeDefaultLibrary(bundle: .main)
)
commandQueue = device.makeCommandQueue()!
mdlObject = ObjectParser(
modelURL: modelURL,
device: device
)
instanceBuffer = device.makeBuffer(
length: MemoryLayout<float4x4>.stride * instanceTransforms.count,
options: []
)!
super.init()
}
}
Now let's add a basic MTKViewDelegate implementation. In this initial setup, we use the view's render pass as the render target, so the model is rendered directly to the screen.
extension Renderer: MTKViewDelegate {
func draw(in view: MTKView) {
guard
let drawable = view.currentDrawable,
let commandQueue = device.makeCommandQueue(),
let commandBuffer = commandQueue.makeCommandBuffer()
else {
return
}
if
let sceneRenderPassDescriptor = view.currentRenderPassDescriptor,
let renderEncoder = commandBuffer.makeRenderCommandEncoder(
descriptor: sceneRenderPassDescriptor
)
{
do {
try drawModel(in: view, renderEncoder: renderEncoder)
} catch {
fatalError(error.localizedDescription)
}
renderEncoder.endEncoding()
}
commandBuffer.present(drawable)
commandBuffer.commit()
}
func mtkView(
_ view: MTKView,
drawableSizeWillChange size: CGSize
) {}
}
To draw the model we need to define a pipeline state that holds information about the required shaders and data formats. To describe the vertex data we reuse the MDLVertexDescriptor from the object parser.
extension Renderer {
func drawModel(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder
) throws {
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexDescriptor = MTKMetalVertexDescriptorFromModelIO(mdlObject.mdlVertexDescriptor)
pipelineDescriptor.vertexFunction = try! library.modelVertex
pipelineDescriptor.fragmentFunction = try! library.modelFragment
pipelineDescriptor.colorAttachments[0].pixelFormat = view.colorPixelFormat
pipelineDescriptor.depthAttachmentPixelFormat = .depth32Float
let renderPipelineState = try device.makeRenderPipelineState(descriptor: pipelineDescriptor)
renderEncoder.setRenderPipelineState(renderPipelineState)
let depthStencilDescriptor = MTLDepthStencilDescriptor()
depthStencilDescriptor.depthCompareFunction = .less
depthStencilDescriptor.isDepthWriteEnabled = true
if
let depthStencilState = device.makeDepthStencilState(
descriptor: depthStencilDescriptor
)
{
renderEncoder.setDepthStencilState(depthStencilState)
}
renderEncoder.setVertexBuffer(
mdlObject.mesh.vertexBuffers[0].buffer,
offset: mdlObject.mesh.vertexBuffers[0].offset,
index: 0
)
}
}
Next, we configure the matrix that projects 3D coordinates onto the 2D screen. This involves creating matrices that transform models through various coordinate spaces and ultimately clip them to fit the visible area. We’ll explore this process in detail, including an interactive demonstration, later in the article.
func drawModel(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder
) throws {
...
let aspect = Float(view.drawableSize.width / max(1, view.drawableSize.height))
let perspectiveMatrix = AffineTransform.perspective(
fovyRadians: .pi / 4,
aspect: aspect,
near: 0.1,
far: 1000
)
let viewMatrix = AffineTransform.lookAt(
eye: SIMD3<Float>(0.0, 0.0, 40.0),
center: SIMD3<Float>(0.0, 0.0, 0.0),
up: SIMD3<Float>(0.0, 1.0, 0.0)
)
var uniforms: SceneUniforms = .init(
projection: perspectiveMatrix * viewMatrix
)
renderEncoder.setVertexBytes(
&uniforms,
length: MemoryLayout<SceneUniforms>.stride,
index: 1
)
let ptr = instanceBuffer
.contents()
.bindMemory(
to: float4x4.self,
capacity: instanceTransforms.count
)
for (i, t) in instanceTransforms.enumerated() {
ptr[i] = t.modelMatrix
}
renderEncoder.setVertexBuffer(
instanceBuffer,
offset: 0,
index: 2
)
}
Also implement the necessary 3D math primitives required for rendering.
struct SceneUniforms {
var projection: simd_float4x4
}
extension AffineTransform {
static func perspective(
fovyRadians: Float,
aspect: Float,
near: Float,
far: Float
) -> float4x4 {
let yScale = 1 / tan(fovyRadians * 0.5)
let xScale = yScale / aspect
let zRange = far - near
let zScale = far / zRange
let wz = -near * zScale
return float4x4(
SIMD4<Float>( xScale, 0, 0, 0 ),
SIMD4<Float>( 0, yScale, 0, 0 ),
SIMD4<Float>( 0, 0, zScale, 1 ),
SIMD4<Float>( 0, 0, wz, 0 )
)
}
static func lookAt(
eye: SIMD3<Float>,
center: SIMD3<Float>,
up: SIMD3<Float>
) -> float4x4 {
let zAxis = normalize(center - eye)
let xAxis = normalize(cross(up, zAxis))
let yAxis = cross(zAxis, xAxis)
let translation = SIMD3<Float>(
-dot(xAxis, eye),
-dot(yAxis, eye),
-dot(zAxis, eye)
)
return float4x4(
SIMD4<Float>(xAxis.x, yAxis.x, zAxis.x, 0),
SIMD4<Float>(xAxis.y, yAxis.y, zAxis.y, 0),
SIMD4<Float>(xAxis.z, yAxis.z, zAxis.z, 0),
SIMD4<Float>(translation.x, translation.y, translation.z, 1)
)
}
}
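A quick way to build intuition for these two helpers is to check a couple of known properties (a small sketch using the functions above): the view matrix should move the camera position to the origin of camera space, and the projection should map the near and far planes to depths 0 and 1 after the perspective divide.
import simd

let eye = SIMD3<Float>(0, 0, 40)
let view = AffineTransform.lookAt(eye: eye, center: .zero, up: SIMD3<Float>(0, 1, 0))
// The camera position ends up at the origin of camera space
let cameraSpaceEye = view * SIMD4<Float>(eye.x, eye.y, eye.z, 1)   // ≈ (0, 0, 0, 1)

let projection = AffineTransform.perspective(
    fovyRadians: .pi / 4,
    aspect: 1,
    near: 0.1,
    far: 1000
)
// Points on the near and far planes land at depth 0 and 1 after dividing by w
let nearClip = projection * SIMD4<Float>(0, 0, 0.1, 1)
let farClip = projection * SIMD4<Float>(0, 0, 1000, 1)
let nearDepth = nearClip.z / nearClip.w   // ≈ 0
let farDepth = farClip.z / farClip.w      // ≈ 1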
One last step before we jump into the shader code: we need to set up texture sampling and configure the model drawing process.
func drawModel(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder
) throws {
...
let sampDesc = MTLSamplerDescriptor()
sampDesc.minFilter = .linear
sampDesc.magFilter = .linear
sampDesc.sAddressMode = .repeat
sampDesc.tAddressMode = .repeat
let sampler = device.makeSamplerState(descriptor: sampDesc)
renderEncoder.setFragmentSamplerState(sampler, index: 0)
for (submesh, texture) in zip(mdlObject.submeshes, mdlObject.textures) {
renderEncoder.setFragmentTexture(texture, index: 0)
renderEncoder.drawIndexedPrimitives(
type: submesh.primitiveType,
indexCount: submesh.indexCount,
indexType: submesh.indexType,
indexBuffer: submesh.indexBuffer.buffer,
indexBufferOffset: submesh.indexBuffer.offset,
instanceCount: instanceTransforms.count
)
}
}
In the shader code we need to mirror the structure of the vertex data and the uniforms. The fragment function simply samples the texture and returns the color. The vertex function builds the final clip-space position by multiplying the vertex position by the model matrix and then by the combined view-projection matrix.
#include <metal_stdlib>
using namespace metal;
struct SceneUniforms {
float4x4 projection;
};
struct VertexIn {
float3 position [[attribute(0)]];
float3 normal [[attribute(1)]];
float2 uv [[attribute(2)]];
};
struct VertexOut {
float4 position [[position]];
float3 worldPosition;
float3 normal;
float2 uv;
};
vertex VertexOut modelVertex(
VertexIn in [[stage_in]],
constant SceneUniforms &u [[buffer(1)]],
constant float4x4 *instanceModels [[buffer(2)]],
uint instanceID [[instance_id]]
) {
// Get the model matrix for this instance
float4x4 model = instanceModels[instanceID];
VertexOut out;
float4 worldPos = model * float4(in.position, 1.0);
out.worldPosition = worldPos.xyz;
// Transform vertex position from model space to clip space
// Order: Model -> World -> View -> Projection
out.position = u.projection * worldPos;
// Transform the normal vector by the model matrix
out.normal = (model * float4(in.normal, 0.0)).xyz;
// Pass through UV coordinates unchanged
out.uv = in.uv;
return out;
}
fragment float4 modelFragment(
VertexOut in [[stage_in]],
texture2d<float> tex [[texture(0)]],
sampler samp [[sampler(0)]]
) {
const float2 uv = in.uv;
return tex.sample(samp, uv);
}
Voila! We have a working renderer that displays the 3D object.

How the 3D math works
The Model-View-Projection (MVP) matrix is the classic approach in 3D math for transforming vertices through three coordinate spaces, taking them from 3D model coordinates to 2D screen coordinates. There are other approaches, but I haven't studied them yet, so we will stick with this one.
When we talk about a Space, we mean the coordinate system we are working in.
1. Model Matrix
The Model matrix transforms vertices from Model Space (object local space) to World Space (where everything exists in the scene). We've defined the value inside the instanceTransforms array to allow each instance of the object to have a different transformation.
AffineTransform(
translation: SIMD3<Float>(0.0, -10.0, 0.0),
scale: SIMD3<Float>(repeating: 0.6)
)
Using this matrix we can position, rotate, and scale the object. You can return to the affine transformations visualization to play with it.
Basically, by defining the model matrix we describe how the object should be placed in a world where other objects might also live. Strictly speaking, we're not moving the object's vertices themselves; we're mapping its local coordinate space into world space.
2. View Matrix
The View matrix transforms from World Space to Camera Space. It’s like moving the camera around the world.
let viewMatrix = AffineTransform.lookAt(
eye: SIMD3<Float>(0.0, 0.0, 40.0), // Camera position
center: SIMD3<Float>(0.0, 0.0, 0.0), // What we're looking at
up: SIMD3<Float>(0.0, 1.0, 0.0) // Up direction
)
By analogy to the model matrix, the view matrix is used to position the camera (I also like to think of it as the observer) in the world. And since there’s only one observer, all objects will be seen (if they are within the field of view) from a single point.
3. Projection Matrix
The Projection matrix transforms from Camera Space to Clip Space and maps the 3D scene to the 2D screen. This creates the perspective effect where distant objects appear smaller.
let perspectiveMatrix = AffineTransform.perspective(
fovyRadians: .pi / 4, // 45° field of view
aspect: aspect, // Screen aspect ratio
near: 0.1, // Near clipping plane
far: 1000 // Far clipping plane
)
This creates the “frustum” - a pyramid-shaped viewing volume. Everything inside gets rendered, everything outside gets clipped away.
Why This Works
Each transformation serves a specific purpose in the 3D pipeline:
- Model: “Where is this object in the world?”
- View: “Where is the camera looking?”
- Projection: “How should perspective work?”
By combining them into a single matrix, we can transform any vertex with just one matrix multiplication in the shader:
out.position = u.projection * model * float4(in.position, 1.0);
var uniforms: SceneUniforms = .init(
projection: perspectiveMatrix * viewMatrix
)
Same as with affine transformations, the order is crucial.
- First: modelMatrix (model space → world space)
- Then: viewMatrix (world space → camera space)
- Finally: perspectiveMatrix (camera space → clip space)
This is exactly what happens in our modelVertex shader. The vertex starts in model space, gets transformed through world space, camera space, and finally clip space.
Here is a visualization to help better understand the camera space and perspective transformation.
For a deeper dive into the mathematical principles, check out the OpenGL Tutorial on Matrices, which explains these concepts in detail.
Step 1.1: Lighting & Diffuse
Now we are going to add support for a lighting model to make the image look more lively and natural. We’ll use a point light as the basis for our implementation. A point light is a light source that emits light in all directions from a single point in space. We can control its position, color, intensity, and attenuation (how quickly the light fades with distance).
First of all, we need to define how the surface reflects light. There are many different models, but we'll use the Lambertian model for this example. Lambertian reflection is a type of diffuse reflection where the amount of reflected light is proportional to the cosine of the angle between the surface normal and the light direction. Small angles between the surface normal and the light direction result in a brighter reflection, and vice versa.
The light direction depends on the lighting model. Since we chose the point light model, the light direction is the direction from the light source to the point on the surface (in the shader we will use the opposite vector, pointing towards the light).
Try moving the light around to see how the diffuse lighting changes. Notice how the sphere appears brighter when the light is closer and dimmer when it’s farther away.
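To make the cosine relationship concrete, here is the same computation our fragment shader will do later, traced on the CPU with arbitrarily chosen vectors:
import simd

let normal = normalize(SIMD3<Float>(0, 1, 0))        // surface facing straight up
let surfacePoint = SIMD3<Float>(0, 0, 0)
let lightPosition = SIMD3<Float>(0, 10, 10)

// Direction from the surface point toward the light source
let lightDirection = normalize(lightPosition - surfacePoint)

// Lambertian term: the cosine of the angle between normal and light direction,
// clamped so surfaces facing away from the light receive no diffuse light
let nDotL = max(dot(normal, lightDirection), 0)      // ≈ 0.707 for this 45° angle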
Create a new struct to store the point light properties.
struct PointLight {
var position: SIMD3<Float>
var color: SIMD3<Float>
var intensity: Float
var attenuation: Float
}
Instantiate the point light with some initial values. This instance will be sent to the fragment shader as a uniform.
func drawModel(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder
) throws {
...
var pointLight = PointLight(
position: SIMD3<Float>(0.0, 0.0, 10.0),
color: SIMD3<Float>(1.0, 1.0, 1.0),
intensity: 16.0,
attenuation: 0.1
)
let sampDesc = MTLSamplerDescriptor()
sampDesc.minFilter = .linear
sampDesc.magFilter = .linear
sampDesc.sAddressMode = .repeat
sampDesc.tAddressMode = .repeat
let sampler = device.makeSamplerState(descriptor: sampDesc)
renderEncoder.setFragmentSamplerState(sampler, index: 0)
renderEncoder.setFragmentBytes(
&pointLight,
length: MemoryLayout<PointLight>.stride,
index: 1
)
for (submesh, texture) in zip(mdlObject.submeshes, mdlObject.textures) {
renderEncoder.setFragmentTexture(texture, index: 0)
renderEncoder.drawIndexedPrimitives(
type: submesh.primitiveType,
indexCount: submesh.indexCount,
indexType: submesh.indexType,
indexBuffer: submesh.indexBuffer.buffer,
indexBufferOffset: submesh.indexBuffer.offset,
instanceCount: instanceTransforms.count
)
}
}
As described above, we calculate the diffuse lighting in the fragment shader. Here we find the light direction and compute the diffuse term from it. Note that the light vector points towards the light source, not the other way around. The attenuation parameter controls how fast the light intensity decreases with distance; in physically-based lighting it's derived from the inverse square law. You can read more about it here and here.
struct PointLight {
float3 position;
float3 color;
float intensity;
float attenuation;
};
fragment float4 modelFragment(
VertexOut in [[stage_in]],
constant PointLight &light [[buffer(1)]],
texture2d<float> tex [[texture(0)]],
sampler samp [[sampler(0)]]
) {
// Sample the base texture color
float3 color = tex.sample(samp, in.uv).rgb;
// Calculate lighting
float3 normal = normalize(in.normal);
float3 lightDir = light.position - in.worldPosition;
float distance = length(lightDir);
lightDir = normalize(lightDir);
// Calculate attenuation (inverse square law with minimum distance)
float attenuation = 1.0 / (1.0 + light.attenuation * distance * distance);
// Calculate diffuse lighting (Lambertian)
float NdotL = max(dot(normal, lightDir), 0.0);
float3 diffuse = light.color * light.intensity * NdotL * attenuation;
// Combine albedo with lighting
float3 finalColor = color * diffuse;
return float4(finalColor, 1.0);
}
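To get a feel for the attenuation curve, here is the same formula evaluated on the CPU for a few distances (using the attenuation value of 0.1 we set above):
let attenuationFactor: Float = 0.1
let distances: [Float] = [1, 5, 10, 20]

for distance in distances {
    // Same formula as the shader: 1 / (1 + k * d^2)
    let attenuation = 1.0 / (1.0 + attenuationFactor * distance * distance)
    print(distance, attenuation)
    // 1.0  -> ~0.91
    // 5.0  -> ~0.29
    // 10.0 -> ~0.09
    // 20.0 -> ~0.02
}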
And here is the result. If it looks a bit too dark, try changing the light position and intensity.

Step 2: Offscreen Rendering
Offscreen rendering means that the rendered scene is not displayed directly on the screen but is stored in an intermediate texture. This way, it can be reused in several different stages of processing, and at the end everything can be combined into a complete scene.
In our case it might be a bit of an overkill, since we only have one effect to apply, but I still think it's a good concept to know: the pipeline we build here can later be expanded to include more effects and features.
Creating an Offscreen Texture
First, we need to instantiate a texture that will serve as our render target for the 3D scene. Here we also need a separate depth texture to store depth values. Having a dedicated depth texture is essential for proper depth testing during offscreen rendering, since our drawing operations won’t go directly to the screen.
Because we're rendering offscreen instead of directly to the screen (i.e. we don't use currentRenderPassDescriptor from MTKView), we need to create a separate render pass that uses the texture we just created as the render target.
func draw(in view: MTKView) {
guard
let drawable = view.currentDrawable,
let commandQueue = device.makeCommandQueue(),
let commandBuffer = commandQueue.makeCommandBuffer()
else {
return
}
let width = max(1, Int(view.drawableSize.width))
let height = max(1, Int(view.drawableSize.height))
// ---- START: MODEL ----
let modelTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
pixelFormat: view.colorPixelFormat,
width: width,
height: height,
mipmapped: false
)
modelTextureDescriptor.usage = [.renderTarget, .shaderRead]
modelTextureDescriptor.storageMode = .private
modelTextureDescriptor.textureType = .type2D
let modelTexture = device.makeTexture(descriptor: modelTextureDescriptor)
let modelDepthDescriptor = MTLTextureDescriptor.texture2DDescriptor(
pixelFormat: .depth32Float,
width: width,
height: height,
mipmapped: false
)
modelDepthDescriptor.usage = [.renderTarget]
modelDepthDescriptor.storageMode = .private
let modelDepthTexture = device.makeTexture(descriptor: modelDepthDescriptor)
let offscreenPassDescriptor = MTLRenderPassDescriptor()
offscreenPassDescriptor.colorAttachments[0].texture = modelTexture
offscreenPassDescriptor.colorAttachments[0].loadAction = .clear
offscreenPassDescriptor.colorAttachments[0].storeAction = .store
offscreenPassDescriptor.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 1)
offscreenPassDescriptor.depthAttachment.texture = modelDepthTexture
offscreenPassDescriptor.depthAttachment.loadAction = .clear
offscreenPassDescriptor.depthAttachment.storeAction = .dontCare
offscreenPassDescriptor.depthAttachment.clearDepth = 1.0
if
let renderEncoder = commandBuffer.makeRenderCommandEncoder(
descriptor: offscreenPassDescriptor
)
{
do {
try drawModel(in: view, renderEncoder: renderEncoder)
} catch {
fatalError(error.localizedDescription)
}
renderEncoder.endEncoding()
}
// ---- END: MODEL ----
}
Displaying the Result
Next, we need to add a new render pass to display the result on the screen. Here we use the modelTexture we created earlier as the source texture, and the view's currentRenderPassDescriptor as the render target.
This pass is called a "blit" because all it does is take the texture and draw it to the screen. If there were other textures to display, we would need more complex logic to handle and compose them.
func draw(in view: MTKView) {
...
// ---- END: MODEL ----
// ---- START: SCENE BLIT ----
if
let sceneRenderPassDescriptor = view.currentRenderPassDescriptor,
let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: sceneRenderPassDescriptor),
let texture = modelTexture
{
do {
try drawBlit(
in: view,
renderEncoder: renderEncoder,
sceneTexture: texture
)
} catch {
fatalError(error.localizedDescription)
}
renderEncoder.endEncoding()
}
// ---- END: SCENE BLIT ----
commandBuffer.present(drawable)
commandBuffer.commit()
}
For this step, we use a dedicated pair of shader functions. We don’t define a vertex descriptor here because all necessary data will be provided by the vertex shader itself.
func drawBlit(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder,
sceneTexture: any MTLTexture
) throws {
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = try! library.blitVertex
pipelineDescriptor.fragmentFunction = try! library.blitFragment
pipelineDescriptor.colorAttachments[0].pixelFormat = view.colorPixelFormat
pipelineDescriptor.depthAttachmentPixelFormat = view.depthStencilPixelFormat
let pipelineState = try device.makeRenderPipelineState(descriptor: pipelineDescriptor)
renderEncoder.setRenderPipelineState(pipelineState)
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.minFilter = .linear
samplerDescriptor.magFilter = .linear
samplerDescriptor.sAddressMode = .clampToEdge
samplerDescriptor.tAddressMode = .clampToEdge
let sampler = device.makeSamplerState(descriptor: samplerDescriptor)!
renderEncoder.setFragmentTexture(sceneTexture, index: 0)
renderEncoder.setFragmentSamplerState(sampler, index: 0)
renderEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
}
To display the texture on the screen, we use the fullscreen triangle technique. The vertex positions and UVs are hard-coded constants, which is why we don't need to pass them as parameters; the vertex shader just outputs them directly.
This way we avoid the overhead of managing a vertex buffer and the complexity of drawing a quad. You can read more about it, and a comparison with the fullscreen quad technique, in this article.
struct BlitVertexOut {
float4 position [[position]];
float2 uv;
};
vertex BlitVertexOut blitVertex(
uint vid [[vertex_id]]
) {
float2 pos[3] = { float2(-1, -1), float2(3, -1), float2(-1, 3) };
float2 uv[3] = { float2(0, 0), float2(2, 0), float2(0, 2) };
BlitVertexOut out;
out.position = float4(pos[vid], 0, 1);
out.uv = uv[vid];
return out;
}
fragment float4 blitFragment(
BlitVertexOut in [[stage_in]],
texture2d<float> src [[texture(0)]],
sampler samp [[sampler(0)]]
) {
float2 uv = float2(in.uv.x, 1.0 - in.uv.y);
return float4(src.sample(samp, uv).rgb, 1.0);
}
Here you should see no difference between the two approaches.
Step 3: Post-Processing Pipeline
Now comes the fun part! We’re going to take our rendered 3D scene and apply a pixelation effect to it.
The idea is to create a third rendering pass that sits between our 3D scene and the final display. This pass takes our rendered texture as input, applies the pixelation effect, and outputs a new texture.
Setting Up the Post-Processing Pass
We need to create another offscreen texture to hold our post-processed result:
func draw(in view: MTKView) {
...
// ---- END: MODEL ----
// ---- START: POST-PROCESS ----
let postProcessTextureDescriptor = MTLTextureDescriptor.texture2DDescriptor(
pixelFormat: view.colorPixelFormat,
width: width,
height: height,
mipmapped: false
)
postProcessTextureDescriptor.usage = [.renderTarget, .shaderRead]
postProcessTextureDescriptor.storageMode = .private
let postProcessTexture = device.makeTexture(descriptor: postProcessTextureDescriptor)
let postProcessPassDescriptor = MTLRenderPassDescriptor()
postProcessPassDescriptor.colorAttachments[0].texture = postProcessTexture
postProcessPassDescriptor.colorAttachments[0].loadAction = .clear
postProcessPassDescriptor.colorAttachments[0].storeAction = .store
postProcessPassDescriptor.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 1)
if
let renderEncoder = commandBuffer.makeRenderCommandEncoder(
descriptor: postProcessPassDescriptor
),
let texture = modelTexture
{
do {
try drawQuants(
in: view,
renderEncoder: renderEncoder,
texture: texture
)
} catch {
fatalError(error.localizedDescription)
}
renderEncoder.endEncoding()
}
// ---- END: POST-PROCESS ----
// ---- START: SCENE BLIT ----
if
let sceneRenderPassDescriptor = view.currentRenderPassDescriptor,
let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: sceneRenderPassDescriptor),
let texture = postProcessTexture // replace modelTexture with postProcessTexture
{ ... }
...
}
Notice how we're now using postProcessTexture instead of modelTexture in the final blit operation. This means we're displaying the post-processed result, not the original 3D scene.
The Pixelation Effect
The drawQuants function is where the pixelation magic happens. It takes our rendered 3D scene and applies a quantization effect to create those chunky, blocky pixels:
extension Renderer {
...
func drawQuants(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder,
texture: any MTLTexture
) throws {
struct PPUniforms {
var viewportSize: SIMD2<Float>
var pixelSize: Float
var lineThickness: Float
var gridColor: SIMD3<Float>
var gridAlpha: Float
}
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.minFilter = .nearest
samplerDescriptor.magFilter = .nearest
samplerDescriptor.sAddressMode = .clampToEdge
samplerDescriptor.tAddressMode = .clampToEdge
let sampler = device.makeSamplerState(descriptor: samplerDescriptor)
let pipelineDescriptor = MTLRenderPipelineDescriptor()
pipelineDescriptor.vertexFunction = try! library.quantVertex
pipelineDescriptor.fragmentFunction = try! library.quantFragment
pipelineDescriptor.colorAttachments[0].pixelFormat = view.colorPixelFormat
let renderPipelineState = try device.makeRenderPipelineState(descriptor: pipelineDescriptor)
let width = max(1, Int(view.drawableSize.width))
let height = max(1, Int(view.drawableSize.height))
var ppUniforms = PPUniforms(
viewportSize: SIMD2<Float>(Float(width), Float(height)),
pixelSize: 12.0, // size of each pixel block in screen pixels
lineThickness: 1.0, // grid line thickness in pixels
gridColor: SIMD3<Float>(0.1, 0.1, 0.1), // dark grid
gridAlpha: 0.35 // grid opacity
)
renderEncoder.setRenderPipelineState(renderPipelineState)
renderEncoder.setFragmentTexture(texture, index: 0)
renderEncoder.setFragmentSamplerState(sampler, index: 0)
renderEncoder.setFragmentBytes(&ppUniforms, length: MemoryLayout<PPUniforms>.stride, index: 0)
renderEncoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
}
}
Key Parameters
The PPUniforms struct controls the pixelation effect:
- pixelSize: How big each pixel block should be (in screen pixels)
- lineThickness: How thick the grid lines should be
- gridColor: The color of the grid lines (dark gray in this case)
- gridAlpha: How opaque the grid lines should be
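For a sense of what pixelSize means in practice, here is a rough block-count calculation for a hypothetical full-screen drawable (the drawable size is just an example):
let viewportSize = SIMD2<Float>(1179, 2556)     // hypothetical iPhone drawable size in pixels
let pixelSize: Float = 12

let blocksAcross = viewportSize.x / pixelSize   // ≈ 98 blocks horizontally
let blocksDown = viewportSize.y / pixelSize     // ≈ 213 blocks vertically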
The Shader Code
The pixelation effect is implemented in the fragment shader. Here’s how it works:
struct PostProcessUniforms {
float2 viewportSize; // in pixels
float pixelSize; // pixelation block size in pixels
float lineThickness; // grid line thickness in pixels
float3 gridColor; // RGB for grid
float gridAlpha; // alpha for grid overlay
};
struct PostProcessVertexOut {
float4 position [[position]];
float2 uv;
};
// Fullscreen triangle vertex shader
vertex PostProcessVertexOut quantVertex(
uint vid [[vertex_id]]
) {
PostProcessVertexOut out;
float2 pos[3] = {
float2(-1.0, -1.0),
float2( 3.0, -1.0),
float2(-1.0, 3.0)
};
float2 uv[3] = {
float2(0.0, 0.0),
float2(2.0, 0.0),
float2(0.0, 2.0)
};
out.position = float4(pos[vid], 0.0, 1.0);
out.uv = uv[vid];
return out;
}
fragment float4 quantFragment(
PostProcessVertexOut in [[stage_in]],
constant PostProcessUniforms &u [[buffer(0)]],
texture2d<float> colorTex [[texture(0)]],
sampler samp [[sampler(0)]]
) {
// Original Texture
float2 texSize = u.viewportSize;
float2 uv = float2(in.uv.x, 1.0 - in.uv.y);
// Convert UV to pixel coordinates
float2 px = uv * texSize;
// Snap to grid
float2 block = floor(px / u.pixelSize) * u.pixelSize + 0.5 * u.pixelSize;
// Back to normalized UV coordinates
float2 qUV = block / texSize;
// Sample the scene color at the block center
float3 base = colorTex.sample(samp, qUV).rgb;
// Grid overlay: draw lines where we are close to the block edges
float2 modv = fmod(px, u.pixelSize);
float2 distToEdge = min(modv, u.pixelSize - modv);
float edgeDist = min(distToEdge.x, distToEdge.y);
float lineMask = smoothstep(u.lineThickness + 0.6, u.lineThickness, edgeDist);
float3 gridRGB = u.gridColor;
float3 colorWithGrid = mix(base, gridRGB, lineMask * u.gridAlpha);
return float4(colorWithGrid, 1.0);
}
Yay! Now we have a Minecraft-style pixelated 3D flower. Drag the slider to compare the original and the pixelated version.


How the Pixelation Algorithm Works
The magic happens in the fragment shader. Here’s the step-by-step process:
- Convert UV to pixels: float2 px = uv * texSize converts normalized UV coordinates to actual pixel coordinates
- Snap to grid: float2 block = floor(px / u.pixelSize) * u.pixelSize + 0.5 * u.pixelSize creates a grid where each block is pixelSize pixels wide
- Sample at block center: float2 qUV = block / texSize converts back to UV coordinates so we sample the texture at the center of each block
- Add grid lines: the grid overlay code calculates how close each pixel is to the edge of its block and blends in grid lines using smoothstep for smooth edges
To build the grid-line mask we first take the pixel coordinates modulo the pixel size to find:
- the pixelated block that contains the pixel we are processing
- the coordinates of that pixel in the local space of that block
"Modulo" here means taking the remainder of dividing the pixel coordinates by the pixel size; this remainder is the pixel's position in the block's local space.
Next we need to find the distance from that pixel to the closest edge of the block. For each axis there are two candidates:
- the local coordinate itself, which is the distance to the block's near edge along that axis
- pixelSize minus the local coordinate, which is the distance to the block's far edge along that axis
With this in mind, all we need to do is find the minimum of these four values (two per axis) - this is what edgeDist is.
And the last step is to blend the grid lines with the base color using the smoothstep function. If edgeDist is less than lineThickness we blend in the grid color, otherwise we keep the base color. smoothstep makes the transition look smooth and nice, but we could use basically any blending function we want.
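Here is the same mask math traced on the CPU for a single sample pixel (the numbers are arbitrary; pixelSize and lineThickness match the uniforms above):
let pixelSize: Float = 12
let lineThickness: Float = 1

// A sample pixel at screen coordinates (37, 50)
let pxX: Float = 37
let pxY: Float = 50

// Local coordinates inside the 12x12 block that contains the pixel
let modX = pxX.truncatingRemainder(dividingBy: pixelSize)   // 1
let modY = pxY.truncatingRemainder(dividingBy: pixelSize)   // 2

// Distance to the nearest block edge along each axis
let distX = min(modX, pixelSize - modX)   // min(1, 11) = 1
let distY = min(modY, pixelSize - modY)   // min(2, 10) = 2

// Distance to the nearest edge overall
let edgeDist = min(distX, distY)          // 1

// Within the line thickness, so this pixel gets the grid color blended in
let isOnGridLine = edgeDist <= lineThickness   // true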
Let's add the final touches and make the model rotate. Define the animationTime and rotationSpeed properties that will drive the animation.
final class Renderer: NSObject {
private var animationTime: Float = 0.0
private var rotationSpeed: Float = 1.0
...
}
Animation time is updated with each draw call.
func draw(in view: MTKView) {
guard
let drawable = view.currentDrawable,
let commandQueue = device.makeCommandQueue(),
let commandBuffer = commandQueue.makeCommandBuffer()
else {
return
}
animationTime += 0.016
...
}
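Incrementing by a fixed 0.016 assumes the view really runs at 60 FPS. As a small optional sketch (the lastFrameTime property and advanceAnimationClock helper are hypothetical, not part of the project above), the animation could instead advance by the measured frame delta:
import QuartzCore

var animationTime: Float = 0.0
var lastFrameTime: CFTimeInterval?

// Call once per draw(in:) instead of the fixed += 0.016
func advanceAnimationClock() {
    let now = CACurrentMediaTime()
    if let last = lastFrameTime {
        animationTime += Float(now - last)   // advance by the real elapsed time
    }
    lastFrameTime = now
}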
Finally, we need to update the rotation component of the model matrix.
func drawModel(
in view: MTKView,
renderEncoder: any MTLRenderCommandEncoder
) throws {
...
for (i, t) in instanceTransforms.enumerated() {
var _t = t
_t.rotation = simd_quatf(
angle: rotationSpeed * animationTime,
axis: SIMD3<Float>(0, 1, 0)
)
ptr[i] = _t.modelMatrix
}
...
}
And we're done! Here is the final result. I imagine it could be used as a background for an onboarding flow or something like that.
Conclusion
I hope you enjoyed this journey, because I sure did.
Of course, these are just the most basic things you can do in 3D. We also used a large number of MetalKit settings and components as is, without going into too much detail. And overall, the solution we built can be optimized a lot further (reusing textures, for example).
I will continue to explore this topic in future articles. I'm really curious to see where I can go with all of this.
Here you can find the source code.
See you 🦄