Another way to build bounding volume hierarchies (BVH) is with the Surface Area Heuristic (SAH). SAH is part of a larger algorithm that decides, at each node, whether to split a set of primitives into further bounding volumes or to finalize it as a leaf node.
Similar to how I used octrees to partition space in the previous article on acceleration structures, this algorithm divides the space into 12 buckets along a single axis. It then uses a surface area calculation to determine the best place to partition the space, and recursively builds the complete BVH.
You can find a very detailed description of how to use SAH in the PBRT book. I'll be writing my own explanation here and implementing it in my Light renderer, and of course, I'll compare it to my octree method on the same scene.
Building a BVH using SAH
(I wonder if I can come up with more acronyms in the rest of the article).
Alright, so to begin with, the algorithm follows these steps:
- If there are fewer than N primitives (set this to whatever works best, usually 2 or 4), create a leaf node and you are done.
- Select the axis with the largest extent.
- Partition this axis into 12 buckets.
- Using the centroids of the primitives' bounding boxes (BB), assign each primitive to one of the 12 buckets, accumulating each bucket's bounding volume as you go.
- Calculate the cost of splitting the space at each of the 12 bucket boundaries. In other words, for each candidate split, add the cost of the buckets before it to the cost of the buckets after it. This cost calculation uses the surface area.
- Select the split with the lowest cost, then recursively keep building the BVH for the primitives on each side by going back to step 1.
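As an illustration, the centroid bucketing step can be sketched in Rust like this (the names are hypothetical, not taken from the actual renderer code):

```rust
const BUCKETS: usize = 12;

// Map a primitive's centroid coordinate on the chosen axis to one of the
// 12 buckets. `axis_min` / `axis_max` are the extent of the centroid
// bounds along that axis. The clamp handles the centroid that sits
// exactly at `axis_max`, which would otherwise land in bucket 12.
fn bucket_index(centroid: f64, axis_min: f64, axis_max: f64) -> usize {
    let t = (centroid - axis_min) / (axis_max - axis_min);
    ((t * BUCKETS as f64) as usize).min(BUCKETS - 1)
}
```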
So far this looks pretty straightforward; the only unknown is how to calculate the surface area cost for a space partition.
Calculating the surface area simply follows this formula:
SA = 2 * (x * y + x * z + y * z)
where x, y, and z are the components of the diagonal of the bounding box.
And a partition cost is calculated using:
Cost = SA * Number of Primitives
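In Rust, these two formulas come out to something like the following sketch (again, hypothetical names rather than the renderer's actual types):

```rust
// Axis-aligned bounding box described by its two corner points.
struct Aabb {
    min: [f64; 3],
    max: [f64; 3],
}

impl Aabb {
    // SA = 2 * (x*y + x*z + y*z), where (x, y, z) is the box diagonal.
    fn surface_area(&self) -> f64 {
        let x = self.max[0] - self.min[0];
        let y = self.max[1] - self.min[1];
        let z = self.max[2] - self.min[2];
        2.0 * (x * y + x * z + y * z)
    }
}

// Cost of one side of a candidate split: the surface area of the bounds
// enclosing that side's buckets, times the number of primitives in them.
fn partition_cost(bounds: &Aabb, primitive_count: usize) -> f64 {
    bounds.surface_area() * primitive_count as f64
}
```

The split chosen is the one minimizing the sum of `partition_cost` for the left and right sides.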
Using this information, the Rust implementation of this algorithm is here: Github link. I was going to paste the whole code in this post, but it is a wall of text.
If you followed that link, you might have seen that this implementation follows the original algorithm fairly closely. I took some liberties in a few places to make the code more readable and easier to follow, at the cost of some performance. But wait! That lower performance only affects the time it takes to build the BVH; it does not affect the quality of the BVH itself.
Now I'll show it to you in action through the light-wasm implementation:
async function loadWasm(wasmFile) {
  const {
    instance: { exports: wasm },
  } = await WebAssembly.instantiate(wasmFile, {});
  return wasm;
}

async function renderAndPresent(canvas, frameCount, framesAcc, wasm, bmpPtr, finalBitmap) {
  const ctx = canvas.getContext('2d');
  const width = canvas.width;
  const renderStart = performance.now();
  // render here
  await wasm.render(bmpPtr);
  const bitmap = new Uint8Array(wasm.memory.buffer, bmpPtr, framesAcc.length);
  for (let i = 0; i < bitmap.length; i += 3) {
    const r = bitmap[i];
    const g = bitmap[i + 1];
    const b = bitmap[i + 2];
    finalBitmap[i / 3] =
      (255 << 24) |
      (Math.min(b, 255) << 16) |
      (Math.min(g, 255) << 8) |
      Math.min(r, 255);
  }
  const imageData = new ImageData(new Uint8ClampedArray(finalBitmap.buffer), width);
  ctx.putImageData(imageData, 0, 0);
  const elapsed = Math.floor(performance.now() - renderStart);
  const elapsedMs = `${elapsed}ms|${(Math.round(10000 / elapsed) / 10).toFixed(1)}fps`;
  ctx.font = '20px monospace';
  ctx.textBaseline = 'top';
  const measure = ctx.measureText(elapsedMs);
  ctx.fillStyle = '#000000';
  ctx.fillRect(0, 0, measure.width, measure.fontBoundingBoxDescent);
  ctx.fillStyle = '#999999';
  ctx.fillText(elapsedMs, 0, 0);
}

window.running = false;
(async () => {
  const canvas = document.getElementById('canvas-1');
  const width = canvas.width;
  const height = canvas.height;
  const wasmFile = await (await fetch('wasm/00088-light_wasm.wasm')).arrayBuffer();
  const wasm = await loadWasm(wasmFile);
  const bmpPtr = wasm.init(width, height);
  let frameCount = 0;
  const framesAcc = new Array(width * height * 3);
  const finalBitmap = new Uint32Array(width * height);
  window.running = true;
  const animation = async () => {
    frameCount++;
    await renderAndPresent(canvas, frameCount, framesAcc, wasm, bmpPtr, finalBitmap);
    window.running && window.requestAnimationFrame(animation);
  };
  window.requestAnimationFrame(animation);
})();
Update
I wanted to know how the renderer would behave with a more complex scene, so I updated the light-wasm implementation to accept JSON scene descriptors; check below.
async function loadWasm(wasmFile) {
  const {
    instance: { exports: wasm },
  } = await WebAssembly.instantiate(wasmFile, {});
  return wasm;
}

async function renderAndPresent(canvas, frameCount, framesAcc, wasm, bmpPtr, finalBitmap) {
  const ctx = canvas.getContext('2d');
  const width = canvas.width;
  const renderStart = performance.now();
  // render here
  await wasm.render(bmpPtr);
  const bitmap = new Uint8Array(wasm.memory.buffer, bmpPtr, framesAcc.length);
  for (let i = 0; i < bitmap.length; i += 3) {
    const r = bitmap[i];
    const g = bitmap[i + 1];
    const b = bitmap[i + 2];
    finalBitmap[i / 3] =
      (255 << 24) |
      (Math.min(b, 255) << 16) |
      (Math.min(g, 255) << 8) |
      Math.min(r, 255);
  }
  const imageData = new ImageData(new Uint8ClampedArray(finalBitmap.buffer), width);
  ctx.putImageData(imageData, 0, 0);
  const elapsed = Math.floor(performance.now() - renderStart);
  const elapsedMs = `${elapsed}ms|${(Math.round(10000 / elapsed) / 10).toFixed(1)}fps`;
  ctx.font = '20px monospace';
  ctx.textBaseline = 'top';
  const measure = ctx.measureText(elapsedMs);
  ctx.fillStyle = '#000000';
  ctx.fillRect(0, 0, measure.width, measure.fontBoundingBoxDescent);
  ctx.fillStyle = '#999999';
  ctx.fillText(elapsedMs, 0, 0);
}
const scene = {
  "camera": {
    "eye": [0.0, 0.0, -5.0],
    "leftBottom": [-8.0, -4.5, 5.0],
    "leftTop": [-8.0, 4.5, 5.0],
    "rightTop": [8.0, 4.5, 5.0],
    "transforms": [
      { "type": "rotate", "values": [0.0, 0.0, 0.0] },
      { "type": "translate", "values": [0.0, 7.5, -50.0] }
    ]
  },
  "world": {
    "materials": [
      { "type": "diffuse", "color": [1.0, 0.0, 0.0], "id": "diffuse-red" },
      { "type": "diffuse", "color": [0.0, 1.0, 0.0], "id": "diffuse-green" },
      { "type": "emissive", "color": [3.0, 3.0, 3.0], "id": "emissive-white" },
      { "type": "emissive", "color": [1.0, 0.0, 0.0], "id": "emissive-red" },
      { "type": "reflective", "color": [0.0, 0.0, 1.0], "id": "blue-reflective" },
      { "type": "refractive", "color": [1.0, 1.0, 1.0], "id": "test-refractive" }
    ],
    "objects": [
      {
        "type": "sphere",
        "center": [17.0, -3.0, 10.0],
        "radius": 7.0,
        "sections": 10.0,
        "material": "blue-reflective"
      },
      {
        "type": "sphere",
        "center": [-22.0, -7.0, -10.0],
        "radius": 3.0,
        "sections": 10.0,
        "material": "test-refractive"
      },
      {
        "type": "cube",
        "transforms": [
          { "type": "rotate", "values": [0.0, 0.78539815, 0.78539815] },
          { "type": "scale", "values": [5.0, 5.0, 5.0] },
          { "type": "translate", "values": [10.0, -6.0, -10.0] }
        ]
      },
      {
        "type": "cornellBox",
        "transforms": [
          { "type": "scale", "values": [50.0, 35.0, 50.0] },
          { "type": "translate", "values": [0.0, 7.5, 0.0] }
        ]
      },
      {
        "type": "torus",
        "radius1": 2.0,
        "radius2": 5.0,
        "steps1": 60,
        "steps2": 100,
        "transforms": [
          { "type": "rotate", "values": [-0.78539815, -0.78539815, 0.0] },
          { "type": "translate", "values": [-19.0, -4.0, 2.0] }
        ],
        "material": "diffuse-green"
      },
      {
        "type": "plane",
        "material": "emissive-white",
        "transforms": [
          { "type": "scale", "values": [30.0, 10.0, 10.0] },
          { "type": "translate", "values": [0.0, 25.0, -5.0] }
        ]
      }
    ]
  }
};

const sceneJson = JSON.stringify(scene);
const sceneJsonBytes = new TextEncoder().encode(sceneJson);
window.running = false;
(async () => {
  const canvas = document.getElementById('canvas-2');
  const width = canvas.width;
  const height = canvas.height;
  const wasmFile = await (await fetch('wasm/00088-light_wasm.wasm')).arrayBuffer();
  const wasm = await loadWasm(wasmFile);
  const strPtr = wasm.alloc(sceneJsonBytes.length);
  const strArray = new Uint8Array(wasm.memory.buffer, strPtr, sceneJsonBytes.length);
  for (let i = 0; i < strArray.length; i++) {
    strArray[i] = sceneJsonBytes[i];
  }
  const bmpPtr = wasm.init_from_json(width, height, strArray.length, strPtr);
  let frameCount = 0;
  const framesAcc = new Array(width * height * 3);
  const finalBitmap = new Uint32Array(width * height);
  window.running = true;
  const animation = async () => {
    frameCount++;
    await renderAndPresent(canvas, frameCount, framesAcc, wasm, bmpPtr, finalBitmap);
    window.running && window.requestAnimationFrame(animation);
  };
  window.requestAnimationFrame(animation);
})();
This change does not affect the performance characteristics. I did have to figure out why the image rendered so dark; that was fixed by keeping an internal floating-point frame accumulator inside the light-wasm module and sending only the final byte array to JavaScript.
Summary
If you go back and compare it to the octree implementation, the Surface Area Heuristic algorithm is faster, but not by much. Cue the first table:
Platform | Algorithm | ms |
---|---|---|
Native | Octree | 73 |
Native | SAH | 62 |
WASM | Octree | 82 |
WASM | SAH | 69 |
But when rendering a more complex scene (like this one) using the following parameters:
cargo run --release -- --json scene.json --png --save --samples 10 --width 1280 --height 720 --bvh-build-method octree --threads 1
And changing the BVH build method from Octree to SAH, I get the following results:
Platform | Algorithm | ms |
---|---|---|
Native | Octree | 16995 |
Native | SAH | 7799 |
As you can see, the performance delta is much bigger here. On this scene, SAH produces a far better BVH than the octree method, much more so than the simple scene suggested.
SAH puts my octree building algorithm to shame; I wonder if I can improve its quality to approach SAH. For now, all renderings will default to using SAH for building the BVH.
Final note
I obtained the numbers in the previous tables on an M4 Pro MacBook Pro in Automatic energy mode.