Photon-Centric Shading with VM-Based Interpretive Transport: A Programmable Light Engine

When Light Becomes Code and Materials Become Interpreters

15 November 2024 · 30 min read · Level: Advanced

I built this prototype to test a somewhat bizarre idea: "what if materials could execute code?" Not metaphorically, but literally. Each material becomes a small program, each photon an execution context. It's slow, it's overengineered, and it's absolutely not production-ready. But it taught me things about light transport that I couldn't see in traditional renderers.

A disclaimer upfront

This project has no pretension to be performant or revolutionary. It's a technical playground to explore what happens when you treat photons as VM instructions. If you're looking for a production renderer, this isn't it. If you're curious about an unusual approach to understanding light transport through the lens of computation, read on.

The 8-15x performance overhead compared to traditional renderers? That's the price of making every single photon decision observable and debuggable. Worth it? For learning, absolutely. For production, absolutely not.

The Core Philosophy

Light as Programs

Every ray carries executable code. Materials don't just reflect, they interpret and execute.

Observable Execution

Every photon bounce can be traced, every shader decision logged, every path debugged.

Modular Architecture

Materials compile to bytecode. Shaders are programs. The scene becomes a computational graph.

The Mathematics Behind the Light

Core Equations Driving the Engine

Before diving into the VM architecture, let's understand the mathematical foundation. Every equation here is directly implemented in the opcodes, making the math observable.

1. The Rendering Equation

$$L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o) L_i(x, \omega_i) (\omega_i \cdot n) d\omega_i$$
↳ Outgoing radiance = Emission + Integral of (BRDF × Incoming × Cosine)

In the VM: This equation is decomposed into opcodes. DirectLighting handles the emission and direct illumination, while Scatter sets up the recursive evaluation for Lᵢ.
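To make that decomposition concrete, here is a minimal sketch of the outer path-tracing loop the opcodes plug into. It reuses executeShader, Scene::traceRay and HitRecord from the snippets later in this post; makeContextFor, hit.point and maxBounces are hypothetical names for this sketch, not the project's actual API.

C++ - Outer Path Loop (sketch)
Vec3 radiance{0, 0, 0};
Vec3 throughput{1, 1, 1};      // running product of BRDF * cosine / PDF terms
Ray ray = cameraRay;

for (int bounce = 0; bounce < maxBounces; ++bounce) {
    HitRecord hit = scene.traceRay(ray);
    if (!hit.isHit) break;

    // Point the context at the hit material's shader block (hypothetical helper).
    ExecutionContext context = scene.makeContextFor(hit);
    Ray nextRay{hit.point, Vec3{0, 0, 0}};

    // The shader adds its emission/direct term and prepares the next bounce,
    // scaling 'throughput' along the way.
    Vec3 pathWeight = throughput;
    Vec3 bounceRadiance = executeShader(scene, light, rng, hit,
                                        nextRay, throughput, context);
    radiance += pathWeight * bounceRadiance;

    ray = nextRay;
}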

2. Fresnel Equations (Schlick Approximation)

$$R_0 = \left( \frac{n_1 - n_2}{n_1 + n_2} \right)^2$$
↳ Base reflectance at normal incidence
$$R(\theta) = R_0 + (1 - R_0)(1 - \cos \theta)^5$$
↳ Reflectance at angle θ
C++ Implementation
#include <cmath>  // std::pow, std::fabs

inline float schlickReflectance(float cosIncident, float indexRatio) {
    float r0 = (1.0f - indexRatio) / (1.0f + indexRatio);
    r0 = r0 * r0;
    return r0 + (1.0f - r0) * std::pow(1.0f - std::fabs(cosIncident), 5.0f);
}

Why Schlick?

The full Fresnel equations are computationally expensive. Schlick's approximation gives us visually identical results with much simpler math. The Fresnel opcode computes this and stores the result in the execution context for the JumpProbability opcode.
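As a rough illustration, the Fresnel case in the executor's switch can be as small as this. How the incident direction reaches the executor (on the hit record, or as an extra argument) is my assumption; the post's snippets don't show it.

C++ - Fresnel Opcode (sketch)
case Opcode::Fresnel: {
    // paramA carries the material's index of refraction.
    // 'incident' stands for the incoming ray direction (assumed available here).
    float cosIncident = incident.dot(hit.normal);
    context.reflectionProbability =
        schlickReflectance(cosIncident, instruction.paramA);
    context.instructionPointer++;
    break;
}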

3. Snell's Law (Refraction)

$$n_1 \sin \theta_1 = n_2 \sin \theta_2$$
↳ Refraction angle relationship
$$\omega_t = \frac{n_1}{n_2}\omega_i + \left( \frac{n_1}{n_2}\cos \theta_i - \sqrt{1 - \left(\frac{n_1}{n_2}\right)^2(1 - \cos^2 \theta_i)} \right) n$$
↳ Refracted direction vector
C++ Implementation
#include <algorithm>  // std::clamp
#include <cmath>      // std::sqrt
#include <utility>    // std::swap

inline bool refract(const Vec3& incident, const Vec3& normal,
                    float indexRatio, Vec3& refracted) {
    float cosIncident = std::clamp(incident.dot(normal), -1.0f, 1.0f);
    float etaI = 1.0f, etaT = indexRatio;
    Vec3 n = normal;

    // Handle inside/outside transition
    if (cosIncident < 0) {
        // Entering the medium: work with a positive cosine
        cosIncident = -cosIncident;
    } else {
        // Exiting the medium: swap indices and flip the normal
        std::swap(etaI, etaT);
        n = normal * -1.0f;
    }

    float eta = etaI / etaT;
    float k = 1.0f - eta * eta * (1.0f - cosIncident * cosIncident);

    // Total internal reflection check
    if (k < 0) return false;

    refracted = incident * eta + n * (eta * cosIncident - std::sqrt(k));
    return true;
}

4. Lambertian BRDF

$$f_{\text{diffuse}} = \frac{\rho}{\pi}$$
↳ ρ is the albedo (surface color)
$$\text{PDF}(\omega) = \frac{\cos \theta}{\pi}$$
↳ Cosine-weighted sampling probability
Importance Sampling

The Scatter opcode uses cosine-weighted sampling to generate new ray directions. This importance sampling reduces variance by generating more samples where the BRDF × cosine term is larger.

C++ - Cosine-Weighted Direction Generation
#include <algorithm>  // std::max
#include <cmath>      // std::sqrt, std::cos, std::sin, std::fabs

Vec3 cosineWeightedDirection(const Vec3& normal) {
    // Generate points on unit disk
    float u = uniformFloat();
    float v = uniformFloat();
    float radius = std::sqrt(u);
    float theta = 2.0f * 3.14159265359f * v;

    // Convert to hemisphere coordinates
    float x = radius * std::cos(theta);
    float y = radius * std::sin(theta);
    float z = std::sqrt(std::max(0.0f, 1.0f - u));

    // Build orthonormal basis
    Vec3 tangent = std::fabs(normal.z) < 0.999f
        ? Vec3{0, 0, 1} : Vec3{1, 0, 0};
    tangent = tangent.cross(normal).normalized();
    Vec3 bitangent = normal.cross(tangent);

    return (tangent * x + bitangent * y + normal * z).normalized();
}

5. Geometry Factor for Direct Lighting

$$G(x \leftrightarrow y) = V(x \leftrightarrow y) \frac{|\cos \theta_x \cos \theta_y|}{||x - y||^2}$$
↳ V is visibility (0 or 1), θ are angles to surface normals

The Heart of Direct Lighting

This geometry factor is crucial for the DirectLighting opcode. It accounts for:

  • Distance falloff (inverse square law)
  • Surface orientation (cosine terms)
  • Visibility (shadow rays)

6. Next Event Estimation (NEE)

$$L_{\text{direct}} = \int_{A} L_e(y) f_r(x, y \to x, \omega_o) G(x \leftrightarrow y) dA(y)$$
↳ Direct illumination by sampling light sources
$$\approx \frac{1}{N} \sum_{i=1}^{N} \frac{L_e(y_i) f_r(x, y_i \to x, \omega_o) G(x \leftrightarrow y_i)}{p(y_i)}$$
↳ Monte Carlo estimation with N samples
C++ - NEE in DirectLighting Opcode
// Sample point on light source
Vec3 lightPoint, lightNormal;
float lightPdf;  // PDF with respect to area
Vec3 lightEmission = light.samplePoint(rng, lightPoint, lightPdf, lightNormal);

// Compute geometry factor components
Vec3 toLight = lightPoint - hitPoint;
float distanceSquared = toLight.dot(toLight);
float distance = std::sqrt(distanceSquared);
toLight = toLight / distance;

float cosineHit = std::max(0.0f, hit.normal.dot(toLight));
float cosineLight = std::max(0.0f, lightNormal.dot(-toLight));

// BRDF for diffuse surface
Vec3 brdf = albedo * (1.0f / 3.14159265359f);

// Final contribution
float geometryFactor = cosineHit * cosineLight / distanceSquared;
Vec3 contribution = lightEmission * brdf * (geometryFactor / lightPdf);

7. Russian Roulette Path Termination

$$P_{\text{continue}} = \min(1, \max(T.r, T.g, T.b))$$
↳ Probability to continue the path
$$T' = \frac{T}{P_{\text{continue}}}$$
↳ Adjust throughput to remain unbiased
Unbiased Path Termination

Russian Roulette allows us to terminate paths probabilistically while keeping the estimator unbiased. When a path's contribution becomes small, we randomly terminate it but boost the surviving paths to compensate. This is implemented in the main render loop, not as an opcode.
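A sketch of what that looks like in the render loop, under the same assumptions as the outer loop above; the three guaranteed bounces before roulette kicks in are an arbitrary choice, not a project constant.

C++ - Russian Roulette (sketch)
// After a few guaranteed bounces, terminate paths probabilistically.
if (bounce > 3) {
    float continueProbability = std::min(1.0f,
        std::max(throughput.x, std::max(throughput.y, throughput.z)));
    if (rng.uniformFloat() >= continueProbability)
        break;                                        // kill the path
    throughput = throughput / continueProbability;    // boost survivors: unbiased
}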

Interactive VM Execution Playground

Live Photon Execution Trace

[Interactive widget: steps through a material's opcode stream (about 500 ms per instruction) while displaying the VM state: instruction pointer, bounce count, call stack, Fresnel value, throughput, and current medium.]

Try it yourself

Watch how different materials compile to different opcode sequences. The glass material's probabilistic branching based on Fresnel is particularly interesting: sometimes it reflects, sometimes it refracts, all decided by the VM at runtime. Yes, this is overkill. I regret nothing.

The Virtual Machine at the Heart

Why a VM for rendering?

Traditional renderers hardcode material behavior. This creates rigid systems where adding new material types requires modifying core engine code. By introducing a VM layer, we transform materials from static data into dynamic programs.

Photonic VM Architecture

Material Definition

C++ classes with behavior

Shader Compilation

compileShader() → opcodes

VM Execution

Photon-driven interpretation

The Instruction Set

Opcode          | Purpose                | Parameters           | Effect
Scatter         | Diffuse reflection     | RGB albedo           | Generates cosine-weighted direction
Reflect         | Mirror reflection      | None                 | Perfect specular bounce
Refract         | Transmission           | IOR                  | Snell's law refraction
Fresnel         | Reflection probability | IOR                  | Computes Schlick approximation
DirectLighting  | Light sampling         | RGB albedo           | Next Event Estimation
JumpProbability | Conditional branch     | Probability, target  | Stochastic control flow
Call            | Function call          | Block ID             | Pushes return address
Return          | Function return        | None                 | Pops call stack
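For reference, the snippets in the rest of this post assume an instruction layout roughly like the one below: an opcode, three float parameters, and an integer jump/call target. The field names match the executor code; the exact struct definitions are my reconstruction, not the project's verbatim headers.

C++ - Assumed IR Types (sketch)
#include <vector>

enum class Opcode {
    Scatter, Reflect, Refract, Fresnel,
    DirectLighting, JumpProbability, Call, Return
};

struct Instruction {
    Opcode opcode;
    float paramA, paramB, paramC;   // e.g. RGB albedo, an IOR, or a probability
    int targetAddress;              // jump target or library block ID
};

struct CodeBlock {
    std::vector<Instruction> instructions;
};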

Material as Code

C++20 - Diffuse Material
class DiffuseMaterial : public Material {
    Vec3 albedoColor;

public:
    CodeBlock compileShader() const override {
        CodeBlock block;
        // Sample direct lighting with albedo
        block.instructions.push_back({
            Opcode::DirectLighting,
            albedoColor.x, albedoColor.y, albedoColor.z, 0
        });
        // Scatter for indirect lighting
        block.instructions.push_back({
            Opcode::Scatter,
            albedoColor.x, albedoColor.y, albedoColor.z, 0
        });
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        return block;
    }
};

The Beauty of Simplicity

A diffuse material is just three instructions: sample the light, scatter for the next bounce, and return. The VM handles all the complexity of execution, while the material definition remains clean and understandable.

C++20 - Glass Material
class GlassMaterial : public Material {
    float refractiveIndex;

public:
    CodeBlock compileShader() const override {
        CodeBlock block;
        // Compute Fresnel reflectance
        block.instructions.push_back({Opcode::Fresnel, refractiveIndex, 0, 0, 0});
        // Jump based on Fresnel probability
        block.instructions.push_back({Opcode::JumpProbability, -1, 0, 0, 4});
        // Refraction path
        block.instructions.push_back({Opcode::Refract, refractiveIndex, 0, 0, 0});
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        // Reflection path
        block.instructions.push_back({Opcode::Reflect, 0, 0, 0, 0});
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        return block;
    }
};
Stochastic Control Flow

Glass demonstrates the power of probabilistic branching. The Fresnel opcode computes reflection probability, then JumpProbability uses it to choose between reflection and refraction paths, all decided at runtime by the VM.

The Execution Context

ExecutionContext {
    currentBlock: 2                    // Material shader block
    instructionPointer: 3              // Current instruction
    callStack: [(0, 5)]               // Return addresses
    reflectionProbability: 0.04       // Fresnel result
    refractionDirection: (0.2, -0.8, 0.3)
    canRefract: true
    isInsideMedium: false
}

Stateful Execution

Each photon carries its execution context through the scene. This includes not just where it is in the program (instruction pointer), but also computed values like Fresnel reflectance and whether refraction is geometrically possible. The context evolves as the photon bounces.
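In C++ terms, a context shaped like the listing above might look as follows; the field names follow the executor snippets, everything else is an assumption.

C++ - ExecutionContext (sketch)
#include <utility>
#include <vector>

struct ExecutionContext {
    int currentBlock = 0;                        // shader block being executed
    int instructionPointer = 0;                  // index into that block
    std::vector<std::pair<int, int>> callStack;  // (block, return address) pairs
    float reflectionProbability = 0.0f;          // written by Fresnel
    Vec3 refractionDirection{0, 0, 0};           // written by Refract
    bool canRefract = false;                     // false on total internal reflection
    bool isInsideMedium = false;                 // tracks the current medium
};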

The Wavefront Executor

C++20 - Core Execution Loop
static Vec3 executeShader(const Scene& scene, const RectangularLight& light,
                            RandomGenerator& rng, const HitRecord& hit,
                            Ray& nextRay, Vec3& throughput,
                            ExecutionContext& context) {
    Vec3 radiance{0, 0, 0};

    while (true) {
        const CodeBlock& block = scene.getShaderProgram().blocks[context.currentBlock];
        const Instruction& instruction = block.instructions[context.instructionPointer];

        switch (instruction.opcode) {
            case Opcode::Scatter: {
                Vec3 albedo{instruction.paramA, instruction.paramB, instruction.paramC};
                nextRay.direction = rng.cosineWeightedDirection(hit.normal);
                throughput = throughput * albedo;
                context.instructionPointer++;
                break;
            }

            case Opcode::JumpProbability: {
                float probability = instruction.paramA < 0
                    ? context.reflectionProbability
                    : instruction.paramA;
                if (rng.uniformFloat() < probability) {
                    context.instructionPointer = instruction.targetAddress;
                } else {
                    context.instructionPointer++;
                }
                break;
            }
            // ... other opcodes ...
        }
    }
    return radiance;
}
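The elided opcodes include the control-flow pair. A plausible sketch of Call and Return, matching the (block, return address) call stack from the execution context, looks like this; the loop exits when Return finds an empty stack.

C++ - Call / Return Opcodes (sketch)
case Opcode::Call: {
    // Remember where to resume, then jump to the start of the library block.
    context.callStack.push_back({context.currentBlock,
                                 context.instructionPointer + 1});
    context.currentBlock = instruction.targetAddress;
    context.instructionPointer = 0;
    break;
}

case Opcode::Return: {
    if (context.callStack.empty())
        return radiance;                 // top-level return: shader finished
    auto [block, returnAddress] = context.callStack.back();
    context.callStack.pop_back();
    context.currentBlock = block;
    context.instructionPointer = returnAddress;
    break;
}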

Debugging the Invisible

Pixel-by-Pixel Tracing

Select any pixel and see every material hit, every opcode executed

Computational Heatmaps

Visualize which pixels require the most computation

IR Profiling

See which shader blocks are visited most frequently

Execution Traces

Complete logs of photon journeys through your scene
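The instrumentation behind these features can be as simple as one counter increment per dispatch plus an optional per-pixel log. The Profiler struct and hook placement below are hypothetical, but they show the kind of bookkeeping involved.

C++ - Instrumentation Hooks (sketch)
#include <cstdint>
#include <string>
#include <vector>

struct Profiler {
    std::vector<uint64_t> blockVisits;       // indexed by shader block ID
    std::vector<std::string> pixelTrace;     // opcode names for the selected pixel
};

// Inside the dispatch loop, just before the switch:
//   profiler.blockVisits[context.currentBlock]++;
//   if (tracingThisPixel)
//       profiler.pixelTrace.push_back(opcodeName(instruction.opcode));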

Debug Output Example
pixel(320,320)
hit white
  direct
  scatter
  ret
hit mirror
  call
  reflect
  ret
  ret
hit glass
  fresnel
  jmp
  refract
  ret
---

Library Functions and Code Reuse

Shared Shader Libraries

The VM supports a library system where common operations can be defined once and called from multiple materials. For example, a reflection library that multiple materials can invoke:

C++20 - Mirror Material Using Library
class MirrorMaterial : public Material {
    int reflectionLibraryId;

public:
    CodeBlock compileShader() const override {
        CodeBlock block;
        // Call the shared reflection implementation
        block.instructions.push_back({Opcode::Call, 0, 0, 0, reflectionLibraryId});
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        return block;
    }
};

Scene Compilation Pipeline

1. Material Registration

Add materials to scene

2. Library Creation

Build shared code blocks

3. Shader Compilation

Materials → IR bytecode

4. Link & Optimize

Resolve calls, build program
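Roughly, steps 2 to 4 amount to concatenating code blocks into one program and remembering which block ID each material landed on. The sketch below assumes a ShaderProgram holding the blocks (consistent with scene.getShaderProgram().blocks in the executor); program, libraryBlocks and materials are assumed Scene members, and block 0 is kept empty only so the numbering matches the IR dump shown later.

C++ - Shader Compilation & Linking (sketch)
#include <memory>
#include <vector>

struct ShaderProgram {
    std::vector<CodeBlock> blocks;
};

void Scene::compileShaders() {
    program.blocks.clear();
    program.blocks.push_back(CodeBlock{});        // block 0 left empty (assumption)

    // Shared libraries first, so materials can Call them by block ID.
    for (const CodeBlock& library : libraryBlocks)
        program.blocks.push_back(library);

    // One block per material; a material's shader block ID is simply its
    // position in this list, which is what the photon context jumps to on hit.
    for (const auto& material : materials)
        program.blocks.push_back(material->compileShader());
}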

Performance Considerations

Why It's Slow (And Why That's OK)

This renderer prioritizes observability over speed:

  • Interpretation overhead: Each bounce involves VM instruction dispatch
  • Instrumentation cost: Tracking, logging, and profiling add overhead
  • Indirect calls: Virtual dispatch prevents many optimizations
  • Debug information: Maintaining execution context for debugging

But this "slowness" gives us something invaluable: complete visibility into the rendering process. Every photon's decision is observable, every material's behavior is traceable.

The Bigger Picture

Beyond Traditional Rendering

This architecture opens doors to possibilities that traditional renderers can't explore:

  • Dynamic material generation: Materials could be generated or modified at runtime
  • Genetic programming: Evolve material behaviors through shader mutation
  • Machine learning integration: Train materials to respond to light
  • Interactive debugging: Step through light transport like code

[Diagram: conceptual rendering pipeline]

Implementation Insights

Modular Design

Clean separation between geometry, materials, and execution

State Management

Execution context travels with each photon through the scene

Material Freedom

New materials need only implement compileShader()

Observable Metrics

Every decision point can be instrumented and analyzed
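To make the "Material Freedom" point concrete, here is a hypothetical material written only against the opcode table above: a cheap glossy plastic that takes the mirror branch with a fixed probability and falls back to the diffuse pair otherwise. It is not part of the original project, just an illustration of how little code a new material needs.

C++20 - Hypothetical Glossy Mix Material
class GlossyMixMaterial : public Material {
    Vec3 albedoColor;
    float gloss;   // probability of taking the specular branch

public:
    GlossyMixMaterial(const Vec3& albedo, float glossAmount)
        : albedoColor(albedo), gloss(glossAmount) {}

    CodeBlock compileShader() const override {
        CodeBlock block;
        // With probability 'gloss', jump to the mirror branch (instruction 4).
        block.instructions.push_back({Opcode::JumpProbability, gloss, 0, 0, 4});
        // Diffuse branch: sample the light, then scatter for indirect lighting.
        block.instructions.push_back({
            Opcode::DirectLighting,
            albedoColor.x, albedoColor.y, albedoColor.z, 0
        });
        block.instructions.push_back({
            Opcode::Scatter,
            albedoColor.x, albedoColor.y, albedoColor.z, 0
        });
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        // Mirror branch.
        block.instructions.push_back({Opcode::Reflect, 0, 0, 0, 0});
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        return block;
    }
};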

Real-World Usage Example

C++20 - Complete Scene Setup
Scene scene;

// Register materials
int whiteMaterial = scene.addMaterial(
    std::make_unique<DiffuseMaterial>(Vec3{0.78f, 0.78f, 0.78f}, "white")
);
int mirrorMaterial = scene.addMaterial(
    std::make_unique<MirrorMaterial>(0, "mirror")
);
int glassMaterial = scene.addMaterial(
    std::make_unique<GlassMaterial>(1.5f, "glass")
);

// Build geometry
scene.addGeometry(std::make_unique<Sphere>(Vec3{0, 0, 3}, 0.5f, glassMaterial));

// Compile all shaders to VM bytecode
scene.compileShaders();

// Create renderer with debugging enabled
WavefrontRenderer renderer(scene, camera, light, settings);
renderer.render();

// Save outputs including debug information
renderer.saveImage("beauty.ppm");
renderer.saveHeatMap("complexity.ppm");
renderer.saveDebugTrace("photon_trace.txt");
renderer.saveIRDump("shader_bytecode.txt");

The Debug Outputs

IR Dump Example

block 0 visits 0
block 1 visits 847293      // Reflection library - heavily used
  0 reflect 0 0 0 0
  1 ret 0 0 0 0
block 2 visits 423847       // White diffuse material
  0 direct 0.78 0.78 0.78 0
  1 scatter 0.78 0.78 0.78 0
  2 ret 0 0 0 0
block 3 visits 98234        // Glass material
  0 fresnel 1.5 0 0 0
  1 jmp -1 0 0 4
  2 refract 1.5 0 0 0
  3 ret 0 0 0 0
  4 reflect 0 0 0 0
  5 ret 0 0 0 0

Future Directions

Where This Could Go

The VM architecture opens fascinating possibilities:

Evolutionary Materials

Mutate and evolve shader bytecode to discover new materials

Neural Shaders

Train materials using reinforcement learning on VM instructions

Time-Travel Debugging

Record and replay photon execution paths

JIT Compilation

Compile hot paths to native code while maintaining observability

Philosophical Reflections

What I learned from this experiment

Building this taught me that sometimes the most valuable code is the slowest code. By forcing myself to make every photon decision explicit and observable, I had to really understand what was happening at each bounce. No hiding behind optimized libraries or GPU kernels, just raw, interpretable light transport.

The VM overhead? Sure, it's painful for performance. But watching a photon step through its shader instructions and seeing exactly why it chose to reflect instead of refract at a particular Fresnel angle? That's worth the nanoseconds.

This renderer will never ship in a product. But it shipped understanding to my brain, and sometimes that's the only optimization that matters.

Technical Deep Dive: The Executor

C++20 - Direct Lighting Implementation
case Opcode::DirectLighting: {
    Vec3 albedo{instruction.paramA, instruction.paramB, instruction.paramC};

    // Sample a point on the light
    Vec3 lightPoint, lightNormal;
    float lightPdf;
    Vec3 lightEmission = light.samplePoint(rng, lightPoint, lightPdf, lightNormal);

    // Compute geometry
    Vec3 toLight = lightPoint - hitPoint;
    float distanceSquared = toLight.dot(toLight);
    float distance = std::sqrt(distanceSquared);
    toLight = toLight / distance;

    // Check visibility
    Ray shadowRay{hitPoint, toLight};
    HitRecord shadowHit = scene.traceRay(shadowRay);

    if (!shadowHit.isHit || shadowHit.distance > distance - 1e-3f) {
        // Compute contribution
        float cosineHit = std::max(0.0f, hit.normal.dot(toLight));
        float cosineLight = std::max(0.0f, lightNormal.dot(-toLight));

        Vec3 brdf = albedo * (1.0f / 3.14159265359f);
        float geometryFactor = cosineHit * cosineLight / distanceSquared;

        radiance += lightEmission * brdf * (geometryFactor / lightPdf);
    }

    context.instructionPointer++;
    break;
}

Next Event Estimation in the VM

Even complex algorithms like NEE (Next Event Estimation) become VM instructions. This allows materials to decide whether to sample direct lighting, making the system incredibly flexible. A material could choose to skip direct lighting entirely, or implement its own custom light sampling strategy.
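For example, a material that opts out of NEE entirely is a two-instruction shader. This hypothetical variant relies on scattered rays eventually hitting the light (so it only works if the render loop accounts for emission on regular hits, which the post's snippets don't show), but it illustrates that the sampling strategy really is the material's decision.

C++20 - Hypothetical Diffuse Material Without NEE
class DiffuseNoNEEMaterial : public Material {
    Vec3 albedoColor;

public:
    explicit DiffuseNoNEEMaterial(const Vec3& albedo) : albedoColor(albedo) {}

    CodeBlock compileShader() const override {
        CodeBlock block;
        // No DirectLighting: indirect bounces have to find the light on their own.
        block.instructions.push_back({
            Opcode::Scatter,
            albedoColor.x, albedoColor.y, albedoColor.z, 0
        });
        block.instructions.push_back({Opcode::Return, 0, 0, 0, 0});
        return block;
    }
};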

Making It Production-Ready (If You Really Want To)

How to eliminate the overhead

Look, I know I said this wasn't meant for production. But if you're stubborn and want to make this actually fast, here's exactly what you'd need to do:

Bottleneck               | Current Approach               | Production Solution                   | Expected Speedup
VM Dispatch Overhead     | Switch statement per opcode    | JIT compile hot paths to native code  | 5-10x
Context Switching        | Full context per photon        | SIMD wavefront execution (32-64 rays) | 8-16x
Memory Access            | Pointer chasing through blocks | Flatten to linear bytecode array      | 2-3x
Debug Instrumentation    | Always enabled                 | Compile-time flag removal             | 1.5-2x
Dynamic Material Loading | Runtime interpretation         | AOT compilation to GPU kernels        | 20-50x
C++20 - JIT Compilation Approach
// Instead of interpreting, compile to native code
class JITCompiler {
    using ShaderFunc = Vec3(*)(
        const HitRecord&,
        Ray&,
        Vec3&,
        RandomGenerator&
    );

    ShaderFunc compile(const CodeBlock& block) {
        // Use LLVM or similar to generate native code
        // Each opcode becomes inlined assembly
        // Branches become native jumps
        // No more switch statements!

        ShaderFunc generatedFunction = nullptr;  // placeholder for the emitted entry point
        return generatedFunction;
    }
};

The irony of optimization

I know it might seem ridiculous to build a VM just to compile it away, but consider what we'd end up with: observability when we need it (interpreted mode) and speed when we don't (JIT mode). The VM becomes a high-level IR that can target multiple backends. WebGPU compute shaders, anyone?

Lessons Learned

Debugging > Performance

Being able to step through a photon's decision made bugs obvious that would have taken hours to find in optimized code

Materials are Algorithms

Treating materials as programs revealed patterns I'd never noticed, like how glass is just probabilistic branching

Profiling at Opcode Level

Seeing which instructions run millions of times vs. rarely helped understand the real computational cost

Convergence Patterns

The heatmaps revealed which pixels were "expensive": usually glass edges, where total internal reflection fights with refraction

What Could This Actually Be Used For?

Real Applications (Surprisingly)

1. Renderer Education

Perfect for teaching light transport. Students can literally watch photons make decisions.

2. Material Development

Artists could prototype complex materials by writing simple opcodes instead of full shaders.

3. Debugging Production Renderers

Run problematic scenes through this to understand what's happening, then optimize the real renderer.

4. Research Platform

Test new light transport algorithms where observability matters more than speed.

Advantages vs Limitations

Advantages                                | Limitations
Complete observability of light transport | 8-15x slower than optimized renderers
Materials as first-class programs         | Memory overhead from execution contexts
Pixel-perfect debugging capabilities      | Not GPU-friendly architecture
Easy to add new material types            | Complex materials = more opcodes = slower
Built-in profiling and analysis           | Requires understanding VM concepts

Conclusion

A Tool for Understanding, Not Production

If you've made it this far, you understand: this isn't about building a better renderer. It's about building a different kind of renderer. One where the goal isn't pixels per second, but insights per bounce.

Every equation above is directly observable in the VM execution. Every Fresnel calculation leaves a trace. Every shadow ray can be stepped through. It's inefficient, overengineered, and absolutely the wrong tool for rendering your next animation.

But if you want to understand, really understand, what your renderer is doing when it bounces light around a scene, then maybe treating photons as VM instructions isn't such a crazy idea after all.

The code is what it is: an experiment in making the invisible visible. Use it to learn, not to ship.

[Image: Cornell Box rendered with the Photonic VM Renderer. In the interactive version, hovering a pixel shows its opcode trace.]

Resolution: 640x640 pixels
Samples: 256 per pixel
VM Instructions: ~2.1M per pixel
Debug Data: 47 MB generated

Every photon in this image executed shader programs. Every material decision was logged. The beauty you see is the result of millions of tiny virtual machines working in concert. Still wondering if photons should really run code? Me too.

Get the Code / Discuss

The complete source code is available if you want to experiment with programmable photons yourself:

Discuss This Approach

If you've done something similar, found this useful for teaching, or just think treating light as code is as weird as I do, feel free to reach out. I'm particularly curious if anyone has ideas for the JIT compilation approach.