LDS Sampler : Advanced Quasi-Random Sampling

Building a High-Performance Sampler for Path Tracing

31 July 2025 25 min read Level: Advanced

In the world of 3D rendering, the quality of a sampler can make the difference between a noisy image and a photorealistic render. Today, I'll share a Low Discrepancy Sampler (LDS) variant I've experimented with the result isn't optimal in all cases, but in several tested scenes, it has shown real utility.

Why this sampler is interesting

Optimal performance

Direct generation without precomputed tables. Ideal for real-time rendering and large data volumes.

Infinite extensibility

Support of thousands of dimensions without degradation. Perfect for complex algorithms requiring many random variables.

Controlled quality

Combine the advantages of quasi-random sequences with a sophisticated scrambling to avoid visible patterns.

Minimal state

Only 5 integers of state. Facilitate parallelization and serialization for distributed calculations.

What I'll cover

  • The mathematics behind low discrepancy sequences
  • How this sampler is built
  • The role of hashing and scrambling
  • Implementation details and usage

The Sampling Problem

Why sampling is crucial

In path tracing, you need to sample millions of directions to simulate light. A bad sampler creates noise and slows down convergence. A good sampler distributes samples uniformly while avoiding visible patterns.

LDS Sample Distribution

LDS Sampler Architecture

🔐

Hashing

Hash64 and Hash32 functions

🔢

Prime Numbers

Based on √p sequences

🌀

Scrambling

Permutation layer

📐

Multi-Dimensional

Dimension indexing

1. Hashing Functions

C++20
inline uint64_t hash64(uint64_t x) const {
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

About this hashing function

This function uses the SplitMix64 technique: • The shifts (>>) spread the high-order bits to the bottom • The multiplications by constants create the avalanche effect • The XOR operations mix the bits • The hash allows an apparently balanced distribution on [0, 1), at least sufficient for my usage

hash64(x) = x ⊕ (x >> 33)
↳ First diffusion
hash64(x) = hash64(x) × 0xff51afd7ed558ccd
↳ Multiplication by large prime number
hash64(x) = hash64(x) ⊕ (hash64(x) >> 33)
↳ Second diffusion
hash64(x) = hash64(x) × 0xc4ceb9fe1a85ec53
↳ Final avalanche

2. The Base Sequence

C++20
double base;
if (dim < 32) {
    base = 1.0 / std::sqrt(static_cast<double>(PRIMES[dim]));
} else {
    uint32_t p = 131 + (dim - 31) * 6;
    while (!isPrime(p)) ++p;
    base = 1.0 / std::sqrt(static_cast<double>(p));
}
The 1/√p sequence

This sampler uses the inverse of the square root of prime numbers as base. This formula comes from number theory:

xₙ = frac(x₀ + n × 1/√p)

where p is a prime number

This sequence has interesting distribution properties because 1/√p is irrational for any prime p.

p = 2
100

3. The Scrambling

C++20
inline uint32_t permute(uint32_t i, uint32_t l, uint32_t p) const {
    uint32_t w = l - 1;
    w |= w >> 1;
    w |= w >> 2;
    w |= w >> 4;
    w |= w >> 8;
    w |= w >> 16;
    do {
        i ^= p;
        i *= 0xe170893d;
        i ^= p >> 16;
        i ^= (i & w) >> 4;
        i ^= p >> 8;
        i *= 0x0929eb3f;
        i ^= p >> 23;
        i ^= (i & w) >> 1;
        i *= 1 | p >> 27;
        i *= 0x6935fa69;
        i ^= (i & w) >> 11;
        i *= 0x74dcb303;
        i ^= (i & w) >> 2;
        i *= 0x9e501cc3;
        i ^= (i & w) >> 2;
        i *= 0xc860a3df;
        i &= w;
        i ^= i >> 5;
    } while (i >= l);
    return (i + p) % l;
}

The Kensler Algorithm

This function implements a pseudo-random permutation: • Guarantees a bijection (each input has a unique output) • Avoids short cycles • Distributes values uniformly • Fast to calculate The mask w ensures staying in the range [0, l).

4. The Coordinate Generation

C++20
double generateCoordinate(uint32_t pixel, uint32_t bounce, uint32_t sample, uint32_t dim) const {
    uint64_t h1 = hash64(seed);
    h1 = hash64(h1 + pixel * 0x45d9f3b3335b369ULL);
    h1 = hash64(h1 + bounce * 0x86fe3c167e5e2d7ULL);
    h1 = hash64(h1 + dim * 0x9e3779b97f4a7c15ULL);

    uint64_t h2 = hash64(h1 + sample);

    double alpha = frac(h1 * 5.42101086242752217e-20);
    double beta = frac(h2 * 5.42101086242752217e-20);

    double base;
    if (dim < 32) {
        base = 1.0 / std::sqrt(static_cast<double>(PRIMES[dim]));
    } else {
        uint32_t p = 131 + (dim - 31) * 6;
        while (!isPrime(p)) ++p;
        base = 1.0 / std::sqrt(static_cast<double>(p));
    }

    double value = frac(alpha + sample * base);

    uint32_t scramble_seed = hash32(static_cast<uint32_t>(h2 ^ (h2 >> 32)));
    uint32_t scrambled = scramble(static_cast<uint32_t>(value * (1u << 20)), 20, scramble_seed);
    value = scrambled / static_cast<double>(1u << 20);

    value = frac(value + beta * 0.5);

    return value;
}
The Generation Pipeline

This pipeline combines several techniques I've explored:

  1. Multi-level hashing: Each parameter (pixel, bounce, dimension) is hashed separately
  2. Alpha shift: Randomizes the starting point of the sequence
  3. Base sequence: Uses 1/√p for distribution
  4. Scrambling: Breaks correlations
  5. Beta shift: Adds a final variation

Pipeline Visualization

0
0

5. The User Interface

C++20
class LDS_Sampler {
public:
    LDS_Sampler(uint64_t s = 0) : seed(s), pixel_index(0), bounce_index(0), dimension_index(0), sample_index(0) {}

    void startPixel(uint32_t pixel_idx) {
        pixel_index = pixel_idx;
        bounce_index = 0;
        dimension_index = 0;
        sample_index = 0;
    }

    void startSample(uint32_t sample_idx) {
        sample_index = sample_idx;
        bounce_index = 0;
        dimension_index = 0;
    }

    void startBounce(uint32_t bounce_idx) {
        bounce_index = bounce_idx;
        dimension_index = 0;
    }

    double get_1d() {
        double value = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index);
        dimension_index++;
        return value;
    }

    std::pair<double, double> get_2d() {
        double x = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index);
        double y = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index + 1);
        dimension_index += 2;
        return {x, y};
    }

    std::vector<double> get_nd(uint32_t n) {
        std::vector<double> values(n);
        for (uint32_t i = 0; i < n; ++i) {
            values[i] = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index + i);
        }
        dimension_index += n;
        return values;
    }
};

Hierarchical Organization

This sampler organizes samples in hierarchy: • Pixel: Each pixel of the image has its own sequence • Sample: Several samples per pixel (anti-aliasing) • Bounce: Each light bounce is independent • Dimension: Coordinates for directions, positions, etc.

Practical Usage

C++20
LDS_Sampler sampler(42);

for (uint32_t pixel = 0; pixel < width * height; ++pixel) {
    sampler.startPixel(pixel);

    for (uint32_t s = 0; s < samples_per_pixel; ++s) {
        sampler.startSample(s);

        auto [u, v] = sampler.get_2d();
        Ray ray = camera.getRay(u, v);

        for (uint32_t bounce = 0; bounce < max_bounces; ++bounce) {
            sampler.startBounce(bounce);

            auto [dir_u, dir_v] = sampler.get_2d();
            vec3 direction = sampleHemisphere(dir_u, dir_v);

            float russian_roulette = sampler.get_1d();
        }
    }
}
Usage Guidelines
  • Call startPixel for each new pixel
  • Use startSample for anti-aliasing
  • Call startBounce for each bounce
  • Consume dimensions in order
  • Reuse the same sampler for the entire image

Performance and Optimizations

Cache-Friendly

Sequential access to data, no large tables

🔧

SIMD-Ready

Can generate several samples in parallel

💾

Minimal State

Only 5 integers of state, easy to serialize

🎯

Deterministic

Reproducible results with the same seed

Concrete advantages of this sampler

Superior performance

  • Direct generation without expensive Sobol tables
  • Cache-friendly: sequential data access
  • SIMD-ready: can generate several samples in parallel
  • Minimal branching: code optimized for GPUs

🎯 Improved quality

  • Fast convergence thanks to 1/√p sequences
  • Advanced scrambling avoids visible correlations
  • Uniform distribution even in high dimension
  • No repetitive patterns in renders

Bonus: Simplicity of implementation

Unlike Sobol samplers which require precomputed tables and complex algorithms, this sampler boils down to a few hash functions and simple mathematical calculations. Ideal for integration into existing render engines.

C++20 - Full Code
#include <iostream>
#include <vector>
#include <cmath>
#include <random>
#include <iomanip>
#include <chrono>
#include <algorithm>
#include <numeric>
#include <cassert>

class LDS_Sampler {
private:
    static constexpr double PHI = 1.618033988749895;
    static constexpr double PHI_CONJ = 0.618033988749895;
    static constexpr double SQRT2 = 1.4142135623730951;
    static constexpr double SQRT3 = 1.7320508075688772;
    static constexpr double SQRT5 = 2.23606797749979;

    static constexpr uint32_t PRIMES[32] = {
        2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53,
        59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131
    };

    uint64_t seed;
    uint32_t pixel_index;
    uint32_t bounce_index;
    uint32_t dimension_index;
    uint32_t sample_index;

    inline uint64_t hash64(uint64_t x) const {
        x ^= x >> 33;
        x *= 0xff51afd7ed558ccdULL;
        x ^= x >> 33;
        x *= 0xc4ceb9fe1a85ec53ULL;
        x ^= x >> 33;
        return x;
    }

    inline uint32_t hash32(uint32_t x) const {
        x ^= x >> 16;
        x *= 0x85ebca6b;
        x ^= x >> 13;
        x *= 0xc2b2ae35;
        x ^= x >> 16;
        return x;
    }

    inline double frac(double x) const {
        return x - std::floor(x);
    }

    inline bool isPrime(uint32_t n) const {
        if (n <= 1) return false;
        if (n <= 3) return true;
        if (n % 2 == 0 || n % 3 == 0) return false;
        for (uint32_t i = 5; i * i <= n; i += 6) {
            if (n % i == 0 || n % (i + 2) == 0) return false;
        }
        return true;
    }

    inline uint32_t permute(uint32_t i, uint32_t l, uint32_t p) const {
        uint32_t w = l - 1;
        w |= w >> 1;
        w |= w >> 2;
        w |= w >> 4;
        w |= w >> 8;
        w |= w >> 16;
        do {
            i ^= p;
            i *= 0xe170893d;
            i ^= p >> 16;
            i ^= (i & w) >> 4;
            i ^= p >> 8;
            i *= 0x0929eb3f;
            i ^= p >> 23;
            i ^= (i & w) >> 1;
            i *= 1 | p >> 27;
            i *= 0x6935fa69;
            i ^= (i & w) >> 11;
            i *= 0x74dcb303;
            i ^= (i & w) >> 2;
            i *= 0x9e501cc3;
            i ^= (i & w) >> 2;
            i *= 0xc860a3df;
            i &= w;
            i ^= i >> 5;
        } while (i >= l);
        return (i + p) % l;
    }

    inline uint32_t scramble(uint32_t value, uint32_t bits, uint32_t seed) const {
        uint32_t n = 1u << bits;
        uint32_t digit = static_cast<uint32_t>(value * n);
        digit = permute(digit, n, seed);
        return digit;
    }

    double generateCoordinate(uint32_t pixel, uint32_t bounce, uint32_t sample, uint32_t dim) const {
        uint64_t h1 = hash64(seed);
        h1 = hash64(h1 + pixel * 0x45d9f3b3335b369ULL);
        h1 = hash64(h1 + bounce * 0x86fe3c167e5e2d7ULL);
        h1 = hash64(h1 + dim * 0x9e3779b97f4a7c15ULL);

        uint64_t h2 = hash64(h1 + sample);

        double alpha = frac(h1 * 5.42101086242752217e-20);
        double beta = frac(h2 * 5.42101086242752217e-20);

        double base;
        if (dim < 32) {
            base = 1.0 / std::sqrt(static_cast<double>(PRIMES[dim]));
        } else {
            uint32_t p = 131 + (dim - 31) * 6;
            while (!isPrime(p)) ++p;
            base = 1.0 / std::sqrt(static_cast<double>(p));
        }

        double value = frac(alpha + sample * base);

        uint32_t scramble_seed = hash32(static_cast<uint32_t>(h2 ^ (h2 >> 32)));
        uint32_t scrambled = scramble(static_cast<uint32_t>(value * (1u << 20)), 20, scramble_seed);
        value = scrambled / static_cast<double>(1u << 20);

        value = frac(value + beta * 0.5);

        return value;
    }

public:
    LDS_Sampler(uint64_t s = 0)
        : seed(s), pixel_index(0), bounce_index(0), dimension_index(0), sample_index(0) {}

    void setSeed(uint64_t s) { seed = s; }

    void startPixel(uint32_t pixel_idx) {
        pixel_index = pixel_idx;
        bounce_index = 0;
        dimension_index = 0;
        sample_index = 0;
    }

    void startSample(uint32_t sample_idx) {
        sample_index = sample_idx;
        bounce_index = 0;
        dimension_index = 0;
    }

    void startBounce(uint32_t bounce_idx) {
        bounce_index = bounce_idx;
        dimension_index = 0;
    }

    void resetDimension() {
        dimension_index = 0;
    }

    double get_1d() {
        double value = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index);
        dimension_index++;
        return value;
    }

    std::pair<double, double> get_2d() {
        double x = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index);
        double y = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index + 1);
        dimension_index += 2;
        return {x, y};
    }

    std::vector<double> get_nd(uint32_t n) {
        std::vector<double> values(n);
        for (uint32_t i = 0; i < n; ++i) {
            values[i] = generateCoordinate(pixel_index, bounce_index, sample_index, dimension_index + i);
        }
        dimension_index += n;
        return values;
    }

    uint32_t getCurrentDimension() const { return dimension_index; }
    uint32_t getCurrentBounce() const { return bounce_index; }
    uint32_t getCurrentPixel() const { return pixel_index; }
    uint32_t getCurrentSample() const { return sample_index; }
};

Concrete Results

This LDS sampler combines ideas from number theory, cryptographic hashing, and numerical optimization to create an efficient sampling tool.

The implementation is designed for path tracers, particle systems, or other algorithms requiring high-quality quasi-random sampling.

Sample Render with this Sampler

3D render using the LDS Sampler - Metallic spheres in a Cornell Box

Render Details

Technique: Path tracing with this LDS sampler
Resolution: 1600x1600 pixels
Samples: 128 samples per pixel
Materials: Metal, glass, diffuse

This image demonstrates the sampler's ability to produce clean renders with fast convergence. Note the absence of structured noise and the quality of reflections and refractions.