Gaussian Splatting Is Eating 3D Reconstruction
The quiet revolution in 3D capture
For the past few years, NeRF (Neural Radiance Fields) dominated the conversation around AI-powered 3D reconstruction. The idea was elegant: train a tiny MLP to memorize a scene from a set of photos, then render it from any angle. The problem? Rendering was painfully slow — seconds per frame, even on a high-end GPU.
Enter 3D Gaussian Splatting (3DGS), introduced at SIGGRAPH 2023. In under two years, it has become the dominant approach for real-time novel-view synthesis. And not by a small margin: rendering is commonly quoted as 100–1000× faster than the original NeRF-style methods, often hitting 60+ FPS at 1080p on a desktop GPU.
What makes it so fast?
The core insight is refreshingly simple: don’t use a neural network at all.
Instead of asking an MLP to predict color and density for every sampled point along a ray, 3DGS represents the scene as a collection of anisotropic 3D Gaussians — essentially tiny, fuzzy ellipsoids floating in space. Each Gaussian carries:
- Position (mean) — where it lives in 3D space
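- Covariance (3×3 matrix) — how it’s stretched and oriented; in practice this is stored as a per-axis scale plus a rotation quaternion
- Opacity (α) — how opaque it is (0 is fully transparent, 1 is fully opaque)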
- Spherical harmonics — view-dependent color
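To make the attribute list concrete, here is a minimal sketch of one Gaussian as a plain Python record. It assumes the factorization used in the original 3DGS paper, where the covariance is rebuilt as Σ = R S Sᵀ Rᵀ from a scale vector and a rotation quaternion. The class and function names are hypothetical, chosen for illustration only.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Gaussian:
    mean: tuple                 # (x, y, z) position in world space
    scale: tuple                # (sx, sy, sz) extent along each local axis
    rotation: tuple             # quaternion (w, x, y, z)
    opacity: float              # alpha in [0, 1]
    sh: list = field(default_factory=list)  # spherical-harmonic colour coefficients


def quat_to_matrix(q):
    """Convert a (w, x, y, z) quaternion to a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalise so R is a pure rotation
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(g):
    """Rebuild the 3x3 covariance as (R S)(R S)^T, with S = diag(scale)."""
    R = quat_to_matrix(g.rotation)
    M = [[R[i][j] * g.scale[j] for j in range(3)] for i in range(3)]
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Usage: an axis-aligned ellipsoid stretched along z.
g = Gaussian(mean=(0, 0, 0), scale=(1.0, 2.0, 3.0),
             rotation=(1.0, 0.0, 0.0, 0.0), opacity=0.8)
sigma = covariance(g)
```

Storing scale and rotation rather than the raw matrix keeps the covariance symmetric and positive semi-definite by construction, which matters once gradient descent starts nudging the values.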
Rendering becomes a rasterization problem: project all Gaussians onto the image plane, sort them front-to-back, and alpha-blend. This maps beautifully to modern GPU pipelines. No MLP evaluation per sample. No volumetric ray marching. Just projection, sorting, and blending.
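The blending step is easier to see in code. Below is a minimal sketch of front-to-back alpha compositing for a single pixel, assuming projection has already happened and each splat arrives as a (depth, alpha, rgb) tuple, where alpha is the Gaussian's opacity multiplied by its 2D falloff at that pixel. The function name and the early-exit threshold are illustrative assumptions, not a real renderer's API.

```python
def composite_pixel(splats, min_transmittance=1e-4):
    """Blend splats covering one pixel, nearest first.

    splats: iterable of (depth, alpha, (r, g, b)).
    Returns the blended colour and the remaining transmittance.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for depth, alpha, rgb in sorted(splats, key=lambda s: s[0]):
        weight = alpha * transmittance
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # pixel is effectively opaque; later splats are hidden
    return tuple(color), transmittance


# Usage: a half-transparent red splat in front of an opaque blue one.
pixel, remaining = composite_pixel([
    (2.0, 1.0, (0.0, 0.0, 1.0)),
    (1.0, 0.5, (1.0, 0.0, 0.0)),
])
```

A real rasterizer does this per tile on the GPU with one global sort, but the per-pixel arithmetic is the same running product of (1 - alpha) terms.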
How training works
The optimization loop is straightforward:
- Start with a sparse point cloud (from SfM via COLMAP)
- Initialize Gaussians at each point
- Render from a training viewpoint
- Compare to the ground-truth photo with a combined L1 + D-SSIM loss
- Backpropagate through the rasterizer
- Adaptive density control — clone small Gaussians in under-reconstructed regions, split large ones in over-reconstructed regions, and prune low-opacity ones
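As a rough illustration of the loss step, the sketch below combines L1 with a structural term as (1 - λ)·L1 + λ·(1 - SSIM), using λ = 0.2 as in the original paper. To stay short it makes a loud simplification: SSIM is computed once over the whole image (a single global window) on a flat list of grayscale values in [0, 1], whereas real implementations use a sliding Gaussian window per channel. All function names are hypothetical.

```python
def l1_loss(rendered, target):
    """Mean absolute difference between two equal-length pixel lists."""
    return sum(abs(r - t) for r, t in zip(rendered, target)) / len(rendered)


def global_ssim(a, b, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over a single global window (simplification of windowed SSIM)."""
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


def photometric_loss(rendered, target, lam=0.2):
    """Weighted sum of L1 and (1 - SSIM); lam balances the two terms."""
    return (1 - lam) * l1_loss(rendered, target) + lam * (
        1 - global_ssim(rendered, target))


# Usage: a render that inverts the target's gradient scores badly.
loss = photometric_loss([0.0, 0.5, 1.0], [1.0, 0.5, 0.0])
```

The L1 term pushes per-pixel colours to match, while the SSIM term penalizes differences in local structure that L1 alone tends to blur away.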
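The adaptive control is key. At regular intervals (every 100 iterations in the original paper), the algorithm looks at the view-space positional gradient of each Gaussian. A large gradient signals a region that is poorly reconstructed: small Gaussians there are cloned to fill gaps, and large ones are split into smaller ones to add detail. Gaussians whose opacity has dropped close to zero are pruned. This lets the representation grow where the scene needs it and cull where it doesn’t.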
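The decision rule can be sketched in a few lines. This is a simplified illustration, not the reference implementation: the record type, function name, and threshold values are assumptions for the example, and a real trainer would go on to create the new Gaussians and update optimizer state.

```python
from dataclasses import dataclass


@dataclass
class SplatStats:
    grad_norm: float   # average view-space positional gradient magnitude
    max_scale: float   # largest axis length of the Gaussian
    opacity: float     # current alpha in [0, 1]


def density_action(s, grad_threshold=0.0002, scale_threshold=0.01,
                   min_opacity=0.005):
    """Return 'prune', 'keep', 'clone', or 'split' for one Gaussian.

    Threshold defaults are illustrative; tune them per scene.
    """
    if s.opacity < min_opacity:
        return "prune"            # nearly invisible, so remove it
    if s.grad_norm < grad_threshold:
        return "keep"             # already well fitted
    if s.max_scale > scale_threshold:
        return "split"            # too big for the detail it covers
    return "clone"                # too small to cover the gap alone


# Usage: classify a handful of Gaussians after a densification interval.
actions = [density_action(s) for s in (
    SplatStats(grad_norm=0.001, max_scale=0.002, opacity=0.6),
    SplatStats(grad_norm=0.001, max_scale=0.5, opacity=0.6),
    SplatStats(grad_norm=0.00001, max_scale=0.5, opacity=0.6),
    SplatStats(grad_norm=0.001, max_scale=0.5, opacity=0.001),
)]
```

Checking opacity first means a nearly transparent Gaussian is removed even if its gradient is large, so the budget goes to splats that contribute to the image.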
Why it matters for XR
Gaussian Splatting lands at a perfect moment for spatial computing:
- Apple Vision Pro and Meta Quest 3 can capture real environments and reconstruct them as 3DGS scenes
- Real-time rendering at native headset framerates (72–120 FPS) makes pre-captured scenes feel like native geometry
- Compact storage — a typical scene is roughly 50–200 MB of Gaussians, which can be smaller than a mesh + texture pipeline of comparable photorealism
- Easy integration with rasterization-based engines means Gaussian Splatting renderers are popping up inside Unity, Unreal, and WebGPU
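A quick back-of-envelope check on the storage figure. The sketch assumes uncompressed float32 attributes and degree-3 spherical harmonics: 3 floats for position, 3 for scale, 4 for the rotation quaternion, 1 for opacity, and 3 × 16 = 48 SH coefficients, or 59 floats per Gaussian. File formats may add extra fields, so treat the numbers as approximate; the function names are hypothetical.

```python
def gaussian_bytes(sh_degree=3, bytes_per_float=4):
    """Bytes for one Gaussian under the layout assumed above."""
    sh_coeffs = 3 * (sh_degree + 1) ** 2   # RGB x (degree + 1)^2 basis functions
    floats = 3 + 3 + 4 + 1 + sh_coeffs     # position, scale, rotation, opacity, SH
    return floats * bytes_per_float


def scene_megabytes(num_gaussians, sh_degree=3, bytes_per_float=4):
    """Approximate uncompressed scene size in MB (10^6 bytes)."""
    return num_gaussians * gaussian_bytes(sh_degree, bytes_per_float) / 1e6


# Usage: size of a half-million-Gaussian scene, full SH versus flat colour.
full_sh = scene_megabytes(500_000)
flat_colour = scene_megabytes(500_000, sh_degree=0)
```

Under these assumptions, 50–200 MB corresponds to roughly 0.2 to 0.85 million Gaussians, and most of each Gaussian's footprint is the spherical harmonics, which is why compression work tends to target them first.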
The catch
It’s not all perfect. The primary pain points:
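- View-dependent effects can be inconsistent — reflections and transparency are not modeled explicitly; they are baked into each Gaussian’s spherical harmonics, which only capture smooth variation with view direction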
- Geometry extraction is non-trivial — Gaussians don’t form a watertight surface, so you can’t easily use them for collision or physics
- Storage grows with scene complexity — a detailed city block might need gigabytes
- Editability — you can’t easily delete a chair from a 3DGS scene without artifacts
What’s next?
The research community is moving fast. Some exciting directions:
- 4D Gaussian Splatting for dynamic scenes (video)
- Compressed representations using hash grids and anchor-based encoding
- Surface-aligned Gaussians (SuGaR) for better geometry
- Gaussian SLAM — real-time mapping on headsets
- GaussianAnything — extending to generative models
If 2023 was the year Gaussian Splatting arrived and 2024 was the year it matured, 2025–2026 are the years it ships in real products. Apple, Meta, and Google are all investing heavily. Don’t sleep on it.
Have you worked with Gaussian Splatting? Planning to? I’ll be posting more deep dives on real-time graphics soon — stay tuned.