SLAM

Simultaneous Localization and Mapping — build a map of the environment while simultaneously tracking the robot’s position within it. The fundamental chicken-and-egg problem of mobile robotics: you need a map to localize, but you need your position to build a map.

Why It Matters

Any autonomous robot operating in an unknown environment needs SLAM. Self-driving cars, delivery robots, vacuum cleaners, drones — all use some form of SLAM. It’s the foundation that enables autonomous navigation without pre-built maps or GPS (which doesn’t work indoors).

The Problem

Robot moves:   odometry says "I moved 1m forward"
               (but wheel slip, drift → actual movement is 0.95m at 2° angle)

Robot senses:  sensor sees landmarks (walls, corners, features)
               (but measurements are noisy)

Challenge:     estimate true position AND landmark positions
               from noisy motion + noisy sensors

Errors accumulate over time. Without correction, the estimated position drifts. Loop closure — recognizing a previously visited place — dramatically reduces accumulated error.
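To make the drift concrete, here is a minimal sketch of dead reckoning with a small unmodeled heading bias. The function name and the numbers are illustrative, not from any real robot:

```python
import math

def dead_reckon(n_steps, step=1.0, heading_bias_deg=0.5):
    """Integrate odometry that picks up a small, unmodeled heading error
    each step -- the robot believes it is driving straight."""
    x = y = theta = 0.0
    for _ in range(n_steps):
        theta += math.radians(heading_bias_deg)  # drift the odometry misses
        x += step * math.cos(theta)
        y += step * math.sin(theta)
    return x, y

# Believed endpoint after n steps: (n, 0). The actual path curves away,
# and the gap grows with distance traveled.
for n in (10, 50, 100):
    ax, ay = dead_reckon(n)
    print(f"{n:4d} steps -> position error {math.hypot(ax - n, ay):.2f} m")
```

Half a degree of bias per step is tiny, yet the position error compounds with every step, which is exactly why loop closure is needed.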

Approaches

EKF-SLAM

Maintain one large state vector: [robot_x, robot_y, robot_θ, lm1_x, lm1_y, lm2_x, lm2_y, ...]

The Extended Kalman Filter updates all positions when any landmark is observed. Cross-correlations between landmarks improve estimates globally.

Limitation: the state vector grows with every new landmark, and each update touches the full covariance matrix at O(n²) cost in the number of landmarks — too expensive for environments with thousands of landmarks.

Particle Filter SLAM (FastSLAM)

Each particle represents one hypothesis for the robot’s path. Each particle maintains its own map (independent landmark estimates).

1. Move particles according to motion model (with noise)
2. Each particle updates its map with new observations
3. Weight particles by measurement likelihood
4. Resample: high-weight particles survive, low-weight die

Advantage: handles nonlinear motion models well. Conditioned on a particle's path, the landmark estimates are independent, so each landmark gets its own small 2×2 EKF; with tree-shared maps an update costs O(M log n) for M particles and n landmarks, rather than EKF-SLAM's O(n²).
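Step 4 above is the part that is easiest to get wrong, so here is a sketch of systematic (low-variance) resampling. The particle contents and weights are toy values; in full FastSLAM each particle would carry a path and its own map:

```python
import random

def resample(particles, weights, rng=None):
    """Systematic resampling: draw len(particles) survivors with
    probability proportional to weight, using a single random offset."""
    rng = rng or random.Random(0)  # seeded only to keep the demo repeatable
    n = len(particles)
    step = sum(weights) / n
    u = rng.uniform(0, step)
    survivors, cum, i = [], weights[0], 0
    for _ in range(n):
        while u > cum:
            i += 1
            cum += weights[i]
        survivors.append(particles[i])  # high-weight particles get duplicated
        u += step
    return survivors

# Each particle is one path hypothesis; weights come from step 3.
paths = ["A", "B", "C", "D"]
likelihoods = [0.05, 0.80, 0.10, 0.05]
print(resample(paths, likelihoods))  # mostly copies of "B"
```

Systematic resampling is preferred over drawing n independent samples because it has lower variance: a particle with weight 0.8 is guaranteed roughly 80% of the survivor slots.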

Graph-Based SLAM (Modern Standard)

Build a graph where:

  • Nodes = robot poses at different times
  • Edges = constraints (odometry between consecutive poses, landmark observations, loop closures)

Then optimize the graph to find the pose configuration that best satisfies all constraints.

 P0 ---odometry--- P1 ---odometry--- P2 --- ... --- Pn
  |                 |                                 |
  +---landmark---   +---landmark---     loop closure--+
                                        (P0 ≈ Pn !)

Optimization minimizes the sum of squared constraint errors. Solved with Gauss-Newton or Levenberg-Marquardt. Libraries: g2o, GTSAM, Ceres.

Advantage: can re-optimize the entire trajectory when a loop closure is detected, fixing drift retroactively.
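A toy 1D version shows the whole idea in a few lines. Poses live on a line, edges measure pose differences, and because the residuals are linear in the poses, Gauss-Newton reduces to a single normal-equation solve (H p = b with H = JᵀJ). The function names are made up for this sketch, and the "loop closure" is simulated as one extra constraint between the first and last pose:

```python
def gauss_solve(H, b):
    """Solve H x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(H)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def solve_pose_graph_1d(n_poses, edges):
    """Least-squares pose graph on a line, pose 0 anchored at 0.
    edges: (i, j, z) = measurement z of p_j - p_i (odometry or loop
    closure). Minimizes the sum of squared residuals ((p_j - p_i) - z)."""
    m = n_poses - 1                       # free poses p1 .. p_{n-1}
    H = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for i, j, z in edges:
        for a, sa in ((i, -1.0), (j, 1.0)):   # accumulate J^T J and J^T z
            if a == 0:
                continue
            b[a - 1] += sa * z
            for c, sc in ((i, -1.0), (j, 1.0)):
                if c != 0:
                    H[a - 1][c - 1] += sa * sc
    return [0.0] + gauss_solve(H, b)

# Odometry claims two 1 m steps, but a closure says the total is 1.8 m;
# optimization spreads the 0.2 m of drift across the whole trajectory.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.8)]
print(solve_pose_graph_1d(3, edges))  # [0.0, 0.933..., 1.866...]
```

Real pose graphs are 2D/3D with rotations, which makes the residuals nonlinear; that is where the iterative Gauss-Newton / Levenberg-Marquardt machinery in g2o, GTSAM, and Ceres comes in.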

Visual SLAM

Uses cameras instead of (or in addition to) LiDAR:

| System | Approach | Notes |
| --- | --- | --- |
| ORB-SLAM3 | Feature-based (ORB keypoints) | Mono/stereo/RGB-D, loop closure, map reuse |
| LSD-SLAM | Direct (pixel intensities) | Semi-dense reconstruction, no feature extraction |
| RTAB-Map | Appearance-based loop closure | RGB-D, memory management for large maps |
| VINS-Mono | Visual-inertial | Camera + IMU fusion, works outdoors |

Visual SLAM pipeline:

  1. Feature extraction: detect keypoints (ORB, SIFT) in each frame
  2. Feature matching: track points between frames
  3. Motion estimation: compute camera movement from matched points
  4. Mapping: triangulate 3D positions of tracked points
  5. Loop closure: detect revisited places, correct drift
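Step 2 of the pipeline can be sketched without any vision library. ORB descriptors are binary strings, so matching is a Hamming-distance search; the descriptors below are made-up 8-bit toys (real ORB uses 256 bits), and the ratio-test idea is Lowe's, borrowed from SIFT matching:

```python
def hamming(a, b):
    """Distance between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")

def match_features(desc1, desc2, ratio=0.75):
    """Brute-force matching with Lowe's ratio test: keep a match only if
    the best distance is clearly better than the second best, which
    rejects ambiguous (repetitive-texture) matches."""
    matches = []
    for i, d in enumerate(desc1):
        dists = sorted((hamming(d, e), j) for j, e in enumerate(desc2))
        if len(dists) > 1 and dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))  # (index in frame 1, frame 2)
    return matches

# Toy 8-bit descriptors for two frames.
frame1 = [0b10110010, 0b00001111]
frame2 = [0b10110011, 0b11110000, 0b00001110]
print(match_features(frame1, frame2))  # [(0, 0), (1, 2)]
```

The surviving matches feed step 3: with enough of them, the essential matrix (mono) or a PnP solve (RGB-D/stereo) recovers the camera motion.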

Loop Closure

The most important part of SLAM. When the robot returns to a previously visited area:

  1. Detection: recognize the place (bag-of-words on visual features, scan matching for LiDAR)
  2. Verification: confirm it’s a true match (not a perceptual alias)
  3. Correction: add a constraint to the pose graph, re-optimize

Without loop closure, odometry error grows without bound. With loop closure, the entire trajectory is corrected retroactively.
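The detection step above can be sketched as a similarity search over bag-of-words histograms. The histograms and threshold below are invented for illustration; real systems (e.g. DBoW2-style vocabularies) use much larger vocabularies and learned word weights:

```python
import math

def bow_similarity(h1, h2):
    """Cosine similarity between bag-of-visual-words histograms."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def loop_candidates(current, keyframes, threshold=0.8):
    """Step 1 (detection): flag past keyframes that look like the current
    frame. Step 2 (verification) must still reject perceptual aliases
    before any constraint is added to the pose graph."""
    return [i for i, h in enumerate(keyframes)
            if bow_similarity(current, h) >= threshold]

# Counts of visual words per keyframe; values are made up.
keyframes = [[5, 0, 1, 0], [0, 4, 0, 2], [4, 1, 1, 0]]
current = [5, 1, 1, 0]
print(loop_candidates(current, keyframes))  # [0, 2]
```

Keyframe 1 looks nothing like the current frame and is filtered out cheaply; keyframes 0 and 2 go on to geometric verification.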

Sensor Comparison for SLAM

| Sensor | Cost | Range | Accuracy | Conditions | Typical Use |
| --- | --- | --- | --- | --- | --- |
| 2D LiDAR | ~$500 | 12-30m | ±3cm | All lighting | Indoor robots |
| 3D LiDAR | ~$75k | 100m+ | ±2cm | All lighting | Self-driving cars |
| Mono camera | ~$50 | Vision range | Scale ambiguity | Needs texture/light | Drones, phones |
| Stereo camera | ~$500 | 0.5-20m | ±1-5cm | Needs texture/light | Indoor/outdoor |
| RGB-D (depth) | ~$300 | 0.3-10m | ±1cm | Indoor (IR interference outdoors) | Indoor robots |
| IMU | ~$50 | N/A (motion) | Drifts | Any | Always used as supplement |