Memory Management
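Virtual memory gives every process its own private address space (48-bit virtual addresses on x86-64 with 4-level paging, so 2^48 bytes = 256 TiB, split between a user half and a kernel half). The MMU (Memory Management Unit) translates virtual addresses to physical addresses using page tables, with the TLB caching recent translations for speed.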
Why It Matters
Virtual memory enables process isolation (one process can’t read another’s memory), lets you run programs larger than physical RAM (demand paging), and makes fork() fast (copy-on-write). Understanding it explains segfaults, mmap behavior, and why a process can show 2GB of virtual size (or an RSS inflated by shared pages) while the system is fine.
Virtual → Physical Translation
Virtual address (48 bits on x86-64):
┌──────┬──────┬──────┬──────┬────────────┐
│ PML4 │ PDPT │  PD  │  PT  │   Offset   │
│  9b  │  9b  │  9b  │  9b  │    12b     │
└──┬───┴──┬───┴──┬───┴──┬───┴─────┬──────┘
   │      │      │      │         │
   └──────┴───┬──┴──────┘         │
   4-level page table walk        │
              │                   │
              ↓                   ↓
   Physical frame number    +   Offset  =  Physical address
Each level is a 512-entry table (9 bits = 512 entries). Leaf entries contain the physical frame number plus permission bits (read/write/execute/user).
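The field layout above can be reproduced with plain shifts and masks. The following is a minimal sketch, assuming 4 KB pages and 4-level paging as in the diagram; the function name `split_virtual_address` is hypothetical, and the code only models the arithmetic rather than touching real page tables.

```python
def split_virtual_address(vaddr: int) -> dict:
    """Split a 48-bit x86-64 virtual address into its page-table indices.

    Assumes 4-level paging with 4 KB pages: four 9-bit indices
    followed by a 12-bit page offset.
    """
    if not 0 <= vaddr < (1 << 48):
        raise ValueError("expected a 48-bit virtual address")
    index_mask = 0x1FF  # 9 bits -> 512 entries per table
    return {
        "pml4": (vaddr >> 39) & index_mask,
        "pdpt": (vaddr >> 30) & index_mask,
        "pd": (vaddr >> 21) & index_mask,
        "pt": (vaddr >> 12) & index_mask,
        "offset": vaddr & 0xFFF,  # 12 bits -> byte within the 4 KB page
    }


def physical_address(frame_number: int, offset: int) -> int:
    """Combine a physical frame number with the page offset."""
    return (frame_number << 12) | offset
```

For example, `split_virtual_address(0x7FFFFFFFFFFF)` yields a PML4 index of 255 and all lower indices at 511, which matches the top of the user half of the address space.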
TLB (Translation Lookaside Buffer)
The TLB is a hardware cache of recent virtual→physical translations. A TLB hit costs ~1 cycle; a TLB miss triggers a page table walk (~10-100 cycles). Flushing the TLB when switching between processes is one reason context switches are expensive.
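To get a feel for how the miss rate affects translation cost, here is a rough back-of-the-envelope model. It assumes every access pays the hit cost and each miss adds one page table walk; the function name and the default cycle counts (taken from the rough figures above) are illustrative assumptions, not measurements.

```python
def average_translation_cycles(miss_rate: float,
                               hit_cycles: float = 1.0,
                               walk_cycles: float = 100.0) -> float:
    """Average cycles per address translation under a simple model.

    Every access pays hit_cycles; a fraction miss_rate of accesses
    additionally pays walk_cycles for the page table walk.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * walk_cycles
```

Under this model a 1% miss rate with a 100-cycle walk doubles the average translation cost compared with a perfect hit rate, which is why workloads with poor locality are so sensitive to TLB misses.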
perf stat -e dTLB-loads,dTLB-load-misses ./program # measure TLB misses
Page Faults
When the CPU accesses a virtual page with no valid mapping:
| Fault Type | Cause | Kernel Action |
|---|---|---|
| Minor | Virtual page is valid but not yet mapped to a frame (first access to mmap’d region) | Map a zeroed page, no I/O |
| Major | Page contents must be read from disk (e.g. swapped out) | Read from swap, expensive |
| Invalid | Access to unmapped address | SIGSEGV → segfault |
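Minor faults can also be observed from inside a process. The sketch below, assuming Linux and Python's standard `resource` and `mmap` modules, maps an anonymous region and touches each page once, then reports how many minor faults the kernel counted. The helper names are hypothetical, and the exact count varies with the kernel configuration (for instance, transparent huge pages can satisfy many 4 KB pages with a single fault).

```python
import mmap
import resource


def minor_faults() -> int:
    """Minor page faults recorded so far for this process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def faults_when_touching(num_pages: int) -> int:
    """Map num_pages of anonymous memory, write one byte per page,
    and return the increase in the minor-fault counter."""
    page_size = mmap.PAGESIZE
    region = mmap.mmap(-1, num_pages * page_size,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    before = minor_faults()
    for page in range(num_pages):
        region[page * page_size] = 1  # first write forces the kernel to map a frame
    after = minor_faults()
    region.close()
    return after - before
```

Calling `faults_when_touching(1024)` on a typical Linux system usually reports a count in the same order of magnitude as the number of pages touched, but treat the number as indicative only, since the interpreter itself also faults in memory.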
/usr/bin/time -v ./program 2>&1 | grep "page faults"
# Major (requiring I/O) page faults: 0
# Minor (reclaiming a frame) page faults: 1234
Copy-on-Write (COW)
fork() doesn’t physically copy memory. Parent and child share all pages, marked read-only. On the first write:
- CPU raises a (minor) page fault — page is read-only
- Kernel sees it’s a COW page
- Kernel copies just that one page (4KB)
- The writer gets its own writable copy; the other process keeps the original page
This is why fork() is O(page table size), not O(memory size).
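The page copying itself is invisible to user code, but its effect is easy to observe: after fork(), a write in the child never shows up in the parent. The following sketch assumes a Unix system with `os.fork` available; the function name `fork_and_modify` is hypothetical. Note that CPython's reference counting writes to object headers, so a Python child dirties more pages than an equivalent C program would.

```python
import os


def fork_and_modify() -> tuple:
    """Fork, let the child overwrite a buffer, and return
    (parent's view, child's view) of that buffer."""
    data = bytearray(b"parent")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write hits a shared page, so the kernel
        # gives the child a private copy before it proceeds.
        os.close(read_fd)
        data[:] = b"child!"
        os.write(write_fd, bytes(data))
        os.close(write_fd)
        os._exit(0)
    # Parent: read what the child saw, then reap the child.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

The parent's buffer still reads `b"parent"` after the child has written `b"child!"` into its own copy.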
Key Memory Regions
High addresses
┌────────────────────┐
│ Kernel space       │ [not accessible from user mode]
├────────────────────┤ 0x7FFF...
│ Stack ↓            │ grows downward, RLIMIT_STACK (8MB default)
│                    │
│ mmap region ↓      │ shared libraries, mmap'd files
│                    │
│ Heap ↑             │ grows upward via brk/sbrk
├────────────────────┤
│ BSS                │ uninitialized globals (zeroed)
├────────────────────┤
│ Data               │ initialized globals
├────────────────────┤
│ Text (r-x)         │ code, read-only + execute
└────────────────────┘
Low addresses
Inspect live: cat /proc/PID/maps shows every mapped region with permissions.
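The same information can be read programmatically. This is a minimal sketch, assuming the usual Linux layout of a maps line (address range, permissions, offset, device, inode, optional path); the `Mapping` type and function names are hypothetical, and the sample values in the usage note are made up for illustration.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # one past the last address
    perms: str   # e.g. "r-xp"
    path: str    # backing file or pseudo-name like "[stack]", "" if anonymous


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/PID/maps into a Mapping."""
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def read_self_maps() -> List[Mapping]:
    """Return all mappings of the current process (Linux only)."""
    with open("/proc/self/maps") as maps_file:
        return [parse_maps_line(line) for line in maps_file if line.strip()]
```

For instance, `parse_maps_line("00400000-00401000 r-xp 00000000 08:01 123456 /usr/bin/program")` gives a 4096-byte executable region, and filtering `read_self_maps()` for `path == "[stack]"` locates the stack.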
$ cat /proc/self/maps
00400000-00401000 r-xp program (text)
00601000-00602000 rw-p program (data)
7f8a1000-7f8c3000 r-xp /lib/libc.so.6
7ffd4000-7ffd6000 rw-p [stack]
Swapping and Page Replacement
When physical memory runs low, the kernel reclaims pages: clean file-backed pages can simply be dropped, while anonymous pages are written to swap (disk). Common replacement policies:
| Algorithm | How | Notes |
|---|---|---|
| LRU | Evict least recently used | Strong baseline but expensive to track exactly |
| Clock (second chance) | Circular list with reference bit | Approximation of LRU, used in practice |
| LFU | Evict least frequently used | Good for some workloads |
Linux uses a two-list approach: active and inactive page lists with aging.
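The Clock (second chance) policy from the table can be sketched in a few lines. This is a textbook-style simulation, not the Linux implementation; it assumes a newly loaded page starts with its reference bit set, and the function name `clock_page_faults` is hypothetical.

```python
from typing import List, Optional, Sequence


def clock_page_faults(references: Sequence[int], num_frames: int) -> int:
    """Simulate Clock replacement and return the number of page faults."""
    frames: List[Optional[int]] = [None] * num_frames
    ref_bit = [False] * num_frames
    hand = 0
    faults = 0
    for page in references:
        if page in frames:
            # Hit: hardware would set the reference bit on access.
            ref_bit[frames.index(page)] = True
            continue
        faults += 1
        # Sweep the hand, clearing reference bits, until a frame
        # without a second chance is found.
        while ref_bit[hand]:
            ref_bit[hand] = False
            hand = (hand + 1) % num_frames
        frames[hand] = page
        ref_bit[hand] = True  # assumption: loaded pages start referenced
        hand = (hand + 1) % num_frames
    return faults
```

With 3 frames and the reference string `[1, 2, 3, 4, 2, 5, 2]`, this simulation reports 5 faults: page 2 is referenced again before the hand returns to it, so it survives the eviction that loads page 5, whereas plain FIFO would evict it and fault a sixth time.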
free -h # check swap usage
swapon --show # list swap devices
vmstat 1 # watch swap-in/swap-out (si/so columns) in real time
Related
- Memory Allocation — user-space malloc/free built on top of mmap/brk
- Processes and Threads — each process has its own address space
- Pointers and Memory — stack vs heap from the programmer’s view
- Containers and Namespaces — memory cgroup limits