File Systems

A file system organizes data on a storage device into named files and directories. It maps human-readable paths to blocks of data on disk, tracking metadata like permissions, timestamps, and ownership.

Why It Matters

Every program reads and writes files. Understanding how file systems work explains why fsync matters for databases, why deleting a 1GB file is nearly instant, why hard links exist, and how journaling keeps the file system consistent after a power failure.

Inodes and Directory Entries

An inode stores all metadata about a file except its name:

Inode 42:
  type: regular file
  size: 8192 bytes
  mode: 0644 (rw-r--r--)
  uid/gid: 1000/1000
  timestamps: atime, mtime, ctime
  link count: 1
  data blocks: [100, 101] (or extents)

A directory is a file mapping names → inode numbers:

Directory inode 10:
  "."         → inode 10
  ".."        → inode 2
  "hello.txt" → inode 42
  "notes.md"  → inode 57

This is why renaming a file within the same directory is fast: it only updates the directory entry and never touches the data blocks. A hard link is a second directory entry pointing to the same inode, and each hard link increments that inode's link count.
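The inode and link-count behavior can be observed from user space. The sketch below is illustrative only: the function name link_demo and the file names are made up for the example, and it assumes a POSIX file system that supports hard links.

```python
import os
import tempfile


def link_demo():
    """Create a file, hard-link it, rename it, and report what the inode shows."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "hello.txt")
        alias = os.path.join(d, "alias.txt")
        renamed = os.path.join(d, "renamed.txt")

        with open(original, "w") as f:
            f.write("hi\n")

        before = os.stat(original)
        os.link(original, alias)        # second directory entry, same inode
        after = os.stat(original)
        same_inode = os.stat(alias).st_ino == before.st_ino

        os.rename(original, renamed)    # only the directory entry changes
        kept_inode = os.stat(renamed).st_ino == before.st_ino

        return {
            "links_before": before.st_nlink,
            "links_after": after.st_nlink,
            "same_inode": same_inode,
            "kept_inode": kept_inode,
        }


if __name__ == "__main__":
    print(link_demo())
```

The link count goes from 1 to 2 after os.link, and the inode number survives the rename because only the name changed.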

ext4 Disk Layout

┌────────────┬──────────────┬──────────────┬─────────────┐
│ Superblock │ Block Group 0│ Block Group 1│    ...      │
│ (metadata) │              │              │             │
└────────────┴──────────────┴──────────────┴─────────────┘

Block Group:
┌────────┬────────┬───────────┬─────────────┐
│ Block  │ Inode  │ Inode     │ Data Blocks │
│ Bitmap │ Bitmap │ Table     │             │
└────────┴────────┴───────────┴─────────────┘

Superblock: filesystem-wide metadata (block size, total blocks, free count), with backup copies kept in some block groups
Block groups: divide the disk into manageable sections so a file's inode and data can be allocated near each other
Extents: contiguous block ranges; ext4 stores “blocks 100-199” as one extent instead of 100 individual pointers
Block size: typically 4KB (matches the page size on most systems)
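To make the extent idea concrete, here is a simplified model of mapping a byte offset to a physical block. It is not ext4's on-disk extent tree format; the Extent type and physical_block function are hypothetical names, and the block numbers are invented.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional, Sequence

BLOCK_SIZE = 4096  # typical block size, assumed for this example


class Extent(NamedTuple):
    logical_start: int   # first file-relative block covered by this extent
    physical_start: int  # first on-disk block of the contiguous run
    length: int          # number of contiguous blocks


def physical_block(extents: Sequence[Extent], byte_offset: int) -> Optional[int]:
    """Map a byte offset within a file to an on-disk block number.

    `extents` must be sorted by logical_start. Returns None if the offset
    falls in a hole (no extent covers it).
    """
    logical = byte_offset // BLOCK_SIZE
    starts = [e.logical_start for e in extents]
    i = bisect_right(starts, logical) - 1   # last extent starting at or before `logical`
    if i < 0:
        return None
    e = extents[i]
    if logical >= e.logical_start + e.length:
        return None
    return e.physical_start + (logical - e.logical_start)


if __name__ == "__main__":
    # One extent describes 100 blocks: logical 0-99 live at physical 100-199.
    layout = [Extent(0, 100, 100), Extent(100, 500, 50)]
    print(physical_block(layout, 99 * BLOCK_SIZE))
```

Two small records describe 150 blocks here, where a per-block pointer scheme would need 150 entries.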

Journaling

Problem: a crash mid-write can leave the filesystem inconsistent (allocated block but no inode pointing to it, or vice versa).

Solution: write-ahead logging. Changes go to a journal first, then to their actual locations.

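1. Write file data to its final location (ext4's default data=ordered mode)
2. Write the metadata changes to the journal
3. Write a commit record to the journal
4. Later, checkpoint: copy the metadata to its final location and free the journal space
5. After a crash, mount replays committed transactions and discards any that lack a commit record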

ext4 journals metadata by default. Full data journaling (data=journal) is safer but slower. Most databases (PostgreSQL, SQLite) implement their own journaling/WAL on top.
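A toy in-memory model can show the core rule of write-ahead logging: on recovery, committed transactions are replayed and uncommitted ones are thrown away. This is a sketch, not ext4's journal (jbd2) format; the ToyJournal class and its keys are invented for illustration.

```python
class ToyJournal:
    """In-memory model of a metadata journal (illustrative only)."""

    def __init__(self):
        self.disk = {}      # final locations: key -> value
        self.journal = []   # transactions: {"updates": {...}, "committed": bool}

    def begin(self, updates):
        """Log the intended changes; `disk` is not touched yet."""
        txn = {"updates": dict(updates), "committed": False}
        self.journal.append(txn)
        return txn

    def commit(self, txn):
        """Write the commit record; the transaction now survives a crash."""
        txn["committed"] = True

    def checkpoint(self):
        """Copy committed changes to their final locations and free journal space."""
        for txn in self.journal:
            if txn["committed"]:
                self.disk.update(txn["updates"])
        self.journal = [t for t in self.journal if not t["committed"]]

    def recover(self):
        """Run at mount after a crash: replay committed, discard the rest."""
        for txn in self.journal:
            if txn["committed"]:
                self.disk.update(txn["updates"])
        self.journal.clear()


if __name__ == "__main__":
    j = ToyJournal()
    t1 = j.begin({"inode 42 size": 8192, "block 100": "allocated"})
    j.commit(t1)
    j.begin({"inode 57 size": 4096})   # crash happens before this one commits
    j.recover()
    print(j.disk)
```

After recover(), only the committed transaction's changes appear in disk, so both of its updates apply together or not at all.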

VFS (Virtual Filesystem)

Linux supports many filesystems through a common abstraction layer:

User:      open("file.txt")
             ↓
Kernel:    VFS (virtual filesystem switch)
             ↓ dispatch based on mount point
           ext4 / xfs / btrfs / nfs / tmpfs / procfs
             ↓
           Block device / network / memory

Everything uses the same syscall interface. open/read/write/close work identically whether the file is on ext4, NFS, or /proc.
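A quick way to see the uniform interface is to read a regular file and a /proc entry with the same helper. This is a sketch: read_head and demo are hypothetical names, and the /proc read is skipped on systems that do not have it (it is Linux-specific).

```python
import os
import tempfile


def read_head(path, n=64):
    """Read up to n bytes using the raw open/read/close calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, n)
    finally:
        os.close(fd)


def demo():
    # A regular file on whatever file system backs the temp directory.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"regular file data\n")
        regular_path = f.name
    try:
        results = {"regular": read_head(regular_path)}
        # Same helper, but the bytes are generated by the kernel on demand.
        if os.path.exists("/proc/self/status"):
            results["proc"] = read_head("/proc/self/status")
        return results
    finally:
        os.unlink(regular_path)


if __name__ == "__main__":
    for name, data in demo().items():
        print(name, data[:20])
```

read_head contains no file-system-specific code; the kernel's VFS layer decides which implementation serves each path.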

Key Operations and Their Cost

Operation	What Happens	Notes
`open()`	Walk path, load inode into inode cache	Cached after first access
`read()`	Map offset → blocks via inode, read from page cache	Most reads hit cache
`write()`	Allocate blocks if needed, write to page cache, journal	Actual disk write is async
`fsync()`	Flush this file's dirty pages and metadata to disk	Data is durable once the call returns
`unlink()`	Remove dir entry, decrement link count	File deleted when link_count=0 AND no open fds
`rename()`	Update directory entries (atomic on same fs)	Used by databases for atomic file replacement
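The fsync and rename rows combine into a common pattern for replacing a file atomically: write a temporary file, fsync it, rename it over the target, then fsync the directory. The sketch below assumes a POSIX system and that the temporary file sits in the same directory (and so the same file system) as the target; atomic_write and the ".tmp" suffix are illustrative choices, not a standard API.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"   # hypothetical naming scheme; same directory as target

    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:                      # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # new data is on disk before the rename
    finally:
        os.close(fd)

    os.replace(tmp_path, path)           # rename(2): atomic within one file system

    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)                 # persist the directory entry change
    finally:
        os.close(dir_fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "config.json")
        atomic_write(target, b'{"version": 1}')
        atomic_write(target, b'{"version": 2}')
        with open(target, "rb") as f:
            return f.read(), sorted(os.listdir(d))


if __name__ == "__main__":
    print(demo())
```

Skipping the first fsync risks a crash leaving the new name pointing at incomplete data; skipping the directory fsync risks losing the rename itself.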

Practical Commands

stat file.txt              # show inode details
ls -i                      # show inode numbers
df -h                      # disk usage per filesystem
du -sh dir/                # directory size
mount                      # list mounted filesystems
debugfs /dev/sda1          # inspect ext4 internals (opens read-only by default; needs root)
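The same numbers that df reports are available programmatically through statvfs. A minimal sketch, assuming a Unix-like system where os.statvfs is available; fs_usage is a made-up helper name.

```python
import os


def fs_usage(path="/"):
    """Return capacity and inode counts for the file system containing `path`."""
    st = os.statvfs(path)
    return {
        "total_bytes": st.f_frsize * st.f_blocks,
        "free_bytes": st.f_frsize * st.f_bavail,   # available to unprivileged users
        "block_size": st.f_bsize,
        "total_inodes": st.f_files,
        "free_inodes": st.f_ffree,
    }


if __name__ == "__main__":
    for key, value in fs_usage("/").items():
        print(f"{key}: {value}")
```

The inode counts are a reminder that a file system can run out of inodes before it runs out of space.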

Related

File IO in C: the syscall interface to files
Signals and IPC: named pipes (FIFOs) are filesystem objects
Memory Management: page cache sits between VFS and disk
