Processes and Threads

A process is a running program — code, data, heap, stack, open file descriptors, and a PID, all wrapped in its own virtual address space. A thread is an execution context within a process — its own stack and registers, but sharing the same memory space with other threads.

Why It Matters

Everything you run is a process. Web servers fork processes, databases use thread pools, shells pipe processes together. Understanding the process model is how you reason about isolation, concurrency, and resource management at the OS level.

Process Memory Layout

High addresses
┌───────────────────┐
│ Stack (grows ↓)   │  ← local variables, return addresses
│                   │
│     [unmapped]    │
│                   │
│ Heap (grows ↑)    │  ← malloc'd memory
├───────────────────┤
│ BSS               │  ← uninitialized globals (zeroed)
├───────────────────┤
│ Data              │  ← initialized globals
├───────────────────┤
│ Text (read-only)  │  ← compiled machine code
└───────────────────┘
Low addresses

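Each process gets its own virtual address space. On x86-64 with 4-level paging, virtual addresses are 48 bits wide (256 TiB in total), and Linux gives user space the lower half, 0 to 2^47. The kernel maps virtual pages to physical frames via page tables.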

fork + exec (How Shells Work)

#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
 
int main(void) {
    pid_t pid = fork();  // create child — copy of parent
 
    if (pid < 0) {
        perror("fork");
        return 1;
    } else if (pid == 0) {
        // Child process (pid == 0 in child's view)
        execvp("ls", (char*[]){"ls", "-l", NULL});  // replace with new program
        perror("exec failed");  // only reached if exec fails
        _exit(1);
    } else {
        // Parent process (pid == child's PID)
        int status;
        waitpid(pid, &status, 0);  // wait for child to finish
        if (WIFEXITED(status))
            printf("child exited with %d\n", WEXITSTATUS(status));
    }
    return 0;
}

fork() returns twice — once in the parent (child’s PID) and once in the child (0). The child is a copy-on-write clone: pages are shared read-only until one process writes, then just that page is copied.

Process Lifecycle

fork() → READY ⇄ RUNNING → exit()/signal → ZOMBIE → parent wait() → REAPED
RUNNING → BLOCKED (waiting for I/O or a lock) → READY
  • Zombie: process exited but parent hasn’t called wait(). Shows as Z in ps. The PID and exit status are kept until reaped.
  • Orphan: parent exits before child. The child is reparented, normally to init/systemd (PID 1) or, on Linux, to the nearest process marked as a subreaper. The new parent reaps it when it exits.

Threads

All threads in a process share heap, globals, and file descriptors. Each thread has its own stack and register state.

#include <pthread.h>
#include <stdio.h>
 
void *worker(void *arg) {
    int id = *(int *)arg;
    printf("Thread %d running\n", id);
    return NULL;
}
 
int main(void) {
    pthread_t threads[4];
    int ids[4];
 
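    for (int i = 0; i < 4; i++) {
        ids[i] = i;  // one slot per thread, so each pointer stays valid
        int err = pthread_create(&threads[i], NULL, worker, &ids[i]);
        if (err != 0) {  // pthread functions return the error code directly
            fprintf(stderr, "pthread_create failed: %d\n", err);
            return 1;
        }
    }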
    for (int i = 0; i < 4; i++) {
        pthread_join(threads[i], NULL);
    }
    return 0;
}

Compile with gcc -pthread program.c.

Process vs Thread

Aspect           Process                                   Thread
---------------  ----------------------------------------  ------------------------------
Memory           Separate address space                    Shared address space
Creation cost    High (fork + COW pages)                   Low (new stack only)
Communication    IPC: pipes, shared memory, sockets        Direct: shared variables
Crash isolation  Yes: one crash doesn’t kill others        No: one crash kills all
Context switch   Expensive (page-table switch, TLB flush)  Cheaper (same address space)
Use case         Isolation (web workers, sandboxing)       Parallelism (thread pools)

Practical: Inspecting Processes

ps aux                     # list all processes
ps -eLf                    # list all threads (-L shows LWP)
cat /proc/PID/maps         # memory layout of a process
cat /proc/PID/status       # threads, memory, state
ls -l /proc/PID/fd/        # open file descriptors