System Calls
The interface between user programs and the kernel. When you call open(), read(), or fork(), a thin libc wrapper triggers a syscall that switches the CPU from user mode to kernel mode, executes privileged code, and returns the result.
Why It Matters
Syscalls are the only way user programs can interact with hardware, files, networks, or other processes. Everything else (malloc, printf, socket connections) eventually funnels through them. Understanding syscalls helps you debug performance issues, write low-level code, and read strace output.
How a Syscall Works
    User space                          Kernel space
    ──────────────────────────────────────────────────
    Your code calls read(fd, buf, n)
      → libc wrapper:
          mov rax, 0      (syscall number: 0 = read)
          mov rdi, fd     (arg 1)
          mov rsi, buf    (arg 2)
          mov rdx, n      (arg 3)
          syscall         (trap to kernel)
                                        → CPU switches to ring 0
                                        → Kernel looks up handler in syscall table
                                        → Executes sys_read()
                                        → Copies data to user buffer
                                        → Returns byte count in rax
      ← libc checks rax
          if negative → set errno, return -1
          else        → return rax

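Arguments go in rdi, rsi, rdx, r10, r8, r9, in that order. The fourth argument uses r10 rather than rcx because the syscall instruction overwrites rcx (with the return address) and r11 (with the saved flags). The return value comes back in rax; values from -4095 to -1 encode -errno.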
Common Syscalls (x86-64 Linux)
| Syscall | Number | Purpose | libc Wrapper |
|---|---|---|---|
| read | 0 | Read from fd | read() |
| write | 1 | Write to fd | write() |
| open | 2 | Open file | open() |
| close | 3 | Close fd | close() |
| mmap | 9 | Map memory/files | mmap() |
| brk | 12 | Set heap break | Used by malloc() |
| ioctl | 16 | Device control | ioctl() |
| socket | 41 | Create socket | socket() |
| fork | 57 | Create child process | fork() |
| execve | 59 | Replace process image | execve() |
| exit | 60 | Terminate process | _exit() |
| kill | 62 | Send signal | kill() |
Full list: ausyscall --dump or /usr/include/asm/unistd_64.h.
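Raw Syscall (No Wrapper)
You can invoke a syscall by number with the generic syscall() function, bypassing the dedicated wrapper. Note that syscall() is itself provided by libc; a call with no libc involvement at all requires inline assembly.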
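    #include <sys/syscall.h>  // SYS_write, SYS_exit
    #include <unistd.h>       // syscall()

    // Write "hello\n" by syscall number instead of calling write()
    int main(void) {
        const char msg[] = "hello\n";
        syscall(SYS_write, 1, msg, sizeof(msg) - 1);  // fd 1 = stdout
        syscall(SYS_exit, 0);                          // does not return
    }

This is roughly what the libc wrappers do internally. It is useful for understanding, but in practice use the dedicated wrappers: they are type-checked and portable across architectures, where syscall numbers and even the set of available syscalls differ.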
errno and Error Handling
Syscalls don't throw exceptions. On failure the libc wrapper returns -1 and sets errno, which is a thread-local variable:
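    #include <errno.h>
    #include <fcntl.h>   // open(), O_RDONLY
    #include <stdio.h>   // printf(), perror()
    #include <string.h>  // strerror()

    int main(void) {
        int fd = open("missing.txt", O_RDONLY);
        if (fd < 0) {
            int err = errno;  // save it: later library calls may overwrite errno
            perror("open");   // shorthand: prints "open: No such file or directory"
            printf("error %d: %s\n", err, strerror(err));
            // "error 2: No such file or directory"
            return 1;
        }
        return 0;
    }

Common errno values: ENOENT (file not found), EACCES (permission denied), ENOMEM (out of memory), EINTR (interrupted by a signal; retry the call).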
Tracing with strace
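strace intercepts every syscall a program makes, which makes it invaluable for debugging. Recent glibc versions implement open() with the openat syscall, so add openat to the filter if no open lines appear: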
    $ strace -e trace=open,read,write cat /etc/hostname
    open("/etc/hostname", O_RDONLY) = 3
    read(3, "myhost\n", 131072) = 7
    write(1, "myhost\n", 7) = 7

    $ strace -c ls # summary: count and time per syscall
    % time    calls syscall
    ------ -------- --------
     42.00       12 write
     28.00        8 openat
     15.00       20 mmap
    ...

| strace Flag | Purpose |
|---|---|
| -e trace=open,read | Filter to specific syscalls |
| -c | Summary with counts and timing |
| -p PID | Attach to running process |
| -f | Follow child processes (after fork) |
| -t | Timestamp each syscall |
Related
- File IO in C: open/read/write are thin syscall wrappers
- Processes and Threads: fork, execve, wait are syscalls
- Memory Allocation: brk and mmap syscalls back malloc
- Memory Management: kernel side of mmap and page faults