File IO in C
In Unix, everything is a file — regular files, pipes, sockets, devices, even /proc entries. File descriptors are small integers that index into a per-process table maintained by the kernel.
Why It Matters
Every program that reads input, writes output, or communicates over a network uses file descriptors. Understanding the POSIX file IO model is foundational for systems programming — it’s the interface between your code and the kernel.
File Descriptors
Standard Descriptors
| fd | Stream | Default |
|---|---|---|
| 0 | stdin | Terminal input |
| 1 | stdout | Terminal output |
| 2 | stderr | Terminal error output |
New descriptors from open() get the lowest available number (typically 3, 4, 5…).
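The lowest-available rule can be checked directly. The sketch below is illustrative only: `reuses_lowest_fd` is a hypothetical helper name, and it assumes a single-threaded process on a system where `/dev/null` exists.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: opens /dev/null twice, closes the first descriptor,
// then opens again. Returns 1 if the new descriptor reused the freed number,
// 0 if it did not, and -1 if any open() failed.
int reuses_lowest_fd(void) {
    int a = open("/dev/null", O_RDONLY);
    if (a < 0) return -1;
    int b = open("/dev/null", O_RDONLY);
    if (b < 0) { close(a); return -1; }

    close(a);                               // frees the lower of the two slots
    int c = open("/dev/null", O_RDONLY);    // expected to land in a's old slot
    int reused = (c < 0) ? -1 : (c == a);

    if (c >= 0) close(c);
    close(b);
    return reused;
}
```

This reuse is what makes the classic close-then-open redirection idiom work, and it is also why a stale descriptor number can silently refer to a different file after a close.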
Open, Read, Write, Close
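```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    // Open for reading
    int fd = open("data.txt", O_RDONLY);
    if (fd < 0) { perror("open data.txt"); exit(1); }

    // Read up to sizeof(buf) bytes; read() may return fewer than requested
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof(buf));
    // n > 0: bytes read, n == 0: EOF, n == -1: error
    if (n < 0) { perror("read"); exit(1); }

    // Open for writing (create if missing, truncate if exists)
    int out = open("copy.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { perror("open copy.txt"); exit(1); }

    // write() can also be short; a full copy would loop until all n bytes are written
    if (write(out, buf, (size_t)n) != n) {
        fprintf(stderr, "write: error or short write\n");
        exit(1);
    }

    close(fd);
    close(out);
    return 0;
}
```
Seeking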
```c
off_t pos = lseek(fd, 0, SEEK_CUR);   // get current position
lseek(fd, 100, SEEK_SET);             // jump to byte 100
lseek(fd, -10, SEEK_END);             // 10 bytes before end
```
Pipes and sockets are not seekable: lseek returns -1 with errno set to ESPIPE.
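As a rough illustration of both points, the sketch below uses lseek to measure a file and to probe whether a descriptor is seekable. The names `fd_size` and `is_seekable` are hypothetical helpers, not part of any standard API.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns the size of the file behind fd by seeking to
// its end, then restores the original offset. Returns -1 on error with errno
// set (ESPIPE for pipes and sockets).
off_t fd_size(int fd) {
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1) return -1;

    off_t end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1) return -1;

    if (lseek(fd, saved, SEEK_SET) == (off_t)-1) return -1;
    return end;
}

// Hypothetical helper: returns 1 if fd supports seeking, 0 otherwise.
int is_seekable(int fd) {
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}
```

For regular files fstat() is usually the simpler way to get a size; the lseek version is shown here only because it exercises all three whence values.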
Buffered vs Unbuffered IO
| API | Buffered? | Notes |
|---|---|---|
| read()/write() | No: one syscall per call | Low-level, precise control |
| fread()/fwrite() | Yes: stdio buffer (implementation-defined size, commonly 4-8 KB) | Fewer syscalls, faster for many small operations |
| printf() | Line-buffered to a terminal, fully buffered to a file or pipe | Convenient formatting |
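fflush(stdout) writes any buffered output to the underlying descriptor immediately. Call it before fork(): otherwise the child inherits a copy of the unflushed buffer, and the same text is written twice when both processes flush.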
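The duplication is easy to reproduce. The following is a minimal sketch, assuming a fully buffered stream such as a regular file; `print_then_fork` is a hypothetical name chosen for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: writes msg to a stdio stream, forks, and has both
// processes flush. If flush_first is nonzero the buffer is emptied before
// fork(), so the child inherits nothing to flush. Returns 0 on success.
int print_then_fork(FILE *out, const char *msg, int flush_first) {
    if (fputs(msg, out) == EOF) return -1;        // sits in the stdio buffer
    if (flush_first && fflush(out) == EOF) return -1;

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fflush(out);   // child flushes its own copy of the buffer
        _exit(0);      // _exit() skips atexit handlers and further flushing
    }

    if (waitpid(pid, NULL, 0) < 0) return -1;
    return fflush(out) == EOF ? -1 : 0;           // parent flushes its copy
}
```

With flush_first set to 0 and a file-backed stream, the message ends up in the file twice; with flush_first set to 1 it appears once.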
Redirection with dup2
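```c
// Redirect stdout to a file
int logfd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
if (logfd < 0) { perror("open"); exit(1); }
if (dup2(logfd, STDOUT_FILENO) < 0) { perror("dup2"); exit(1); }  // fd 1 now points to log.txt
close(logfd);                       // original fd no longer needed
printf("this goes to log.txt\n");
```
This is how shells implement > redirection: the shell forks, calls dup2 in the child, and then execs the command.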
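A program that wants its terminal back afterwards can save the original descriptor with dup() first. The sketch below shows one way to do that; `print_to_file` is a hypothetical helper name, and the flush calls are there because stdio buffering sits on top of the descriptor being swapped.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: temporarily points stdout at `path`, prints msg,
// then restores the original stdout. Returns 0 on success, -1 on error.
int print_to_file(const char *path, const char *msg) {
    fflush(stdout);                       // keep earlier buffered text out of the file

    int saved = dup(STDOUT_FILENO);       // remember where fd 1 pointed
    if (saved < 0) return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { close(saved); return -1; }

    if (dup2(fd, STDOUT_FILENO) < 0) { close(fd); close(saved); return -1; }
    close(fd);                            // fd 1 now refers to the file

    fputs(msg, stdout);
    fflush(stdout);                       // push the text out while fd 1 is still the file

    int rc = (dup2(saved, STDOUT_FILENO) < 0) ? -1 : 0;   // restore stdout
    close(saved);
    return rc;
}
```

A call such as `print_to_file("log.txt", "hello\n")` leaves the text in log.txt, and later printf calls go to the original destination again.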
Memory-Mapped File IO
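```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); exit(1); }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); exit(1); }
    // Map entire file into memory (mmap fails with EINVAL if the file is empty)
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); exit(1); }
    close(fd); // fd can be closed after mmap
    // Access file contents like an array
    printf("first byte: %c\n", data[0]);
    printf("last byte: %c\n", data[st.st_size - 1]);
    munmap(data, st.st_size);
    return 0;
}
```
mmap lets the kernel handle caching via page faults, which is often faster than read() for random access patterns on large files.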
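The same mapping pattern can be wrapped in a reusable function. This is a sketch under stated assumptions rather than a tuned implementation: `count_byte_mmap` is a hypothetical name, and the zero-length case is handled separately because mmap rejects a length of 0.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: counts occurrences of `target` in the file at `path`
// using a read-only private mapping. Returns -1 on error.
long count_byte_mmap(const char *path, unsigned char target) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   // nothing to map

    size_t len = (size_t)st.st_size;
    unsigned char *data = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                                      // the mapping keeps the file alive
    if (data == MAP_FAILED) return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] == target) count++;             // pages are faulted in on first touch
    }

    munmap(data, len);
    return count;
}
```

For example, `count_byte_mmap("data.bin", '\n')` would return the number of newline bytes in the file.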
Pipes
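```c
// Fragment: needs <stdio.h>, <stdlib.h>, <sys/wait.h>, <unistd.h>
int pipefd[2];
if (pipe(pipefd) < 0) { perror("pipe"); exit(1); }  // pipefd[0] = read end, pipefd[1] = write end
pid_t pid = fork();
if (pid < 0) { perror("fork"); exit(1); }
if (pid == 0) {
    // Child: writes, so close the unused read end
    close(pipefd[0]);
    write(pipefd[1], "hello", 5);
    close(pipefd[1]);
    _exit(0);
} else {
    // Parent: reads, so close the unused write end
    close(pipefd[1]);
    char buf[16];
    ssize_t n = read(pipefd[0], buf, sizeof(buf) - 1);
    if (n >= 0) buf[n] = '\0';   // read() does not null-terminate
    close(pipefd[0]);
    waitpid(pid, NULL, 0);       // reap the child
}
```
Related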
- System Calls: open, read, and write are thin wrappers around syscalls
- Processes and Threads: the fd table is per-process and is inherited across fork()
- Socket Programming: sockets are file descriptors too
- Memory Management: mmap maps files into the virtual address space