Build a Mini Shell

Goal: Build a working command-line shell in C that can run programs, handle arguments, and pipe commands — using fork, exec, waitpid, pipe, and dup2.

Prerequisites: Processes and Threads, System Calls, File IO in C, Signals and IPC


What a Shell Actually Does

1. Print prompt
2. Read a line of input
3. Parse it into command + arguments
4. fork() a child process
5. Child: exec() the command (replace itself with the new program)
6. Parent: waitpid() for child to finish
7. Go to 1

That’s it. Bash is this loop plus decades of accumulated features.


Step 1: The Read-Eval Loop

// shell.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
 
#define MAX_LINE 1024
#define MAX_ARGS 64
 
int main(void) {
    char line[MAX_LINE];
 
    while (1) {
        printf("$ ");
        fflush(stdout);
 
        if (!fgets(line, sizeof(line), stdin))
            break;   // EOF (Ctrl+D)
 
        // Strip trailing newline
        line[strcspn(line, "\n")] = '\0';
 
        if (strlen(line) == 0)
            continue;
 
        // Built-in: exit
        if (strcmp(line, "exit") == 0)
            break;
 
        // TODO: parse and execute
    }
    return 0;
}

Step 2: Parse Input into Arguments

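Split the line on spaces and tabs into an argv-style NULL-terminated array. strtok works in place: it overwrites each separator in line with '\0' and returns pointers into the same buffer, so nothing is copied. Quoting is not handled, so "a b" is split into two tokens.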

int parse(char *line, char **argv) {
    int argc = 0;
    char *token = strtok(line, " \t");
    while (token && argc < MAX_ARGS - 1) {
        argv[argc++] = token;
        token = strtok(NULL, " \t");
    }
    argv[argc] = NULL;   // execvp requires NULL-terminated array
    return argc;
}

For ls -la /tmp, this produces: argv = {"ls", "-la", "/tmp", NULL}.


Step 3: Fork and Exec

void run_command(char **argv) {
    pid_t pid = fork();
 
    if (pid < 0) {
        perror("fork");
        return;
    }
 
    if (pid == 0) {
        // Child process: replace with new program
        execvp(argv[0], argv);
        // If we get here, exec failed
        perror(argv[0]);
        _exit(127);
    }
 
    // Parent: wait for child to finish
    int status;
    waitpid(pid, &status, 0);
 
    if (WIFEXITED(status) && WEXITSTATUS(status) != 0)
        fprintf(stderr, "(exit %d)\n", WEXITSTATUS(status));
}

Why execvp and not execv

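execvp searches the directories listed in PATH for the command, so ls resolves to /usr/bin/ls (or wherever it is installed) automatically. If the name contains a slash, PATH is skipped and the name is used as a path. execv never searches: it requires the path to the executable.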

Why _exit and not exit

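In the child, after a failed exec, use _exit(). fork() gives the child a copy of the parent's stdio buffers, including any output the parent has buffered but not yet written. exit() would flush those copies (and run atexit handlers), and the parent would later flush its own, so the same output could appear twice. _exit() terminates immediately and skips all of that.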


Step 4: Wire It Together

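Add this to the main loop in place of the TODO comment, after the exit check. parse and run_command must be defined (or at least declared) above main: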

        char *argv[MAX_ARGS];
        int argc = parse(line, argv);
        if (argc == 0)
            continue;
 
        // Built-in: cd (can't be external — must change THIS process's directory)
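        if (strcmp(argv[0], "cd") == 0) {
            // No argument: fall back to $HOME, which may itself be unset
            const char *dir = argv[1] ? argv[1] : getenv("HOME");
            if (dir == NULL)
                fprintf(stderr, "cd: HOME not set\n");
            else if (chdir(dir) != 0)
                perror("cd");
            continue;
        }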
 
        run_command(argv);

Why cd must be built-in

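The working directory is per-process state. If cd ran in a forked child, chdir() would change only the child’s directory, and that change would disappear when the child exits. The parent’s directory would be unchanged. So cd must run in the shell process itself.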


Step 5: Test It

gcc -Wall -Wextra -g -o shell shell.c
./shell
$ ls -la
$ echo hello world
$ cd /tmp
$ pwd
$ cat /etc/hostname
$ exit

You now have a working shell.


Step 6: Add Pipes

Support ls | grep .c | wc -l by splitting on | and connecting stdout→stdin between processes.

void run_pipeline(char *line) {
    // Split line on '|'
    char *commands[MAX_ARGS];
    int ncmds = 0;
    char *cmd = strtok(line, "|");
    while (cmd && ncmds < MAX_ARGS) {
        commands[ncmds++] = cmd;
        cmd = strtok(NULL, "|");
    }
 
    if (ncmds == 1) {
        char *argv[MAX_ARGS];
        // Whitespace-only input parses to zero arguments: nothing to run
        if (parse(commands[0], argv) > 0)
            run_command(argv);
        return;
    }
 
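    int prev_fd = -1;   // read end of previous pipe
    int nstarted = 0;   // children actually forked

    for (int i = 0; i < ncmds; i++) {
        char *argv[MAX_ARGS];
        if (parse(commands[i], argv) == 0) {
            // e.g. "ls | " leaves a stage with no command
            fprintf(stderr, "empty command in pipeline\n");
            break;
        }

        int pipefd[2] = {-1, -1};
        if (i < ncmds - 1 && pipe(pipefd) < 0) {   // create pipe for this stage
            perror("pipe");
            break;
        }

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            if (pipefd[0] != -1) {
                close(pipefd[0]);
                close(pipefd[1]);
            }
            break;
        }
        if (pid == 0) {
            // Connect stdin to previous pipe's read end
            if (prev_fd != -1) {
                dup2(prev_fd, STDIN_FILENO);
                close(prev_fd);
            }
            // Connect stdout to this pipe's write end
            if (pipefd[1] != -1) {
                dup2(pipefd[1], STDOUT_FILENO);
                close(pipefd[0]);
                close(pipefd[1]);
            }
            execvp(argv[0], argv);
            perror(argv[0]);
            _exit(127);
        }
        nstarted++;

        // Parent: close used fds
        if (prev_fd != -1) close(prev_fd);
        if (pipefd[1] != -1) close(pipefd[1]);
        prev_fd = pipefd[0];   // save read end for next stage
    }

    // An early break can leave the last read end open
    if (prev_fd != -1) close(prev_fd);

    // Wait for every child that was started
    for (int i = 0; i < nstarted; i++)
        wait(NULL);
}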

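In main, replace everything from char *argv[MAX_ARGS]; through run_command(argv); with a single call to run_pipeline(line). The call has to see the untouched line, because parse() overwrites separators with '\0'. The cd built-in then needs a new home: the ncmds == 1 branch of run_pipeline, before run_command(argv), is the natural place, with continue changed to return.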

How pipes work

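pipe() creates two file descriptors: pipefd[0] for reading, pipefd[1] for writing. dup2(pipefd[1], STDOUT_FILENO) makes the child's stdout refer to the write end, so anything the program prints goes into the pipe. The next process gets pipefd[0] as its stdin through dup2(prev_fd, STDIN_FILENO). Every copy of the write end must be closed once it is no longer needed, in the parent too; otherwise the reader never sees EOF and the pipeline hangs.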


Test Pipes

$ ls | grep shell
shell.c
shell
$ echo hello world | wc -w
2
$ cat /etc/passwd | grep root | head -1
root:x:0:0:root:/root:/bin/bash

Exercises

  1. Background jobs: Support command & — fork and don’t wait. Track background PIDs. Add a jobs built-in to list them.

  2. Redirection: Support command > output.txt and command < input.txt using open() + dup2().

  3. Signal handling: Handle Ctrl+C (SIGINT) in the shell — it should kill the foreground child, not the shell itself. Use sigaction.

  4. Environment variables: Implement export VAR=value (sets in the shell’s environment with setenv) and $VAR expansion in commands.


Next: 05 - TCP Echo Server from Scratch — build a network server using sockets.