Build a Thread Pool in C
Goal: Implement a fixed-size thread pool with a work queue using pthreads, mutexes, and condition variables. Submit tasks and process them concurrently.
Prerequisites: Concurrency and Synchronization, Processes and Threads, Memory Allocation
The Problem
Creating a new thread per task is expensive: each thread needs its own stack, and creating and tearing it down costs kernel work. A thread pool creates N worker threads once, then feeds them work through a shared queue. Many web servers, databases, and game engines handle concurrent work this way.
Main thread:                   Worker threads:
submit(task_a) ──→ [queue] ──→ thread 0: executes task_a
submit(task_b) ──→ [queue] ──→ thread 1: executes task_b
submit(task_c) ──→ [queue] ──→ thread 0: executes task_c (after task_a finishes)
Step 1: Data Structures
// threadpool.h
#ifndef THREADPOOL_H
#define THREADPOOL_H
#include <pthread.h>
#include <stdbool.h>
typedef void (*task_fn)(void *arg);
typedef struct task {
    task_fn fn;
    void *arg;
    struct task *next;
} task;
typedef struct {
    pthread_t *threads;
    int nthreads;
    task *head;             // queue front (dequeue here)
    task *tail;             // queue back (enqueue here)
    pthread_mutex_t lock;   // protects head, tail, and shutdown
    pthread_cond_t notify;  // signaled when work arrives or on shutdown
    bool shutdown;
} threadpool;
threadpool *threadpool_create(int nthreads);
void threadpool_submit(threadpool *pool, task_fn fn, void *arg);
void threadpool_destroy(threadpool *pool);
#endif
Step 2: Worker Thread Function
Each worker loops: lock → wait for work → dequeue → unlock → execute.
// threadpool.c
#include "threadpool.h"
#include <stdlib.h>
#include <stdio.h>
static void *worker(void *arg) {
    threadpool *pool = arg;
    while (1) {
        pthread_mutex_lock(&pool->lock);
        // Wait until there's work or shutdown
        while (pool->head == NULL && !pool->shutdown)
            pthread_cond_wait(&pool->notify, &pool->lock);
        // Exit only once the queue is drained, so queued tasks still run
        if (pool->shutdown && pool->head == NULL) {
            pthread_mutex_unlock(&pool->lock);
            break;
        }
        // Dequeue a task
        task *t = pool->head;
        pool->head = t->next;
        if (pool->head == NULL)
            pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);
        // Execute outside the lock so other workers can dequeue meanwhile
        t->fn(t->arg);
        free(t);
    }
    return NULL;
}
Why while and not if
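pthread_cond_wait can wake up spuriously (without a signal). Even after a real signal, another worker may dequeue the task before the woken thread reacquires the lock. The while loop re-checks the condition in both cases. This is the standard condition variable pattern.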
Step 3: Create and Destroy
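threadpool *threadpool_create(int nthreads) {
    if (nthreads <= 0)
        return NULL;
    // calloc zeroes head, tail, shutdown, and nthreads
    threadpool *pool = calloc(1, sizeof(threadpool));
    if (pool == NULL)
        return NULL;
    pool->threads = malloc((size_t)nthreads * sizeof(pthread_t));
    if (pool->threads == NULL) {
        free(pool);
        return NULL;
    }
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->notify, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker, pool) != 0)
            break;              // keep the workers that did start
        pool->nthreads++;       // counts only threads that must be joined
    }
    if (pool->nthreads == 0) {  // no worker started: the pool is unusable
        threadpool_destroy(pool);
        return NULL;
    }
    return pool;
}
void threadpool_destroy(threadpool *pool) {
    if (pool == NULL)
        return;
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = true;
    pthread_cond_broadcast(&pool->notify); // wake ALL waiting workers
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    // Workers drain the queue before exiting, so this loop is normally a
    // no-op. It is a safety net and does not free any task's arg.
    task *t = pool->head;
    while (t) {
        task *next = t->next;
        free(t);
        t = next;
    }
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->notify);
    free(pool->threads);
    free(pool);
}
Why broadcast, not signal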
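pthread_cond_signal wakes at least one waiting thread, and in practice usually exactly one. On shutdown, every worker must wake up and check the shutdown flag, otherwise the remaining workers sleep forever and pthread_join never returns. pthread_cond_broadcast wakes them all.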
Step 4: Submit Work
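void threadpool_submit(threadpool *pool, task_fn fn, void *arg) {
    task *t = malloc(sizeof(task));
    if (t == NULL) {
        // Out of memory: the task is dropped. A fuller API would return
        // an error code so the caller can react.
        fprintf(stderr, "threadpool_submit: out of memory\n");
        return;
    }
    t->fn = fn;
    t->arg = arg;
    t->next = NULL;
    pthread_mutex_lock(&pool->lock);
    // Append at the tail; an empty queue also needs head set
    if (pool->tail)
        pool->tail->next = t;
    else
        pool->head = t;
    pool->tail = t;
    pthread_cond_signal(&pool->notify); // wake one idle worker
    pthread_mutex_unlock(&pool->lock);
}
Step 5: Test It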
// main.c
#include "threadpool.h"
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
void compute(void *arg) {
    int id = *(int *)arg;
    // pthread_t is an opaque type. Casting it to unsigned long works on
    // common platforms such as Linux but is not guaranteed by POSIX.
    unsigned long tid = (unsigned long)pthread_self() % 1000;
    printf("[thread %lu] task %d start\n", tid, id);
    usleep(100000); // simulate 100ms of work
    printf("[thread %lu] task %d done\n", tid, id);
    free(arg);
}
int main(void) {
    threadpool *pool = threadpool_create(4);
    if (pool == NULL) {
        fprintf(stderr, "failed to create thread pool\n");
        return 1;
    }
    // Submit 20 tasks to 4 workers
    for (int i = 0; i < 20; i++) {
        int *id = malloc(sizeof(int));
        if (id == NULL)
            break;
        *id = i;
        threadpool_submit(pool, compute, id);
    }
    // Crude wait. threadpool_destroy also lets the workers drain the
    // queue before it joins them, so no queued task is lost.
    sleep(1);
    threadpool_destroy(pool);
    printf("Pool destroyed, all done.\n");
    return 0;
}
gcc -Wall -Wextra -g -pthread -o pool main.c threadpool.c
./pool
# You'll see 4 tasks running concurrently, then the next 4, etc.
valgrind --tool=helgrind ./pool # check for data races
Exercises
- Wait for completion: Add threadpool_wait(pool) that blocks until all submitted tasks finish. Hint: track pending task count with a counter + condition variable.
- Futures/results: Modify submit to return a future handle. Add future_get(future) that blocks until the task completes and returns the result.
- Dynamic sizing: Add threadpool_resize(pool, new_size) that adds or removes worker threads while the pool is running.
- Benchmark: Submit 100,000 lightweight tasks (increment a shared atomic counter). Compare 1, 2, 4, 8 threads. Graph throughput vs thread count. Where does it plateau?
Next: 07 - Bare Metal Blinky on STM32 — cross the hardware boundary.