Message Queues

Message queues decouple producers from consumers. Instead of one component waiting synchronously for another, the producer writes work to a queue and the consumer processes it when capacity is available.

Why They Matter

Queues absorb spikes, isolate failures, and let slow work happen asynchronously. They are a common answer when a request path includes email sending, file processing, retries, or integration with flaky third-party systems.

What a Queue Gives You

  • Temporal decoupling: producer and consumer do not need to be online at the same moment
  • Backpressure: queue depth shows when consumers cannot keep up
  • Retries: failed jobs can be requeued without blocking the caller
  • Fan-out: one event can feed multiple downstream consumers via topics or streams
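
A minimal in-process sketch of the first two points, using only Python's standard library. The function name run_demo, the job names, and the maxsize value are illustrative assumptions; a real system would use a broker rather than an in-memory queue.

```python
import queue
import threading


def run_demo(job_names):
    """Enqueue jobs first, start the consumer later, return (depth, results)."""
    jobs = queue.Queue(maxsize=100)  # bounded: put() blocks when the queue is full
    results = []

    def consumer():
        while True:
            job = jobs.get()
            if job is None:  # sentinel: no more work
                break
            results.append(job.upper())  # stand-in for slow work

    # The producer runs before any consumer exists (temporal decoupling).
    for name in job_names:
        jobs.put(name)
    depth = jobs.qsize()  # queue depth is the signal that consumers are behind

    jobs.put(None)
    worker = threading.Thread(target=consumer)
    worker.start()
    worker.join()
    return depth, results
```

Calling run_demo(["email-1", "email-2"]) enqueues both jobs while no consumer is running, reports the depth at that moment, and only then drains the queue.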

Core Tradeoffs

Choice          Benefit                 Cost
At-most-once    Simple, no duplicates   Lost work on failure
At-least-once   Safer delivery          Consumers must be idempotent
Exactly-once    Nice abstraction        Usually expensive or narrower than it sounds

For most systems, at-least-once delivery with idempotent consumers is the realistic design.
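
To make that concrete, here is a small simulation of at-least-once delivery feeding an idempotent consumer. The class, the message shape ({"id", "amount"}), and the in-memory set are assumptions for illustration; in practice the set of seen IDs would live in durable storage.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by message ID."""

    def __init__(self):
        self.seen_ids = set()  # assumption: durable storage in a real system
        self.balance = 0

    def handle(self, message):
        msg_id = message["id"]
        if msg_id in self.seen_ids:
            return False  # duplicate delivery: ignore it
        self.balance += message["amount"]
        self.seen_ids.add(msg_id)
        return True


def deliver_at_least_once(consumer, messages):
    """Simulate a broker that delivers every message twice."""
    applied = 0
    for message in messages:
        for _ in range(2):  # original delivery plus one redelivery
            if consumer.handle(message):
                applied += 1
    return applied
```

Every message arrives twice, but the balance only changes once per message ID, which is the property the consumer has to guarantee.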

Queue vs Pub/Sub

  • Work queue: one job should be handled by one worker
  • Pub/sub topic: one event should be seen by many subscribers
  • Log/stream: ordered history that consumers replay at their own offsets
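
The three shapes can be sketched as toy in-memory classes. These are hypothetical illustrations of the delivery semantics only, not the API of any real broker.

```python
from collections import deque


class WorkQueue:
    """Each job is taken by exactly one worker and is then gone."""

    def __init__(self):
        self._jobs = deque()

    def put(self, job):
        self._jobs.append(job)

    def take(self):
        return self._jobs.popleft() if self._jobs else None


class Topic:
    """Each published event is pushed to every current subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, event):
        for callback in self._subscribers:
            callback(event)


class Log:
    """Append-only history; each consumer tracks its own offset."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)

    def read_from(self, offset):
        return self._entries[offset:]
```

Note that only Log can serve the same entry again: WorkQueue.take() removes the job, and Topic.publish() does not retain the event after pushing it.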

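This distinction matters. If you use a work queue where you need replay, you cannot get it: a message is typically removed once it is acknowledged, so there is no history to reprocess or inspect, and observability suffers. If you use a stream where you only need background jobs, you take on offset tracking and other complexity for no benefit.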

Failure Modes to Design For

  • Poison messages that always fail
  • Consumers slower than producers
  • Duplicate delivery after timeout or retry
  • Messages that depend on state that has already changed

Typical mitigations:

  1. Dead-letter queue for repeated failures.
  2. Explicit retry policy with jitter.
  3. Idempotency key or natural dedupe key.
  4. Metrics on queue depth, processing lag, and age of oldest message.
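
Mitigations 1 and 2 can be sketched together as follows. This is one possible shape, assuming exponential backoff with full jitter and a plain list standing in for the dead-letter queue; the function names, the defaults (base, cap, max_attempts), and the injectable sleep parameter are all illustrative choices.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def run_with_retries(message, handler, dead_letters, max_attempts=3, sleep=time.sleep):
    """Call handler(message) up to max_attempts times, then dead-letter it."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True
        except Exception as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
    # Repeated failure: park the message so it stops blocking the queue.
    dead_letters.append({"message": message, "error": repr(last_error)})
    return False
```

The jitter spreads retries out so that many failed jobs do not all hit a recovering dependency at the same instant. Passing a fake sleep (for example a list's append method) makes the retry schedule easy to inspect in tests.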

Minimal Worker Pattern

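def process(order_id: str) -> None:
    # Skip work that a previous delivery already completed.
    if already_done(order_id):
        return
    # Each step must itself be idempotent (for example, keyed on order_id):
    # a crash between steps means the whole function runs again on redelivery.
    charge_customer(order_id)
    reserve_inventory(order_id)
    mark_done(order_id)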

The important part is not the library. The important part is that process() can run twice for the same order without causing a double charge or a double shipment.
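
For a version that runs end to end, here is a sketch with in-memory stand-ins for the four helpers. The dictionaries and sets, and the placeholder amount of 100, are assumptions for illustration; real helpers would call a payment provider and an inventory service, each with its own idempotency key.

```python
charges = {}          # order_id -> amount charged (hypothetical ledger)
reservations = set()  # order_ids with inventory reserved
done = set()          # order_ids fully processed


def already_done(order_id: str) -> bool:
    return order_id in done


def charge_customer(order_id: str) -> None:
    # setdefault keeps the first charge, so a repeated call is a no-op.
    charges.setdefault(order_id, 100)  # 100 is a placeholder amount


def reserve_inventory(order_id: str) -> None:
    reservations.add(order_id)  # adding to a set twice has no extra effect


def mark_done(order_id: str) -> None:
    done.add(order_id)


def process(order_id: str) -> None:
    if already_done(order_id):
        return
    charge_customer(order_id)
    reserve_inventory(order_id)
    mark_done(order_id)
```

Because each helper is keyed on order_id, a redelivery after a crash between the charge and the reservation still ends with exactly one charge and one reservation.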

When Not to Use a Queue

  • The caller truly needs an immediate answer
  • The work is tiny and synchronous code is simpler
  • Ordering requirements are strict and hard to preserve
  • You do not have monitoring for queue buildup and stuck consumers