SPECIFICATION · CONCEPT BRIEF · DWG 05-SMOOTH-THE-BURST

Smooth the Burst

Real traffic isn't smooth. Queues turn spikes into a steady drip.

§01 The problem with bursts

Servers and databases have a fixed concurrent capacity. When a burst arrives — say, 5× your normal traffic for a few seconds — synchronous designs collapse: in-flight requests pile up past capacity, new arrivals are dropped, and users see errors.

CAUTION
Why provisioning for the peak is expensive
If your peak is 10× the average, you'd need 10× the hardware sitting idle most of the time. Queues let you provision closer to the average and absorb the rest.
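The collapse described above can be sketched as a tiny simulation. All numbers and names here are hypothetical (a service holding at most 100 requests in flight, finishing 20 per tick), chosen only to make the drop visible:

```python
def simulate_synchronous(capacity, drain_per_tick, arrivals):
    """Fixed-capacity synchronous service: excess arrivals are refused."""
    in_flight, dropped = 0, 0
    for arriving in arrivals:
        in_flight = max(0, in_flight - drain_per_tick)  # finish some work
        accepted = min(arriving, capacity - in_flight)  # only what fits
        dropped += arriving - accepted                  # the rest is dropped
        in_flight += accepted
    return dropped

steady = [20] * 5                       # normal load matches the drain rate
burst = steady + [100] * 3 + steady     # a 5x spike for three ticks
print(simulate_synchronous(100, 20, burst))  # → 160 requests dropped
```

At steady load nothing is dropped; the moment the spike exceeds the headroom, every excess request is an error a user sees.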

§02 What a queue does

A queue accepts work immediately (acknowledging the producer) and stores it until a consumer is ready to process it. Producers don't wait for the consumer; consumers pull at their own pace.

  • High capacity (~200 in flight, plus a large pending buffer).
  • Low service latency (~1 tick) — it just stores and forwards.
  • Decouples producer rate from consumer rate.
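A minimal sketch of that decoupling, using Python's standard-library `queue` (the job names and timings are made up): `put()` returns immediately, so the producer is acknowledged without waiting, while the consumer pulls on its own schedule.

```python
import queue
import threading
import time

work = queue.Queue()      # stores jobs until the consumer is ready
processed = []

def producer():
    # put() returns at once; the producer never waits on the consumer.
    for i in range(10):
        work.put(f"job-{i}")

def consumer():
    # Pull at our own pace, one job per simulated "tick".
    while True:
        job = work.get()
        time.sleep(0.01)  # simulated processing time
        processed.append(job)
        work.task_done()

threading.Thread(target=consumer, daemon=True).start()
producer()                # returns instantly, even with jobs still pending
work.join()               # block until the backlog fully drains
```

`work.join()` is only there so the script can observe the drained state; in a real system the producer would simply move on after `put()`.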

§03 Smoothing in practice

During the spike, the queue absorbs the excess (its pending depth grows). When the spike ends, the consumer keeps draining at its steady rate until the backlog clears. Latency goes up briefly, but nothing is dropped.
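This behavior can be sketched with the same hypothetical traffic as before, now behind a queue. The consumer drains slightly faster than the average arrival rate (25 per tick vs. 20), so the backlog that builds during the spike clears on its own:

```python
def simulate_queued(drain_per_tick, arrivals, extra_ticks=20):
    """Queued version of the same traffic: nothing is dropped; the excess
    shows up as pending depth, which drains after the spike ends."""
    pending, depths = 0, []
    for arriving in arrivals + [0] * extra_ticks:
        pending += arriving                          # accept everything
        pending = max(0, pending - drain_per_tick)   # consume at a fixed rate
        depths.append(pending)
    return depths

depths = simulate_queued(25, [20] * 5 + [100] * 3 + [20] * 5)
# Pending depth peaks at 225 during the spike, then drains back to 0.
```

The trade is explicit: the jobs queued at the peak wait longer, but every one of them is eventually processed.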

⚑ CHEATSHEET · QUICK REFERENCE
  • Place a queue between a fast producer and a slow consumer.
  • Pair a queue with multiple servers behind a load balancer — one server can't drain a serious burst alone.
  • A pendingDepth that grows and never shrinks means your consumer is permanently too slow, not just bursty. Scale the consumer.
▸ THE EXERCISE

Traffic isn't uniform — bursts will overload a synchronous service. Spread the work across multiple servers behind a load balancer, and place a queue between them and the database so spikes are absorbed instead of dropped.

▸ START EXERCISE