SPECIFICATION · CONCEPT BRIEF · DWG · 11-TAME-THE-SPIKE

Tame the Spike

When you can't scale fast enough, throttle the firehose.

§01 What a rate limiter does

A rate limiter caps how fast traffic flows downstream. A common implementation is the token bucket: tokens refill at a fixed rate (say, 30 per tick) up to the bucket's capacity, each request takes one, and arrivals that find no token are rejected immediately. The bucket size lets short bursts through, while sustained overload is throttled to the refill rate.
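
The mechanism above can be sketched in a few lines of Python. This is a minimal illustration, not the exercise's actual implementation: the class name TokenBucket, the methods tick() and try_acquire(), the bucket size of 50, and the choice to start with a full bucket are all assumptions made for the example.

```python
class TokenBucket:
    """Tick-based token bucket (hypothetical names, for illustration only)."""

    def __init__(self, tokens_per_tick: int, bucket_size: int) -> None:
        self.tokens_per_tick = tokens_per_tick
        self.bucket_size = bucket_size
        self.tokens = bucket_size  # assumption: the bucket starts full

    def tick(self) -> None:
        # Refill at the fixed rate, never beyond the bucket's capacity.
        self.tokens = min(self.bucket_size, self.tokens + self.tokens_per_tick)

    def try_acquire(self) -> bool:
        # Each request takes one token; with none left it is rejected at once.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


limiter = TokenBucket(tokens_per_tick=30, bucket_size=50)

# 80 requests arrive within one tick: the full bucket admits 50 of them.
accepted_first = sum(limiter.try_acquire() for _ in range(80))

# After one refill, only 30 tokens are available for the next 80 arrivals.
limiter.tick()
accepted_second = sum(limiter.try_acquire() for _ in range(80))

print(accepted_first, accepted_second)  # 50 30
```

The first burst is absorbed by the bucket; once it is drained, admissions fall back to the refill rate.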

§02 Tokens vs bucket size

  • tokensPerTick = sustained rate. This is what your downstream can survive.
  • bucketSize = burst tolerance. Higher = more headroom for spiky traffic.
  • Drops here are intentional — they protect what's behind them.
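
To see how the two settings interact, here is a small sketch that replays the same arrival pattern through two bucket sizes at the same sustained rate. The function name admitted_per_tick, the arrival numbers, and the refill-then-admit ordering within a tick are assumptions for the example; the snake_case parameters stand in for tokensPerTick and bucketSize.

```python
def admitted_per_tick(arrivals, tokens_per_tick, bucket_size):
    """Return how many requests are admitted in each tick (hypothetical helper)."""
    tokens = bucket_size  # assumption: the bucket starts full
    admitted_log = []
    for n in arrivals:
        # Refill first, capped at the bucket size, then admit what tokens allow.
        tokens = min(bucket_size, tokens + tokens_per_tick)
        admitted = min(n, tokens)
        tokens -= admitted
        admitted_log.append(admitted)
    return admitted_log


# Two quiet ticks followed by three ticks of 100 arrivals each.
arrivals = [0, 0, 100, 100, 100]

small_bucket = admitted_per_tick(arrivals, tokens_per_tick=30, bucket_size=30)
large_bucket = admitted_per_tick(arrivals, tokens_per_tick=30, bucket_size=90)

print(small_bucket)  # [0, 0, 30, 30, 30]
print(large_bucket)  # [0, 0, 90, 30, 30]
```

The larger bucket admits more of the first spike, but once it is empty both configurations settle at 30 per tick. Burst tolerance changes how a spike starts, not how much sustained load gets through.
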
CAUTION
Match the limit to the bottleneck
Set tokensPerTick a little under what the protected service can handle. Too high and the limiter is decorative; too low and you're rejecting traffic that could have succeeded.
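
The trade-off can be made concrete with a toy model. The sketch below assumes a server that handles a fixed number of requests per tick with no queue, and sets the bucket size equal to the rate to isolate the effect of the limit; evaluate_limit, the capacity of 40, and the arrival pattern are all hypothetical.

```python
def evaluate_limit(arrivals, tokens_per_tick, server_capacity):
    """Count overload and needless rejections for one limit (toy model)."""
    bucket_size = tokens_per_tick  # assumption: no extra burst headroom
    tokens = bucket_size
    overload = 0  # admitted requests beyond what the server handles in a tick
    wasted = 0    # rejected requests the server still had capacity for
    for n in arrivals:
        tokens = min(bucket_size, tokens + tokens_per_tick)
        admitted = min(n, tokens)
        tokens -= admitted
        rejected = n - admitted
        overload += max(0, admitted - server_capacity)
        wasted += min(rejected, max(0, server_capacity - admitted))
    return overload, wasted


arrivals = [20, 120, 120, 20, 120]  # bursty traffic
capacity = 40                       # what the protected service survives per tick

too_high = evaluate_limit(arrivals, tokens_per_tick=80, server_capacity=capacity)
too_low = evaluate_limit(arrivals, tokens_per_tick=10, server_capacity=capacity)
just_under = evaluate_limit(arrivals, tokens_per_tick=36, server_capacity=capacity)

print(too_high)    # (120, 0)
print(too_low)     # (0, 110)
print(just_under)  # (0, 12)
```

In this model the high limit lets 120 excess requests reach the server, the low limit turns away 110 requests the server could have served, and the limit set just under capacity avoids overload while giving up only a small margin.
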
⚑ CHEATSHEET · QUICK REFERENCE
  • Token bucket: refill = sustained rate, bucket = burst tolerance.
  • Use it in front of a service that can't scale (third-party API, fragile downstream, expensive op).
▸ THE EXERCISE

Bursty traffic overwhelms the downstream server. Insert a rate limiter to throttle arrivals at a sustainable rate; the limiter drops the excess so the server stays healthy.
