When one database can't keep up, slice the data across many.
A single database has a maximum throughput — eventually you'll saturate it no matter how big the machine. Read replicas help with reads, but writes still need to land on the primary.
Sharding partitions your data across multiple databases by a shard key — typically a hash of the user id, account id, or whatever the request is keyed on. Shard 0 holds keys hashing to 0, shard 1 holds keys hashing to 1, and so on.
A shard router (also called a coordinator) sits in front. Given a request's key, it computes the hash and forwards to the correct shard. Aggregate throughput scales nearly linearly with shard count.
One database can't keep up. Add a load balancer in front of multiple servers, then a shard router that deterministically partitions writes across multiple databases by key. More shards = more aggregate throughput.