smadge | 8 months ago
I have experience with both token bucket and leaky bucket (or at least a variation in which a request leaves the bucket when the server finishes processing it) for preventing overload of backend servers. I switched from token bucket to leaky bucket. Token bucket says “the server can serve X requests per second,” while leaky bucket says “the server can process N requests concurrently.” I found the direct limit on concurrency much more responsive to overload, and it better controlled the delay caused by contention for shared resources. This makes sense: imagine your server's capacity drops from 10 QPS to 5 QPS. With a 10 QPS token bucket limit, it keeps accepting requests at 10 QPS, so the request queue and response time grow without bound. With a concurrency limit, new requests are admitted only as in-flight ones complete, so admission slows down along with the server.
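To make the 10 QPS vs. 5 QPS scenario concrete, here is a rough sketch of a second-by-second simulation. It is a toy model rather than a real limiter: the function names, the one-second time step, and the choice of a limit of 10 in-flight requests are all assumptions made for illustration.

```python
def simulate_token_bucket(arrivals_per_sec, limit_per_sec, service_per_sec, seconds):
    """Rate limit: admit up to limit_per_sec each second, regardless of backlog.

    Returns the number of admitted-but-unfinished requests after each second.
    """
    backlog = 0
    history = []
    for _ in range(seconds):
        admitted = min(arrivals_per_sec, limit_per_sec)
        backlog += admitted
        backlog -= min(backlog, service_per_sec)  # server completes what it can
        history.append(backlog)
    return history


def simulate_concurrency_limit(arrivals_per_sec, max_in_flight, service_per_sec, seconds):
    """Concurrency limit: admit only while fewer than max_in_flight are in progress.

    Returns the number of in-flight requests after each second.
    """
    in_flight = 0
    history = []
    for _ in range(seconds):
        free_slots = max_in_flight - in_flight
        admitted = min(arrivals_per_sec, free_slots)  # excess arrivals are rejected
        in_flight += admitted
        in_flight -= min(in_flight, service_per_sec)  # server completes what it can
        history.append(in_flight)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: 10 QPS offered load, server degraded to 5 QPS.
    print(simulate_token_bucket(10, 10, 5, 5))       # [5, 10, 15, 20, 25]
    print(simulate_concurrency_limit(10, 10, 5, 5))  # [5, 5, 5, 5, 5]
```

Under these assumptions the token bucket's backlog grows by 5 requests every second, while the concurrency limit keeps the amount of outstanding work bounded and rejects the excess at the door.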