Thinking this could be useful in a multi tenants service where you need to fairly allocate job processing capacity across tenants to a number of background workers (like data export api requests, encoding requests etc.)
That was my first thought as well. However, in a lot of real world cases, what matters is not the frequency of requests, but the duration of the jobs. For instance, one client might request a job that takes minutes or hours to complete, while another may only have requests that take a couple of seconds to complete. I don't think this library handles such cases.
The text suggests a method for managing GPU or rate-limited resources across multiple clients. It highlights the problem of spikey workloads, where a client might generate a large number of events (e.g., from a CSV upload) causing resource starvation. The text advises against using naive solutions like FIFO, which could disadvantage clients with steady live traffic.
Rate limiters are used to protect servers from overload and to prevent attackers--or even legitimate but unintentionally greedy tenants--from starving other tenants of resources. They are a key component of a resilient distributed system.
codaphiliac|1 year ago
jawns|1 year ago
itake|1 year ago
mnadkvlb|1 year ago
I am very intrigued to find out how this would fit in, if at all.
otterley|1 year ago
See, e.g., https://docs.aws.amazon.com/wellarchitected/latest/framework...
This project, however, looks like a concurrency limiter, not a rate limiter. I'm also not sure how it works across a load-balanced cluster.