Aperture (https://github.com/fluxninja/aperture) takes a slightly opinionated take on queue size. Instead of defining queue sizes, you put a timeout on each request which is the amount of time the request is willing to wait in the queue. Then we run a weighted fair queuing algorithm that ensures relative allocation across workloads, e.g. 90% capacity for critical requests. But the capacity is allowed to burst when there is low demand.
No comments yet.