maniacalhack0r | 9 months ago | on: Congratulations on creating the one billionth repository on GitHub
maniacalhack0r's comments
maniacalhack0r | 10 months ago | on: Decomposing Transactional Systems
maniacalhack0r | 1 year ago | on: Kafka at the low end: how bad can it get?
e.g. we built a system at my last company to process 150 million objects / hour, and we modeled this using a postgres-backed queue with multiple worker processes pulling from the queue.
we observed that, whenever there were a lot of locked rows (i.e. lots of work in progress), postgres would correctly skip these rows, but having to iterate over and skip that many locked rows had a noticeable impact on CPU utilization.
we worked around this by partitioning the queue, indexing on partition, and assigning each worker process a partition to pull from on startup. because our queries contained a `WHERE partition=X` clause, each worker only scanned its own partition, which reduced the number of locked rows postgres had to skip over.
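a minimal sketch of what that kind of partitioned claim query could look like. this is not the code from that system: the table `job_queue`, its columns (`id`, `partition_id`, `payload`, `status`), the index name and the `claim_batch` helper are all made up for illustration, and it assumes a DB-API connection with `%s` placeholders (psycopg-style).

```python
# Hypothetical schema and names; nothing here comes from the original system.
# Index so each worker only scans pending rows in its own partition.
CREATE_INDEX_SQL = """
CREATE INDEX IF NOT EXISTS job_queue_partition_pending_idx
    ON job_queue (partition_id, id)
    WHERE status = 'pending'
"""

# Lock up to N pending rows in one partition, skipping rows that other
# workers have already locked instead of waiting on them.
CLAIM_SQL = """
SELECT id, payload
FROM job_queue
WHERE partition_id = %s
  AND status = 'pending'
ORDER BY id
LIMIT %s
FOR UPDATE SKIP LOCKED
"""


def claim_batch(conn, partition_id, batch_size=100):
    """Return up to batch_size rows from one partition, locked for this worker.

    The row locks are held until the caller commits or rolls back the
    transaction on `conn`, so the caller should process the rows, update or
    delete them, and then commit.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL, (partition_id, batch_size))
        return cur.fetchall()
    finally:
        cur.close()
```

each worker would call `claim_batch(conn, my_partition)` in a loop. since workers on different partitions never look at each other's rows, the only locked rows a query can run into are ones locked within its own partition.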
i had some great graphs on how long `SELECT ... FOR UPDATE SKIP LOCKED` takes as the number of locked rows in the queue increases, and how this partition workaround reduced the time to execute the SKIP LOCKED query, but unfortunately they are in the hands of my previous employer :(
i have no idea if it's possible to calculate the rate at which repos are being created and time your repo creation to hit vanity numbers
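for what it's worth, the arithmetic would be a simple linear extrapolation. a rough sketch, assuming repo ids are handed out sequentially and the creation rate stays constant between your samples and the target (both are assumptions, and `seconds_until_id` is a made-up helper, not anything GitHub provides):

```python
def seconds_until_id(target_id, id_a, t_a, id_b, t_b):
    """Estimate seconds after t_b until target_id is reached.

    (id_a, t_a) and (id_b, t_b) are two observations of the newest repo id
    and the time (in seconds) each was observed, with t_b later than t_a.
    """
    if t_b <= t_a:
        raise ValueError("second sample must be later than the first")
    rate = (id_b - id_a) / (t_b - t_a)  # ids created per second
    if rate <= 0:
        raise ValueError("id did not increase between samples")
    return (target_id - id_b) / rate
```

in practice the rate is bursty and other people are racing for the same number, so an estimate like this would only get you into the right neighborhood.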