Application-level connection pools are not enough if you're using something like k8s and have your "application" running across hundreds of pods, each with its own application-level connection pool. pgBouncer helps tremendously in that situation because all those pods share a single pool. We cut average open connections dramatically that way, from over 1,000 down to under 400.
__s|2 years ago
Deadron|2 years ago
dkhenry|2 years ago
I would confidently state that most large-scale applications have run into the situation where they scaled up their application worker fleet and then crashed their database with too many connections. Coordinating the size of the worker connection pool in a world of elastic application workers is enough of a task that deploying a proxy is really the simplest and best solution.
film42|2 years ago
aeyes|2 years ago
I pool ~10,000 connections down to under 500. I can't do application-level pooling because the application is just thousands of individual processes; you often see this with Python, PHP, or NodeJS applications.
500 open connections is way less overhead than 10,000 on the Postgres server. I'm very happy to "pass the buck" of connection pooling to separate machines with pgBouncer.
brazzy|2 years ago
Additionally, the "transaction" mode of PgBouncer can increase connection utilization further: a server connection held by one client that is busy doing something else (e.g. waiting on a call to an external service) can be handed to a different client between transactions.
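For reference, transaction pooling is a one-line switch in pgBouncer's config. A minimal sketch (hostnames, paths, and pool sizes here are illustrative, not recommendations):

```ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; assign a server connection per transaction, not per client session
pool_mode = transaction
; thousands of clients can connect, but only this many real
; connections per database/user pair are opened to Postgres
max_client_conn = 10000
default_pool_size = 50
```

The caveat is that transaction mode breaks session-level features (prepared statements in older pgBouncer versions, session-level advisory locks, SET without LOCAL), so it's worth checking your application against the pgBouncer feature matrix first.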
dharmab|2 years ago
stickfigure|2 years ago
If you have a single "real" connection pool, the idle pool is effectively shared among all application instances. You can have 100 active connections and 10 (or maybe 20) idle.
I have run into this problem, but I solved it with careful tuning of automatic instance scaling parameters, connection pool size and timeout. But this only gets you so far; a single real connection pool would be more effective (at the cost of added complexity).
aidenn0|2 years ago
But yeah, for less distributed applications, just have N worker threads and don't close the DB connection after each job.
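The "N workers, one long-lived connection each" pattern looks roughly like this. A minimal sketch in Python, using sqlite3 as a self-contained stand-in for Postgres (with Postgres you'd open e.g. one psycopg2 connection per worker instead):

```python
import queue
import sqlite3
import threading

jobs: "queue.Queue" = queue.Queue()
results: "queue.Queue" = queue.Queue()

def worker() -> None:
    # One connection per worker thread, opened once and reused
    # across every job this worker processes.
    conn = sqlite3.connect(":memory:")
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut this worker down
            conn.close()
            return
        (value,) = conn.execute("SELECT ? * 2", (job,)).fetchone()
        results.put(value)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for i in range(10):
    jobs.put(i)
for _ in threads:
    jobs.put(None)               # one sentinel per worker
for t in threads:
    t.join()

collected = sorted(results.get() for _ in range(10))
print(collected)
```

With 4 workers you hold exactly 4 database connections regardless of how many jobs flow through, which is the point: the connection count is bounded by worker count, not job rate.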
perrygeo|2 years ago
That's a big assumption. If your app is just an HTTP wrapper for SQL queries against a poorly-optimized database, sure.
But there are plenty of applications that spend the majority of their time doing other stuff (HTTP requests, FFI calls, GPU- or CPU-intensive processing, etc.) where the database interaction is a small part of the overall performance profile - certainly not the bottleneck, even with hundreds of concurrent clients.
In those cases, rather than artificially throttling concurrency to stay within your global max_connections, you would ideally run as many workers as you can (i.e. fully utilize your infrastructure) and grab database connections from an external pool only when you need them.
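The "borrow a connection only around the DB work" idea can be sketched as a fixed-size pool with a checkout context manager. This is a hypothetical illustration (class and method names are mine), again using sqlite3 in autocommit mode as a self-contained stand-in for a Postgres pool such as pgBouncer:

```python
import contextlib
import queue
import sqlite3
import tempfile

class ConnectionPool:
    """A fixed set of real connections shared by many logical requests."""

    def __init__(self, path: str, size: int) -> None:
        self._pool: "queue.Queue" = queue.Queue()
        for _ in range(size):
            # isolation_level=None -> autocommit, so statements from
            # different pooled connections don't block on file locks
            self._pool.put(
                sqlite3.connect(path, isolation_level=None,
                                check_same_thread=False)
            )

    @contextlib.contextmanager
    def connection(self):
        conn = self._pool.get()      # blocks if all connections are busy
        try:
            yield conn
        finally:
            self._pool.put(conn)     # return to the pool for reuse

db_file = tempfile.NamedTemporaryFile(suffix=".db")
pool = ConnectionPool(db_file.name, size=5)

with pool.connection() as conn:
    conn.execute("CREATE TABLE hits (n INTEGER)")

# 100 logical "requests" share the 5 real connections; each one does
# its non-DB work elsewhere and only borrows a connection briefly.
for i in range(100):
    with pool.connection() as conn:
        conn.execute("INSERT INTO hits VALUES (?)", (i,))

with pool.connection() as conn:
    (count,) = conn.execute("SELECT COUNT(*) FROM hits").fetchone()
print(count)
```

The important property is that concurrency above (request handlers, worker threads) is unbounded, while the connection count the database sees is capped at the pool size.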