top | item 15410389

anonetal | 8 years ago

Yes, I was just reading through that. The server is still single-threaded though -- they are getting the parallelism by starting multiple processes to do independent chunks of work. This makes sense for PostgreSQL, but has some fundamental limitations (e.g., it requires duplicated copies of a hash table to parallelize a hash join).

endorphone|8 years ago

>The server is still single-threaded though -- they are getting the parallelism by starting multiple processes to do independent chunks of work.

So... it isn't single-threaded then? I mean, that is exactly how the most advanced competitors (Oracle, SQL Server) operate as well -- a given connection stays on one thread, with the advantages that confers, unless the planner decides to parallelize.

adzm|8 years ago

To be technical, MSSQL uses its own bespoke scheduling and will preempt the thread for I/O. All I/O is nonblocking, so the physical thread can vary. PGSQL really does use synchronous I/O and a single thread, though. The former is probably more scalable, but the latter has been serving PGSQL fine, too.

jeffdavis|8 years ago

No, processes don't create fundamental limitations. They can still share memory; it's just an "opt-in" choice.

Postgres processes share memory for all kinds of things. Hash tables may be duplicated, but not due to any fundamental limitations.
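
To make the "opt-in" point concrete, here is a minimal sketch, assuming a Unix system where `os.fork()` is available. It is not how PostgreSQL manages its shared memory; the function name `demo` and the 8-byte counter are made up for illustration. An ordinary object ends up private to each process after the fork, while an anonymous `mmap` region (shared by default on Unix) stays visible to both.

```python
import mmap
import os
import struct


def demo():
    """Hypothetical demo: private vs. opt-in shared memory across fork()."""
    # Ordinary Python object: after fork() the child gets its own copy.
    private = [0]

    # Anonymous mapping. On Unix, mmap.mmap(-1, n) defaults to MAP_SHARED,
    # so parent and child see the same bytes after fork().
    shared = mmap.mmap(-1, 8)
    shared[:8] = struct.pack("q", 0)

    pid = os.fork()
    if pid == 0:
        # Child process: write to both, then exit without running cleanup.
        private[0] = 42
        shared[:8] = struct.pack("q", 42)
        os._exit(0)

    # Parent process: wait for the child, then read both values back.
    os.waitpid(pid, 0)
    shared_value = struct.unpack("q", shared[:8])[0]
    shared.close()
    return private[0], shared_value


if __name__ == "__main__":
    print(demo())
```

The parent never sees the child's write to the list, but it does see the write to the mapped region, because that region was explicitly set up to be shared.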

halayli|8 years ago

PostgreSQL uses shared memory; it doesn't copy the hash table.

anarazel|8 years ago

In the specific case of hash joins, it does build them independently right now. There's a patch to rectify that, though, by putting the hash table into shared memory as well. The coordination necessary to make multi-phase batch joins and other such funny bits work unfortunately made it infeasible to get into 10.
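
A toy sketch of the difference being discussed, under loud assumptions: the workers are simulated sequentially in one process, all function names are hypothetical, and none of this reflects PostgreSQL's actual executor code. It only counts how many times the inner relation gets hashed when each worker builds its own table versus when one table is built once and probed by every worker.

```python
from collections import defaultdict


def build_hash_table(inner):
    """Hash the inner relation: key -> list of inner payloads."""
    table = defaultdict(list)
    for key, payload in inner:
        table[key].append(payload)
    return table


def probe(table, outer_chunk):
    """Emit (key, outer_payload, inner_payload) for every match."""
    return [
        (key, left, right)
        for key, left in outer_chunk
        for right in table.get(key, [])
    ]


def join_with_private_tables(inner, outer, workers):
    """Each simulated worker builds its own copy of the hash table."""
    results, builds = [], 0
    for i in range(workers):
        table = build_hash_table(inner)  # duplicated work and memory
        builds += 1
        results.extend(probe(table, outer[i::workers]))
    return sorted(results), builds


def join_with_shared_table(inner, outer, workers):
    """The table is built once; every simulated worker probes the same one."""
    table = build_hash_table(inner)
    results = []
    for i in range(workers):
        results.extend(probe(table, outer[i::workers]))
    return sorted(results), 1


if __name__ == "__main__":
    inner = [(1, "a"), (2, "b"), (2, "c")]
    outer = [(2, "x"), (3, "y"), (1, "z")]
    print(join_with_private_tables(inner, outer, workers=2))
    print(join_with_shared_table(inner, outer, workers=2))
```

Both variants return the same join result; only the number of builds differs. What the sketch leaves out is exactly the hard part mentioned above: coordinating real processes around one table, especially once the join spills into multiple batches.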