lidavidm | 1 year ago

To get a really precise answer, you'd have to profile or benchmark. It's also hard to do an apples-to-apples comparison: if you only replace the data format in the wire protocol, the database probably still has to transpose the data to ingest it. And it's hard to set up a benchmark in the first place, since your database's wire protocol probably isn't exposed in a way you can benchmark against.
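To make the transpose cost concrete, here's a rough stdlib-only sketch of what turning row-oriented data into columns looks like, plus a way to time it. The function names and the sample data are made up for illustration; a real database does this in native code with typed buffers, so treat the numbers as a feel for the shape of the work, not a prediction.

```python
import timeit


def rows_to_columns(rows, column_names):
    """Transpose a list of row tuples into a dict of column lists."""
    if not rows:
        return {name: [] for name in column_names}
    # zip(*rows) regroups the i-th field of every row into one tuple.
    return {name: list(values) for name, values in zip(column_names, zip(*rows))}


def columns_to_rows(columns):
    """Inverse of rows_to_columns: rebuild row tuples from column lists."""
    return list(zip(*columns.values()))


if __name__ == "__main__":
    # Hypothetical sample data: 100k rows with three fields each.
    rows = [(i, f"name-{i}", i * 0.5) for i in range(100_000)]
    names = ["id", "name", "score"]
    seconds = timeit.timeit(lambda: rows_to_columns(rows, names), number=10)
    print(f"10 transposes of {len(rows)} rows took {seconds:.3f}s")
```

Whichever side of the wire does this step, someone pays for it, which is why swapping only the wire format doesn't tell you much on its own.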

You can sort of see what benefits you might get from a post like this, though: https://willayd.com/leveraging-the-adbc-driver-in-analytics-...

While we're not using Arrow on the wire here, the ADBC driver uses Postgres's binary format (which is still row-oriented) + COPY and can get significant speedups compared to other Postgres drivers.
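For reference, a minimal sketch of what using the ADBC Postgres driver from Python could look like, assuming the `adbc-driver-postgresql` and `pyarrow` packages are installed and you have a reachable Postgres instance. The function name, the table name, and the sample data are hypothetical; only the `connect`, `adbc_ingest`, and `fetch_arrow_table` calls come from the driver's DBAPI-style interface.

```python
def adbc_roundtrip(uri, table_name="adbc_demo"):
    """Bulk-load a small Arrow table into Postgres and read it back as Arrow.

    `uri` is a Postgres connection string, e.g. "postgresql://user:pass@host/db".
    `table_name` is a hypothetical, trusted identifier (it is interpolated into SQL).
    """
    # Imported lazily so this module loads even without the drivers installed.
    import pyarrow as pa
    import adbc_driver_postgresql.dbapi as pg_dbapi

    data = pa.table({"id": [1, 2, 3], "label": ["a", "b", "c"]})

    with pg_dbapi.connect(uri) as conn:
        with conn.cursor() as cur:
            # mode="create" fails if the table already exists.
            cur.adbc_ingest(table_name, data, mode="create")
            conn.commit()
            cur.execute(f"SELECT id, label FROM {table_name} ORDER BY id")
            # Results come back as a pyarrow.Table instead of Python row tuples.
            return cur.fetch_arrow_table()
```

To compare against your current driver, you'd time this kind of bulk ingest and fetch against the equivalent row-by-row inserts and `fetchall()` on the same table and data.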

The other thing might be to consider whether you can just dump to Parquet files or something like that and bypass the database entirely (maybe using Iceberg as well).

majoe|1 year ago

Thanks for the answer.

We will start a refactoring of the application in a few weeks to get rid of the performance problems. I will keep your advice in mind and do some thorough benchmarks in the meantime.