j_not_j | 1 year ago

There is a real art to doing "benchmarks", more correctly called "synthetic benchmarks" since they don't reflect actual usage but are intended for comparisons.

I tried pgvector 0.6.2 on an OCI free node (2 CPU, 64 GB) and noticed a few things:

- pgvector build environment does NOT use -O3

- cosine indexing took about 1h with -O3 versus 6h without it (10M-row table of 128 x fp64 vectors)

- memory consumption for indexing is huge; I estimated 2x the table size

- you can index the table in parts (maintenance_work_mem=), substituting disk I/O for memory, and this only doubles elapsed time
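
To put rough numbers on the memory point, here is a back-of-the-envelope sketch in Python. It assumes the table size is dominated by the raw vector data (10M rows x 128 dimensions x 8 bytes for fp64) and ignores row headers and other Postgres overhead; the 2x factor is just my estimate from above, not a measured or documented constant, and the function name is made up for illustration.

```python
# Rough sizing sketch: raw vector bytes only, no Postgres row/page overhead.
# The 2x index-build factor is the estimate from the comment, not a documented figure.

def raw_vector_bytes(rows: int, dims: int, bytes_per_value: int) -> int:
    """Bytes needed to hold rows * dims values of the given width."""
    return rows * dims * bytes_per_value

ROWS = 10_000_000      # 10M rows
DIMS = 128             # 128 dimensions per vector
FP64_BYTES = 8         # 8 bytes per fp64 value

table_bytes = raw_vector_bytes(ROWS, DIMS, FP64_BYTES)
index_build_bytes = 2 * table_bytes  # assumed 2x table size during indexing

GIB = 2 ** 30
print(f"raw table data:       {table_bytes / GIB:.1f} GiB")
print(f"index build estimate: {index_build_bytes / GIB:.1f} GiB")
```

Under those assumptions the raw data is about 10.24 GB and the index build wants roughly twice that, which is why the maintenance_work_mem trade-off matters on a small node.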

My general comment would be: prospective users need effective guidance (beyond the great advice already on the pgvector website) about memory, cpu, and disk.

I really like the pgvectorscale possibilities for faster lookups; some great ideas there.

jamesgresql|1 year ago

Great comments!

This is exactly where we see ourselves contributing: both making pgvector faster and more efficient through pgvectorscale, and working to make the AI-on-Postgres developer experience first-class.