top | item 47020859

(no title)

mceachen | 15 days ago

I maintain a fork of sqlite-vec (because there hasn't been activity on the main repo for more than a year): sqlite-vec is great for smaller dimensionality or smaller cardinality datasets, but know that it's brute-force, and query latency scales exactly linearly. You only avoid full table scans if you add filterable columns to your vec0 table and include them in your WHERE clause. There's no probabilistic lookup algorithm in sqlite-vec.

discuss

order

luoxiaojian|14 days ago

You're absolutely right—sqlite-vec currently only supports brute-force search, and its latency does scale linearly with dataset size. We did some rough comparisons using its benchmark tools: on the SIFT dataset, latency was around 100ms; on GIST, it was closer to 1000ms. In contrast, with zvec's HNSW implementation, we get ~1ms latency on SIFT and ~3ms on GIST, while achieving recall@100 of 99.9% on SIFT and 97.7% on GIST.

mceachen|14 days ago

FWIW "You're absolutely right" broadly declares "a human is not piloting the keyboard"