Benchmarking for this project is a bit weird, since 1) only linear scans are supported, and 2) it's an "embeddable" vector search tool, so it doesn't make a lot of sense to benchmark against "server" vector databases like qdrant or pinecone.
That being said, ~generally~ I'd say it's faster than using numpy and tools like txtai/chromadb. Faiss and hnswlib (bruteforce) are faster because they store everything in memory and use multiple threads. But for smaller vector indexes, I don't think you'd notice much of a difference. sqlite-vec has some support for SIMD operations, which speeds things up quite a bit, but Faiss still takes the cake.
Author of txtai here - great work with this extension.
I wouldn't consider it a "this or that" decision. While txtai does combine Faiss and SQLite, it could also utilize this extension. The same task was just done for Postgres + pgvector. txtai is not tied to any particular backend components.
alexgarcia-xyz|1 year ago
That being said, ~generally~ I'd say it's faster than using numpy and tools like txtai/chromadb. Faiss and hnswlib (bruteforce) are faster because they store everything in memory and use multiple threads. But for smaller vector indexes, I don't think you'd notice much of a difference. sqlite-vec has some support for SIMD operations, which speeds things up quite a bit, but Faiss still takes the cake.
dmezzetti|1 year ago
I wouldn't consider it a "this or that" decision. While txtai does combine Faiss and SQLite, it could also utilize this extension. The same task was just done for Postgres + pgvector. txtai is not tied to any particular backend components.