Author here, happy to answer any questions! Been working on this for a while, so I'm very happy to get this v0.1.0 "stable" release out.
Great to see this. Seems simple enough, but I can't wait until ORMs like peewee incorporate support alongside things like FTS, etc., just for the sake of ease of use.
I feel like I've touched a lot of things where something like this is useful (hobby projects). In my case I've done a recommendation engine, music matching (I specifically use it for matching anime to their data), and perceptual hash matching.
Really curious to hear about what kind of music embedding models/tools you used! I've tried finding some good models before but they were all pretty difficult to use
I absolutely love this, great work! For those that might find it useful, I created a Python notebook that shows how to extend this to perform Hybrid Search (Vector + BM25 based Full Text search) https://github.com/liamca/sqlite-hybrid-search
DuckDB is an excellent choice for this task, and it’s incredibly fast!
I love this. I know how much work addressing the dependencies must be, but you’re really attacking the right problems. Looking forward to trying this out with my project.
No, libsql added custom vector search directly into their library, while sqlite-vec is a separate SQLite extension.
vec0 virtual tables have a hard-coded max of 8192 dimensions, but I can raise that very easily (I wanted to reduce resource exhaustion attacks). But if you're comparing vectors manually, then the `vec_distance_l2()` and related functions have no limits (besides SQLite's 1GB blob limit).
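As a rough illustration of what "comparing vectors manually" can look like, the sketch below stores float32 vectors as BLOBs in an ordinary SQLite table and ranks them by brute force. It deliberately does not load the extension: `l2_distance` is a hypothetical function registered through Python's standard `sqlite3` module as a stand-in for the extension's distance functions, and the `items` table and its columns are made up for the example.

```python
import math
import sqlite3
import struct


def pack_f32(values):
    """Serialize a list of floats into a little-endian float32 BLOB."""
    return struct.pack(f"<{len(values)}f", *values)


def l2_distance(a, b):
    """Euclidean distance between two float32 BLOBs (hypothetical stand-in)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    n = len(a) // 4  # 4 bytes per float32 element
    va = struct.unpack(f"<{n}f", a)
    vb = struct.unpack(f"<{n}f", b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


db = sqlite3.connect(":memory:")
# Expose the Python function to SQL under an assumed name.
db.create_function("l2_distance", 2, l2_distance)
db.execute("CREATE TABLE items(id INTEGER PRIMARY KEY, embedding BLOB)")
db.executemany(
    "INSERT INTO items(id, embedding) VALUES (?, ?)",
    [
        (1, pack_f32([0.0, 0.0])),
        (2, pack_f32([1.0, 1.0])),
        (3, pack_f32([5.0, 5.0])),
    ],
)

# Brute force: compute the distance to every row, then keep the two closest.
query = pack_f32([0.9, 1.2])
nearest = db.execute(
    "SELECT id, l2_distance(embedding, ?) AS distance "
    "FROM items ORDER BY distance LIMIT 2",
    (query,),
).fetchall()
```

Because each vector is just a BLOB of 4-byte floats, the only size ceiling in a setup like this is whatever limit SQLite places on a single BLOB value.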
alexgarcia-xyz | 1 year ago
sqlite-vec works on macOS, Linux, Windows, Raspberry Pis, in the browser with WASM, and (theoretically) on mobile devices. I focused a lot on making it as portable as possible. It's also pretty fast: benchmarks are hard to do accurately, but I'd be comfortable saying that it's a very, very fast brute-force vector search solution.
One experimental feature I'm working on: you can directly query vectors that are in memory as a contiguous block (e.g. NumPy), without any copying or cloning. You can see the benchmarks for that feature here under "sqlite-vec static", and it's competitive with faiss/usearch/duckdb https://alexgarcia.xyz/blog/2024/sqlite-vec-stable-release/i...
bambax | 1 year ago
The link on "See Installing sqlite-vec for more details" (https://alexgarcia.xyz/sqlite-vec/installing.html) is a 404 (the correct link is presumably https://alexgarcia.xyz/sqlite-vec/installation.html).
The datasette link https://datasette.io/plugins/datasette-sqlite-vec returns a 500 error.
On the releases page https://github.com/asg017/sqlite-vec/releases/tag/v0.1.0 can you explain what is vec0.dll vs sqlite-vec-0.1.0-loadable-windows-x86_64.tar.gz, which also contains a similarly named vec0.dll but of a different size?
rcarmo | 1 year ago
cyanydeez | 1 year ago
simonw | 1 year ago
rsingel | 1 year ago
Cieric | 1 year ago
alexgarcia-xyz | 1 year ago
yard2010 | 1 year ago
cotega | 1 year ago
pjot | 1 year ago
https://github.com/patricktrainer/duckdb-embedding-search
youngbum | 1 year ago
We’ve also added vector search to our product, which is really useful.
OpenAI’s official examples of embedding search use cosine similarity. But here’s the cool part: since OpenAI embeddings are unit vectors, you can just run the dot product instead!
DuckDB has a super fast dot product function that you can use with SQL.
In our product, we use duckdb-wasm to do vector searches on the client side.
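The dot-product shortcut follows from the definition of cosine similarity: the dot product divided by the product of the two norms. When both norms are 1, the division changes nothing. A minimal sketch in plain Python, assuming the embeddings really are normalized to unit length as described above (the vectors here are toy values, not real embeddings):

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """Dot product scaled by both vector lengths."""
    return dot(a, b) / (norm(a) * norm(b))


def normalize(a):
    """Scale a vector to unit length."""
    length = norm(a)
    return [x / length for x in a]


raw_u, raw_v = [3.0, 4.0], [1.0, 2.0]
u, v = normalize(raw_u), normalize(raw_v)

# For unit vectors the two measures agree (up to floating-point error).
unit_cosine = cosine_similarity(u, v)
unit_dot = dot(u, v)

# For unnormalized vectors they generally do not.
raw_cosine = cosine_similarity(raw_u, raw_v)
raw_dot = dot(raw_u, raw_v)
```

Skipping the two norm computations per comparison is where the speedup comes from, but the shortcut is only valid if every stored vector and every query vector has been normalized.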
bodantogat | 1 year ago
marvel_boy | 1 year ago
1yefuwang1 | 1 year ago
I'd like to do a benchmark to compare it with sqlite-vec, but I guess it is not a fair comparison given that sqlite-vec uses brute-force only.
One thing I'd recommend is to include recall rate in your benchmark data.
A brute-force approach is a good starting point, but it doesn't scale to serious production workloads.
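For readers unfamiliar with the metric, recall is usually measured by treating brute-force results as ground truth and checking how many of those true neighbours an approximate index returns. A minimal sketch, where `recall_at_k` and the ID lists are hypothetical and not taken from any of the benchmarks discussed here:

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k neighbours found in the approximate top-k."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Brute-force (exact) results serve as ground truth for the same query.
exact = [7, 2, 9, 4, 1]
# Made-up output of an approximate index.
approx = [7, 9, 3, 4, 8]

recall = recall_at_k(exact, approx, k=5)
```

A brute-force search has a recall of 1.0 by construction, which is why speed comparisons against approximate indexes are hard to read without this number alongside them.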
dang | 1 year ago
I’m writing a new vector search SQLite Extension - https://news.ycombinator.com/item?id=40243168 - May 2024 (85 comments)
deepsquirrelnet | 1 year ago
huevosabio | 1 year ago
I've been looking for something like this for a while.
bcjordan | 1 year ago
nattaylor | 1 year ago
My pyenv python3.12.2's sqlite won't load extensions even after installing with what I think are the correct command line flags. Argh!
My brew installed python3.12's sqlite will load extensions though, so I can proceed.
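A quick way to tell which situation a given interpreter is in, using only the standard library: CPython builds configured without `--enable-loadable-sqlite-extensions` typically do not expose `enable_load_extension` on connections at all. The helper name `can_load_extensions` is made up for this sketch, and it only checks that the switch exists and can be toggled; it does not try to load any particular extension.

```python
import sqlite3


def can_load_extensions():
    """Return True if this Python build exposes SQLite extension loading."""
    conn = sqlite3.connect(":memory:")
    try:
        # Builds without extension support usually lack this method entirely.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)  # switch it back off again
        return True
    except (sqlite3.Error, AttributeError):
        return False
    finally:
        conn.close()


supported = can_load_extensions()
print("extension loading available:", supported)
```

Running this under each interpreter (pyenv, brew, system) shows which ones can be used before any time is spent debugging the extension itself.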
mic47 | 1 year ago
pietz | 1 year ago
alexgarcia-xyz | 1 year ago
The libsql vector feature only works in libsql, while sqlite-vec works in all SQLite versions. The libsql vector feature works kind of like pgvector, while sqlite-vec works more like the FTS5 full-text SQLite extension.
I'd say try both and see which one you like more. sqlite-vec will soon be a part of Turso's and SQLite Cloud's products.
Turso's version: https://turso.tech/vector
haolez | 1 year ago
alexgarcia-xyz | 1 year ago
tareqx3 | 1 year ago
[deleted]
fsndz | 1 year ago
remram | 1 year ago