item 34691949 (no title)
tomd | 3 years ago You might be interested in https://datasette.io/plugins/datasette-faiss, which I'm using alongside openai-to-sqlite for similarity search of embeddings, following @simonw's excellent instructions at https://simonwillison.net/2023/Jan/13/semantic-search-answer...
uh_uh|3 years ago Thanks, but the index being in-memory makes it unsuitable for large data sets :/
simonw|3 years ago There is a way of running disk-backed FAISS indexes that don't all fit in memory, but I've not quite figured out how to do that yet: https://github.com/facebookresearch/faiss/issues/2675
iandanforth|3 years ago Can you say more? Usually projects that gravitate to SQLite are not those that require massive scale, and a FAISS index of a few GB covers a lot of documents.