If you are new to vector databases, Weaviate is a nice one to get started with. With Docker you are up and running in a few minutes. It is very easy to integrate with embedding and generative models from OpenAI (direct and Azure), Cohere, and Google (PaLM). All you need to do is enter your API keys. This makes prototyping very fast because you don't need to worry about infrastructure to interface with the models. The documentation is pretty good, and the tutorial gets your feet wet quickly.
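For anyone wondering what "enter your API keys" can look like in practice, here is a minimal sketch, assuming the keys are forwarded as per-request headers (for example as additional headers when you create a client). The environment variable names and the helper `build_provider_headers` are made up for illustration, and the header names are my understanding of Weaviate's convention, so check them against the docs for the modules you enable.

```python
import os

# Assumed mapping: hypothetical environment variable -> provider key header.
# The header names follow the pattern I believe Weaviate uses; verify before relying on them.
PROVIDER_HEADERS = {
    "OPENAI_APIKEY": "X-OpenAI-Api-Key",
    "COHERE_APIKEY": "X-Cohere-Api-Key",
}


def build_provider_headers(env=None):
    """Return a header dict for every provider whose key is present and non-empty."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for var, header in PROVIDER_HEADERS.items()
        if env.get(var)  # skip providers with no key configured
    }
```

Keeping the keys in the environment rather than in code means the same prototype can switch vendors by changing which variables are set.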
Weaviate is great for experimenting with generative and hybrid search scenarios across different models and vendors. It appears robust enough for production use on small and maybe medium-sized datasets. However, I have my doubts about whether it would scale to billions of documents for large-scale production use.
My biggest gripe is that I need to reindex all my data whenever I want to attach a new module (vectorizer, generator, etc.) to a data class. While this is fine for prototyping and small datasets, it really cramps your style when you are talking millions to billions of docs. I also wish that when I have an array of strings (`text[]`), the vectorizer would create a vector for each element instead of just one for the entire array.
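One possible workaround for the `text[]` wish is a data-modeling change on the client side, not a Weaviate feature: split the array before import so that each element becomes its own object and therefore gets its own vector. The sketch below assumes that approach; `explode_text_array`, `parentId`, and `position` are hypothetical names chosen for illustration.

```python
import uuid


def explode_text_array(parent_id, values, property_name="text"):
    """Turn one text[] value into one object per non-empty element.

    Each object carries the parent's id and the element's original position,
    so results can be grouped back to the parent after a search.
    """
    objects = []
    for position, value in enumerate(values):
        if not value.strip():
            continue  # nothing useful to vectorize in an empty string
        objects.append({
            # Deterministic id: re-importing the same parent yields the same ids.
            "id": str(uuid.uuid5(uuid.NAMESPACE_URL, f"{parent_id}/{position}")),
            "properties": {
                property_name: value,
                "position": position,
                "parentId": parent_id,
            },
        })
    return objects
```

The trade-off is more objects to index and an extra grouping step at query time, which matters at the millions-to-billions scale mentioned above.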
One feature I haven’t seen people write about is the ref2vec capability. I find this to be an interesting way to get some knowledge graph-like capabilities out of Weaviate.
Posting here to see if someone sees it by happenstance and writes an awesome article about it someday so I can read it.
ftkftk | 2 years ago
imaurer | 2 years ago
https://weaviate.io/blog/ref2vec-centroid
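As I understand the linked post, the idea behind ref2vec-centroid is that an object gets a vector derived from the objects it references, namely the centroid (mean) of their vectors. The sketch below shows only that arithmetic in plain Python; it is not Weaviate's implementation, and the function name `centroid` is my own.

```python
def centroid(vectors):
    """Return the element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one referenced vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(dim)]
```

For example, a user who references two items with vectors `[1.0, 0.0]` and `[0.0, 1.0]` would end up at `[0.5, 0.5]`, halfway between them, which is where the knowledge-graph-like flavor comes from: the object is positioned by what it links to.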