[+] [-] tejaskumar_|1 year ago|reply
After reading the article, it seems Pinecone now supports in-DB vectorization, a feature already offered by:
- DataStax Astra DB: https://www.datastax.com/blog/simplifying-vector-embedding-g... (since May 2024)
- Weaviate: https://weaviate.io/blog/introducing-weaviate-embeddings (as of yesterday)
[+] [-] jeadie|1 year ago|reply
Timescale added it most recently, but yes, a bunch of others have it too: Weaviate, Spice AI, Marqo, etc.
[+] [-] bobismyuncle|1 year ago|reply
Weaviate seems to have added a similar capability — kind of wild that they announced on the same day.
Looks like Pinecone also includes reranking as part of the same process — did Weaviate add that as well?
[+] [-] bobismyuncle|1 year ago|reply
Makes a lot of sense to me to combine embedding, retrieval and reranking — I can imagine this being a way that they can differentiate themselves from the popular databases that have added support for vector search
[+] [-] kingkongjaffa|1 year ago|reply
I assumed that a specific flavour of LLM was needed, an “embedding model” to generate the vectors. Is this announcement that pinecone is adding their own?
Is it better or worse than the models here: https://ollama.com/search?c=embedding For example?
[+] [-] llm_nerd|1 year ago|reply
Normally you run your content through an embedding model and insert the resulting vectors into the vector DB. At query time you run the query through the same embedding model and ask the vector database for the entries most similar to the resulting vector. Similarly, reranking is when you take the broad hits from the embedding similarity search and/or BM25, and a reranker uses the retrieved source material to order the results more finely.
This is building it into the vector DB such that you send it the content and it is "built in".
Seems silly. It's like bundling a stove with cookware. But cookware fits specific niches and has different life cycles. I get that it might cater to some "drop in solution" targets, but it seems of no value for most engineered, long-term solutions.
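To make the split concrete, here's a toy sketch of the normal two-step flow. Everything here is made up for illustration: the "embedding model" is just a character-bigram hash, not a real model, and the "database" is a brute-force in-memory list.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: hashes character bigrams
    # into a 16-dimensional unit vector. A real model maps semantics, not bigrams.
    vec = [0.0] * 16
    for a, b in zip(text.lower(), text.lower()[1:]):
        vec[(ord(a) * 31 + ord(b)) % 16] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class ToyVectorDB:
    """Brute-force nearest-neighbor store: the caller supplies the vectors."""
    def __init__(self):
        self.items = []  # list of (vector, original text)

    def upsert(self, vector, text):
        self.items.append((vector, text))

    def query(self, vector, top_k=3):
        # Cosine similarity reduces to a dot product since vectors are normalized.
        scored = [(sum(q * v for q, v in zip(vector, vec)), text)
                  for vec, text in self.items]
        return [text for _, text in sorted(scored, reverse=True)[:top_k]]

# The "normal" two-step flow: embed outside the DB, then insert and query.
db = ToyVectorDB()
for doc in ["vector databases", "stoves and cookware", "vector search"]:
    db.upsert(embed(doc), doc)

hits = db.query(embed("vector"), top_k=2)
```

With inline vectorization, upsert and query would accept raw text and the database would call the model itself. Whether that coupling is a convenience or a liability is exactly the stove-and-cookware question.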
[+] [-] tejaskumar_|1 year ago|reply
> Is this announcement that pinecone is adding their own?
TLDR: they trained their own embedding model and rely on Cohere for reranking. Pinecone (the database) uses this model automatically to generate and store embeddings.
> I assumed that a specific flavour of LLM was needed, an “embedding model” to generate the vectors.
You're mostly right, with one caveat: embedding models aren't really LLMs, in that they're not very large: they just map text into a numerical space that captures semantic meaning.
> Is it better or worse than the models here: https://ollama.com/search?c=embedding For example?
This is the golden question. As far as I know, there is no appropriate benchmarking/eval data about this. I think the real value is the first-class integration between their model and their service.
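For anyone who wants to compare models themselves, the usual approach is a retrieval eval like recall@k over a labeled query set. A minimal sketch below; the query, doc ids, and model names are all made up for illustration.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that show up in the top-k results."""
    found = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return found / len(relevant_ids)

# Made-up labeled query set: for each query we know the relevant doc ids,
# plus the ranked results returned by two hypothetical embedding models.
eval_set = {
    "q1": {
        "relevant": {"d1", "d3"},
        "model_a": ["d1", "d2", "d3"],
        "model_b": ["d2", "d4", "d1"],
    },
}

for query, data in eval_set.items():
    score_a = recall_at_k(data["model_a"], data["relevant"], k=2)  # 0.5
    score_b = recall_at_k(data["model_b"], data["relevant"], k=2)  # 0.0
```

Averaging recall@k (or nDCG) over a few hundred labeled queries from your own corpus tells you far more than any vendor benchmark would.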
[+] [-] tech2trees|1 year ago|reply
I've played around with Weaviate and Astra DB, but Marqo is the best and easiest solution imo.
[+] [-] dmezzetti|1 year ago|reply
txtai (https://github.com/neuml/txtai) has had inline vectorization since 2020. It supports Transformers, llama.cpp and LLM API services. It also has inline integration with LLM models and a built-in RAG pipeline.
[+] [-] unknown|1 year ago|reply
[deleted]