Coming at this from a different angle, does anyone have any links to tutorials for use cases? I'd love to see what the vector DB hype is about, but as a regular engineer I can't even grasp how to use a vector DB.
I'll give you an example of something I did with a vector database.
I was playing around with making my own UI for interfacing with ChatGPT. I saved the chat transcripts in a normal Postgres DB, and stored the OpenAI embedding for each message in a vector DB, with a pointer to the Postgres message id in the vector DB metadata.
Then as you chatted, I had ChatGPT continuously summarizing the current conversation in the background and searching the vector DB for previous messages about whatever we were talking about, and it would inject those into the chat context invisibly. So you can say something like, "Hey, do you remember when we talked about baseball?" and it would pull a previous conversation about so-and-so hitting a home run into the context, and the bot would have access to that, even though the word "baseball" never appeared in that conversation -- "home run" is semantically similar enough that the search finds it.
If you're using OpenAI embeddings as your vectors, it's _extremely_ impressive how well it finds similar topics, even when the actual words used are completely different.
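A rough sketch of that setup: a dict stands in for the Postgres table, a list of (vector, metadata) pairs stands in for the vector DB, and a toy bag-of-words `embed()` stands in for the real OpenAI embedding call. Every name here is made up for illustration.

```python
import numpy as np

# Toy embedding: bag-of-words over a tiny vocabulary, so the example
# runs offline. The real setup would call an embedding API instead.
VOCAB = ["baseball", "home", "run", "hit", "pitcher", "pasta", "sauce", "recipe"]

def embed(text: str) -> np.ndarray:
    words = set(text.lower().replace(",", " ").replace(".", " ").split())
    v = np.array([1.0 if w in words else 0.0 for w in VOCAB])
    n = np.linalg.norm(v)
    return v / n if n else v

# "Postgres" side: message id -> full transcript row.
messages = {
    1: "He hit a home run in the ninth and the pitcher was furious",
    2: "Here is my favorite pasta sauce recipe",
}

# "Vector DB" side: (embedding, metadata) pairs, where the metadata
# carries a pointer back to the Postgres row, as described above.
index = [(embed(text), {"message_id": mid}) for mid, text in messages.items()]

def recall(query: str, top_k: int = 1) -> list[str]:
    """Rank stored embeddings against the query, then follow the
    message_id pointer back into "Postgres" for the full text."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: -float(item[0] @ q))
    return [messages[meta["message_id"]] for _, meta in ranked[:top_k]]

# With real embeddings, "baseball" alone would surface the home-run
# message; the toy vocabulary needs overlapping tokens like "home run".
print(recall("do you remember when we talked about that home run"))
```

The retrieved text is what would get injected invisibly into the chat context before the next model call.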
Not a tutorial, but TL;DR: vector DBs are specialized DBs that store embeddings. Embeddings are vector representations of data (e.g. text or images), which means you can compare them in a quantifiable way.
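"Compare them in a quantifiable way" usually means cosine similarity between the vectors. A sketch, using hand-made 3-d vectors in place of real (high-dimensional) embeddings:

```python
import numpy as np

def cosine(a, b) -> float:
    # Cosine similarity: 1.0 = same direction, ~0.0 = unrelated.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hand-made vectors standing in for real embeddings.
king  = [0.9, 0.1, 0.0]
queen = [0.8, 0.3, 0.0]
pasta = [0.0, 0.1, 0.9]

# "king" points in nearly the same direction as "queen", not "pasta".
print(cosine(king, queen) > cosine(king, pasta))  # True
```

A vector DB's job is essentially to answer "which stored vectors score highest against this one?" quickly, even across millions of embeddings.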
This enables use cases like semantic search and Retrieval-Augmented Generation (RAG) as mentioned in the article.
Semantic search is: I search for "royal" and I get results that mention "king" or "queen" because they are semantically similar.
RAG is: I make a query asking, "tell me about the English royal family"; semantically similar information is fetched using semantic search and provided as context to an LLM to generate an answer.
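Put together, a minimal RAG loop is: embed the question, fetch the nearest stored facts, and paste them into the prompt. A sketch with a tiny keyword stand-in for the embedding model; the final chat-completion call is left as a placeholder:

```python
def embed(text, vocab=("royal", "king", "queen", "england", "crown", "pasta")):
    # Keyword stand-in for a real embedding model, kept tiny so the
    # example is self-contained and runs offline.
    words = set(text.lower().replace(".", " ").split())
    return [1.0 if w in words else 0.0 for w in vocab]

def similarity(a, b):
    return sum(x * y for x, y in zip(a, b))

# Knowledge base: each fact is stored alongside its embedding.
facts = [
    "The royal family includes the King and Queen of England.",
    "Pasta should be boiled in well-salted water.",
]
index = [(embed(f), f) for f in facts]

def rag_prompt(question: str, top_k: int = 1) -> str:
    """Retrieve the most relevant facts and build the prompt the LLM
    would receive; the actual LLM call is omitted here."""
    q = embed(question)
    ranked = sorted(index, key=lambda e: -similarity(e[0], q))
    context = [f for _, f in ranked[:top_k]]
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question

print(rag_prompt("Tell me about the English royal family"))
```

The query never mentions "king" or "queen", but the royal-family fact is retrieved anyway (here via the shared "royal" keyword; a real embedding model handles the fully-different-words case).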
gk1|2 years ago
I recommend starting at https://www.pinecone.io/learn/vector-database/