item 38009263
llmllmllm | 2 years ago

I have found success chunking at 100 tokens, with each chunk preceded by the last 10 tokens of the previous chunk and followed by the first 10 tokens of the next chunk, 120 tokens in total. I generate an embedding for each chunk, then compare it to the embedding(s) derived from the input query.
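A minimal sketch of that overlap scheme, assuming the input is already a flat list of tokens (in practice you'd tokenize with whatever tokenizer matches your embedding model); the function name and parameters are illustrative, not from the comment:

```python
def chunk_with_overlap(tokens, size=100, overlap=10):
    """Split a token list into chunks of `size` tokens, each padded with
    the last `overlap` tokens of the previous chunk and the first
    `overlap` tokens of the next one (interior chunks end up at
    size + 2*overlap tokens, e.g. 120 here)."""
    chunks = []
    for start in range(0, len(tokens), size):
        lo = max(0, start - overlap)           # reach back into previous chunk
        hi = min(len(tokens), start + size + overlap)  # reach into next chunk
        chunks.append(tokens[lo:hi])
    return chunks
```

Each padded chunk is what gets embedded; the overlap means a sentence straddling a chunk boundary still shows up intact in at least one embedding.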

Where one's focus should be, IMO, is generating good embeddings from the input query. An example: turning "don't mention x" into filtering out / de-emphasizing chunks that align with the embedding for x.
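One way to realize that "don't mention x" idea is to subtract a penalty proportional to each chunk's similarity to the embedding of x when ranking. This is a sketch under that assumption; the function names, the `penalty` weight, and the scoring formula are mine, not from the comment:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def rerank(query_emb, chunks, avoid_emb=None, penalty=0.5):
    """Rank (chunk_text, embedding) pairs by similarity to the query,
    de-emphasizing chunks that align with `avoid_emb` (e.g. the
    embedding of "x" when the prompt says "don't mention x")."""
    def score(emb):
        s = cosine(query_emb, emb)
        if avoid_emb is not None:
            s -= penalty * cosine(avoid_emb, emb)
        return s
    return sorted(chunks, key=lambda pair: score(pair[1]), reverse=True)
```

Setting `penalty` high enough effectively filters such chunks out of the RAG context rather than just demoting them.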

I've been using these techniques along with pgvector and OpenAI's embeddings for https://flowch.ai and it works really well. A user uploads a document or uses the Chrome Extension on a webpage and FlowChai chunks up the content, generates embeddings, builds up a RAG context and then produces a report based on the user's prompt.
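With pgvector, the retrieval step above typically comes down to an `ORDER BY` on a distance operator (`<=>` is pgvector's cosine-distance operator). A small helper that builds such a query; the table and column names are placeholders I made up, not FlowChai's schema:

```python
def nearest_chunks_sql(table="chunks", column="embedding", limit=5):
    """Build a pgvector nearest-neighbor query using cosine distance.
    Execute it with a driver like psycopg, passing the query embedding
    (as a pgvector literal such as '[0.1,0.2,...]') as the parameter."""
    return (
        f"SELECT id, content, {column} <=> %s::vector AS distance "
        f"FROM {table} "
        f"ORDER BY distance "
        f"LIMIT {limit}"
    )
```

The top `limit` rows are then concatenated into the RAG context that the report prompt runs against.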

I hope that helps show a real world example. You're welcome to play with FlowChai for free to see how it works in practice at the application level.
