manishsharan|4 months ago
Chunking strategy is a big issue. I got acceptable results by sending large texts to Gemini Flash and having it summarize and extract chunks, instead of using any of the text splitters I tried. I use the method published by Anthropic (https://www.anthropic.com/engineering/contextual-retrieval), i.e. I include the full summary along with each chunk when embedding.
I also created a tool that lets the LLM run vector searches on its own.
I do not use LangChain or Python. I use Clojure plus the LLMs' REST APIs.
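A minimal sketch of the two ideas above, in Python purely for illustration (the setup described here is Clojure calling REST APIs directly). Everything named below is hypothetical: `toy_embed` is a bag-of-words stand-in for a real embedding endpoint, and `VECTOR_SEARCH_TOOL` is a generic JSON-schema-style tool description whose exact field names vary by provider.

```python
import math
import re
from collections import Counter


def contextualize(summary: str, chunk: str) -> str:
    """Prepend the document-level summary to a chunk before embedding it."""
    return f"Document summary: {summary}\n\nChunk: {chunk}"


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding API call: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(summary: str, chunks: list[str]) -> list[tuple[str, Counter]]:
    """Embed each chunk together with the summary; keep the raw chunk for display."""
    return [(chunk, toy_embed(contextualize(summary, chunk))) for chunk in chunks]


def vector_search(index: list[tuple[str, Counter]], query: str, top_k: int = 3) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Hypothetical tool description handed to the LLM so it can decide on its own
# when to search; field names are an assumption, not any provider's exact format.
VECTOR_SEARCH_TOOL = {
    "name": "vector_search",
    "description": "Search the document index and return the most relevant chunks.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "top_k": {"type": "integer"},
        },
        "required": ["query"],
    },
}
```

When the model emits a call to `vector_search`, the application runs the function against the index and feeds the returned chunks back as the tool result. In a real system `toy_embed` would be replaced by a call to an embedding endpoint.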
crassT|4 months ago
I've struggled to find a target market though. Would you mind sharing what your use case is? It would really help give me some direction.
esafak|4 months ago
manishsharan|4 months ago
Not sensitive to latency at all. My users would rather have well-researched answers than poor answers.
Also, I use batch mode APIs for chunking; it is so much cheaper.