The RAG was set up on a bunch of documents. Most of them were manuals containing steps for measuring, troubleshooting, and replacing components of industrial machines.

The issue was that most of these steps were long (over 512 tokens), so a typical chunk window wouldn't capture a full step. We added a tool-calling capability that lets the LLM request the chunks near a given chunk. This worked well in practice, but burned more $$.
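
For anyone who wants to try the same idea, here is a rough sketch of what a nearby-chunks tool could look like. It is not the code we ran: `Chunk`, `ChunkStore`, `get_nearby_chunks`, the `before`/`after` parameters, and the tool description dict are all made-up names for illustration, and it assumes every chunk is stored with its document id and its position in that document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str  # which manual the chunk came from (hypothetical field)
    index: int   # position of the chunk within that manual
    text: str


class ChunkStore:
    """Hypothetical in-memory store keyed by (doc_id, index)."""

    def __init__(self, chunks):
        self._by_key = {(c.doc_id, c.index): c for c in chunks}

    def get_nearby_chunks(self, doc_id, index, before=1, after=1):
        """Return the neighbours of a chunk in document order.

        The chunk itself is excluded, since the model already has it.
        Positions that fall outside the document are skipped.
        """
        found = []
        for i in range(index - before, index + after + 1):
            if i == index:
                continue
            chunk = self._by_key.get((doc_id, i))
            if chunk is not None:
                found.append(chunk)
        return found


# Illustrative tool description in JSON Schema style. The exact shape a
# given LLM API expects may differ; adapt it to the provider you use.
NEARBY_CHUNKS_TOOL = {
    "name": "get_nearby_chunks",
    "description": "Fetch the chunks just before and after a retrieved chunk.",
    "parameters": {
        "type": "object",
        "properties": {
            "doc_id": {"type": "string"},
            "index": {"type": "integer"},
            "before": {"type": "integer", "minimum": 0},
            "after": {"type": "integer", "minimum": 0},
        },
        "required": ["doc_id", "index"],
    },
}


if __name__ == "__main__":
    store = ChunkStore(
        Chunk("manual", i, f"step part {i}") for i in range(4)
    )
    for chunk in store.get_nearby_chunks("manual", 1):
        print(chunk.index, chunk.text)
```

Each tool call pulls extra chunks into the context, which is where the added cost comes from. Keeping `before` and `after` small is one way to limit it.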
