Djamba | 25 days ago

Interesting. Can you elaborate a bit more please

graphitout | 24 days ago

The RAG system was set up on a bunch of documents, most of them manuals containing step-by-step procedures for measuring, troubleshooting, and replacing components of industrial machines.

The issue was that most of these procedures were long (over 512 tokens), so the typical chunk window wouldn't capture the full sequence of steps. We added a tool-calling capability through which the LLM can request the chunks adjacent to a given chunk. This worked well in practice, but burned more $$.
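A minimal sketch of what such a tool might look like (all names here are hypothetical, not from the original setup): chunks are kept in document order, and the model is given a tool that returns a retrieved chunk together with its neighbors, so a procedure that was split mid-step can be stitched back together.

```python
# Hypothetical chunk store: chunk_id -> text, in document order.
CHUNKS = {
    0: "Step 1: power down the machine and lock out the supply.",
    1: "Step 2: remove the access panel with a T20 driver.",
    2: "Step 3: measure the spindle clearance with a feeler gauge.",
    3: "Step 4: replace the worn bearing and re-torque to spec.",
}

def get_neighbor_chunks(chunk_id: int, window: int = 1) -> list[str]:
    """Tool exposed to the LLM: return the given chunk plus up to
    `window` chunks on each side, preserving document order."""
    lo = max(0, chunk_id - window)
    hi = min(max(CHUNKS), chunk_id + window)
    return [CHUNKS[i] for i in range(lo, hi + 1)]
```

Each call expands the context by `2 * window` extra chunks, which is where the added cost comes from: every neighbor fetched is more tokens billed per query.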