item 46635422

Show HN: RagTune – EXPLAIN ANALYZE for your RAG retrieval layer

1 point | metawake | 1 month ago | github.com

CLI tool to debug and benchmark RAG retrieval without LLM calls.

- `ragtune explain "query"` → see what was retrieved, with scores
- `ragtune simulate` → batch eval with recall/MRR metrics
- `ragtune compare` → compare embedders or chunk sizes
- CI/CD mode for quality gates

Works with Qdrant, pgvector, Weaviate, Chroma, Pinecone.

Built because I kept guessing why retrieval was bad. Now I can see exactly what's happening.

1 comment

reena_signalhq | 1 month ago

[deleted]

metawake | 1 month ago

Thanks! To answer your questions:

*Backends:* Currently supports Qdrant, pgvector, Weaviate, Chroma, and Pinecone. Adding more is straightforward since it's just a matter of implementing a Store interface. Let me know if I missed a good backend!
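To make the "just implement a Store interface" idea concrete, here is a minimal Python sketch. The `Store` protocol name, the `Hit` type, and the `search` signature are assumptions for illustration, not RagTune's actual API:

```python
# Hypothetical sketch of a Store interface a new backend would implement.
# Names and signatures are assumptions, not RagTune's real API.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Hit:
    doc_id: str
    score: float  # raw similarity score as returned by the vector DB


class Store(Protocol):
    def search(self, query_vector: list[float], top_k: int) -> list[Hit]:
        """Return the top_k nearest neighbours with their raw scores."""
        ...


class InMemoryStore:
    """Toy backend: brute-force dot-product search over stored vectors."""

    def __init__(self, vectors: dict[str, list[float]]):
        self.vectors = vectors

    def search(self, query_vector: list[float], top_k: int) -> list[Hit]:
        scored = [
            Hit(doc_id, sum(q * v for q, v in zip(query_vector, vec)))
            for doc_id, vec in self.vectors.items()
        ]
        return sorted(scored, key=lambda h: h.score, reverse=True)[:top_k]
```

Anything that can answer `search` this way plugs in; the real backends (Qdrant, pgvector, etc.) would just delegate to their own query APIs.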

*Relevance scoring:* No LLM-as-judge — that's intentional. RagTune focuses on retrieval-layer metrics only:

- Vector similarity scores (what the DB returns)
- Recall@K, MRR against your golden set
- Score distribution diagnostics
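For reference, Recall@K and MRR are the standard formulas and fit in a few lines; this is a generic sketch of those metrics, not RagTune's own code:

```python
# Standard retrieval metrics over a golden set (generic, not RagTune's code).
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant docs that appear in the top-k retrieved list."""
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mrr(queries: list[tuple[list[str], set[str]]]) -> float:
    """Mean reciprocal rank of the first relevant hit per query (0 if none)."""
    total = 0.0
    for retrieved, relevant in queries:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(queries)
```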

The philosophy is: debug retrieval separately from generation. If your retrieval is broken, no amount of prompt engineering will fix it.

For chunk size/overlap optimization — exactly the use case! `ragtune compare --chunk-sizes 256,512,1024` lets you see the impact directly.

Happy to hear feedback if you try it!