gillesjacobs | 1 year ago
Interestingly, going both ways helps: generating hypothetical answers for the query at retrieval time, and generating hypothetical questions for each text chunk at ingestion, both increase RAG performance in my experience.
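A minimal sketch of the query-side half (essentially HyDE), assuming the OpenAI Python client; the model names are assumptions and the vector_store.search call is a placeholder for whatever index you actually use:

```python
from openai import OpenAI

client = OpenAI()

def hypothetical_answer(query: str) -> str:
    """Generate a plausible answer to embed instead of the raw query (HyDE)."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model works here
        messages=[
            {"role": "system", "content": "Write a short, plausible answer to the "
             "question. Invented details are fine; only the wording matters for retrieval."},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content

def embed(text: str) -> list[float]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding

# Retrieve with the embedding of the hypothetical answer, which tends to land
# closer to answer-bearing chunks than the embedding of the bare question.
query_vec = embed(hypothetical_answer("How do I rotate API keys without downtime?"))
# results = vector_store.search(query_vec, k=5)  # hypothetical vector-store call
```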
Though LLM-based query processing is not always suitable for chat applications where inference time is a concern (like near-real-time customer-support RAG), so ingestion-time hypothetical question generation is more apt there.
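And a sketch of the ingestion-side half, which moves the LLM call offline so the query path stays a single embedding lookup. Here `chunks` and `index.add` are hypothetical stand-ins for your own chunker and vector index, and `embed` is the helper from the sketch above:

```python
from openai import OpenAI

client = OpenAI()

def hypothetical_questions(chunk: str, n: int = 3) -> list[str]:
    """Generate questions a user might ask that this chunk answers."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model works here
        messages=[
            {"role": "system", "content": f"Write {n} distinct questions that the "
             "following text answers, one per line, with no numbering."},
            {"role": "user", "content": chunk},
        ],
    )
    return [q.strip() for q in resp.choices[0].message.content.splitlines() if q.strip()]

# At ingestion: index each generated question, pointing back to its source chunk.
# At query time the raw user query is embedded directly (no LLM call, no added
# latency) and matched question-to-question.
for chunk_id, chunk in enumerate(chunks):           # chunks: your own chunker's output
    for q in hypothetical_questions(chunk):
        index.add(embed(q), payload={"chunk_id": chunk_id})  # hypothetical index API
```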