top | item 42309510

wolfgarbe | 1 year ago

For the latency benchmarks we used vanilla BM25 (SimilarityType::Bm25f for a single field) for comparability, so there are no differences in terms of accuracy.
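For readers who don't have the formula in their head, here is a minimal sketch of textbook single-field BM25 scoring. This is not SeekStorm's actual code: the function names, the Lucene-style non-negative IDF variant, and the parameters `k1` and `b` are assumptions made for illustration only.

```rust
/// BM25 contribution of one query term in one document (textbook form).
/// `tf` = term frequency in the document, `doc_freq` = number of documents
/// containing the term, `n_docs` = total number of documents in the index.
/// All names and the IDF variant are illustrative assumptions.
fn bm25_term_score(
    tf: f64,
    doc_len: f64,
    avg_doc_len: f64,
    n_docs: f64,
    doc_freq: f64,
    k1: f64,
    b: f64,
) -> f64 {
    // Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
    let idf = (1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5)).ln();
    // Length normalization: longer-than-average documents are penalized.
    let norm = 1.0 - b + b * doc_len / avg_doc_len;
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}

/// Document score for a query: the sum over query terms.
/// Each entry of `term_stats` is (tf in this document, doc_freq in the index).
fn bm25_score(
    term_stats: &[(f64, f64)],
    doc_len: f64,
    avg_doc_len: f64,
    n_docs: f64,
    k1: f64,
    b: f64,
) -> f64 {
    term_stats
        .iter()
        .map(|&(tf, df)| bm25_term_score(tf, doc_len, avg_doc_len, n_docs, df, k1, b))
        .sum()
}
```

Commonly used defaults are `k1 = 1.2` and `b = 0.75`. Because the score depends only on term frequencies, document lengths, and collection statistics, any engine implementing the same formula with the same parameters should rank documents the same way.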

For SimilarityType::Bm25fProximity, which takes into account the proximity between query term matches within the document, we have so far only anecdotal evidence that it returns significantly more relevant results for many queries.
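To make "proximity" concrete, here is a rough sketch of one simple proximity signal: the smallest distance between the positions of two query terms in a document. This is only an illustration of the general idea and not how Bm25fProximity is actually implemented; `min_pair_distance` and `proximity_boost` are hypothetical helpers, and the boost formula is an arbitrary assumption.

```rust
/// Smallest absolute distance between any position in `a` and any position
/// in `b`. Both slices hold token positions sorted in ascending order.
/// Returns None if either term does not occur in the document.
/// (Hypothetical helper for illustration only.)
fn min_pair_distance(a: &[u32], b: &[u32]) -> Option<u32> {
    let (mut i, mut j) = (0usize, 0usize);
    let mut best: Option<u32> = None;
    // Two-pointer merge over the sorted position lists.
    while i < a.len() && j < b.len() {
        let d = a[i].abs_diff(b[j]);
        best = Some(best.map_or(d, |cur| cur.min(d)));
        // Advance the pointer at the smaller position; only that move can
        // shrink the gap.
        if a[i] < b[j] {
            i += 1;
        } else {
            j += 1;
        }
    }
    best
}

/// Hypothetical boost in (0, 1]: closer matches give a larger value,
/// and a missing term gives 0. The formula is an assumption.
fn proximity_boost(distance: Option<u32>) -> f64 {
    match distance {
        Some(d) => 1.0 / (1.0 + d as f64),
        None => 0.0,
    }
}
```

A ranking function could then combine such a boost with the BM25 score, so that a document where the query terms appear next to each other ranks above one where they are scattered far apart.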

Systematic relevancy benchmarks such as BeIR and MS MARCO are planned.

ghita_|1 year ago

Got it. I think the anecdotal evidence is what intrigued me a little bit. Looking forward to seeing the systematic relevancy benchmarks.