BM25 search and Claude = efficient precision

2 points | marwamc | 2 months ago | github.com

4 comments

marwamc|2 months ago

When using AI coding assistants to refactor symbols across large codebases (6k+ files), developers face a binary choice: precision (LSP-based tools) or efficiency (grep/ripgrep). Shebe tries to bridge this trade-off with a good old BM25 index, which is surprisingly fast and efficient.
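For readers who haven't met BM25 before, here is a minimal sketch of how ranking files by a symbol name can work. It is not Shebe's actual implementation: the identifier tokenizer, the `bm25_rank` name, and the `k1`/`b` defaults are assumptions chosen for illustration, and the IDF uses the common "log(1 + ...)" variant.

```python
import math
import re
from collections import Counter

# Assumption: treat C-style identifiers as tokens. A real indexer may tokenize differently.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tokenize(text):
    """Split text into identifier-like tokens."""
    return TOKEN_RE.findall(text)


def bm25_rank(docs, query, k1=1.5, b=0.75):
    """Rank docs (name -> text) against query with Okapi BM25.

    Returns a list of (name, score) pairs, best first, for docs that
    contain at least one query term.
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    counts = {name: Counter(toks) for name, toks in tokens.items()}
    n_docs = len(tokens)
    avg_len = sum(len(toks) for toks in tokens.values()) / max(n_docs, 1)

    scores = {}
    for term in set(tokenize(query)):
        # Document frequency: how many files contain the term at all.
        df = sum(1 for c in counts.values() if term in c)
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for name, c in counts.items():
            tf = c[term]
            if tf == 0:
                continue
            # Length normalization: long files need more hits to score as high.
            norm = tf + k1 * (1 - b + b * len(tokens[name]) / avg_len)
            scores[name] = scores.get(name, 0.0) + idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `bm25_rank(files, "authorize_request")` on a dict of file contents returns only the files that mention the symbol, ordered by how strongly they match. That gives an assistant a short, ranked list to read instead of raw grep output.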

icsa|2 months ago

How well does this approach work with C++ source code, which is notoriously difficult to parse given its context-dependent semantics?

marwamc|2 months ago

Shebe asks a simple question: "where does this symbol appear as text?" For C++ codebases that use templates and macros heavily, it will struggle. But I'm curious how it would actually perform, so I'm currently running a search on https://gitlab.com/libeigen/eigen. Will report the results shortly.
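To make the macro concern concrete, here is a small sketch of a purely textual symbol lookup. The `find_symbol` helper and the C++ snippet are hypothetical, made up for illustration. They are not taken from Eigen or from Shebe's code.

```python
import re


def find_symbol(source, symbol):
    """Return 1-based line numbers where symbol appears as a whole identifier."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(symbol) + r"(?![A-Za-z0-9_])"
    )
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical C++ where a macro builds the method name by token pasting (##).
CPP_SNIPPET = """\
#define DEFINE_GETTER(name) int get_##name() const { return name##_; }
struct Config {
  DEFINE_GETTER(size)
  int size_ = 0;
};
int use(const Config& c) { return c.get_size(); }
"""

# Only the call site contains the literal text "get_size".
call_sites = find_symbol(CPP_SNIPPET, "get_size")
# The definition is only reachable by searching for the macro argument.
macro_args = find_symbol(CPP_SNIPPET, "size")
```

Searching for `get_size` finds the call on line 6 but not the definition on line 3, because the name `get_size` only exists after the preprocessor runs. A text index can't see that, while a tool that works on preprocessed code can.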