top | item 42073073

skp1995 | 1 year ago

ohh interesting.. I tried it on a couple of big repos and it was a bit of a miss for me. How large are the codebases you work on? I want to get a sense check on where the behavior deteriorates with embedding + GPT-3.5-based reranker search (not sure if they are doing more now!)

yen223 | 1 year ago

The largest repo I used with Cursor was about 600,000 lines long

skp1995 | 1 year ago

that's a good metric to aim for... creating a full local index for 600k lines is pretty expensive, but there are a bunch of heuristics which can take us pretty far:

- looking at git commits
- making use of recently accessed files
- keyword search

If I set these constraints and allow for maybe around 2 LLM round trips, we can get pretty far in terms of performance.
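To make the idea concrete, here is a minimal sketch of how those cheap heuristics could be combined into a single ranking pass that narrows down candidate files before any LLM round trip. All names, weights, and inputs here are hypothetical illustrations, not how Cursor (or any real tool) actually implements it:

```python
from collections import Counter


def rank_candidates(files, recent_files, query_keywords, file_tokens):
    """Rank files by cheap heuristics before spending LLM round trips.

    files          -- all candidate file paths in the repo
    recent_files   -- set of recently accessed / recently committed files
    query_keywords -- set of keywords extracted from the user's query
    file_tokens    -- dict mapping file path -> set of tokens in that file

    Weights are arbitrary placeholders; a real system would tune them.
    """
    scores = Counter()
    for f in files:
        if f in recent_files:
            # recency signal: recently touched files are likely relevant
            scores[f] += 2.0
        # keyword signal: overlap between the query and the file's tokens
        scores[f] += len(query_keywords & file_tokens.get(f, set()))
    # highest-scoring files first; feed only the top few to the LLM
    return [f for f, _ in scores.most_common()]


ranked = rank_candidates(
    files=["a.py", "b.py", "c.py"],
    recent_files={"b.py"},
    query_keywords={"auth", "token"},
    file_tokens={"a.py": {"auth", "token"}, "b.py": {"auth"}, "c.py": set()},
)
print(ranked)  # b.py ranks first: recency bonus plus one keyword hit
```

The point is that this whole pass is index-free and nearly instant, so the expensive LLM calls only see a short shortlist instead of the full 600k-line repo.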