Curious to hear if you've seen this work with 100k+ LoC codebases (i.e. what you could expect at a job). I've had some good experiences with high autonomy agents in smaller codebases and simpler systems but the coherency starts to fizzle out when the system gets complicated enough that thinking it through is the hard part as opposed to hammering out the code.

sensanaty|16 days ago

I'd estimate we're near a million LoC (will double check tomorrow, but wouldn't be surprised if it was over that to be honest). Huge monorepo, ~1500 engineers, all sorts of bespoke/custom tooling integrated, fullstack (including embedded code), a mix of languages (predominantly Java & JS/TS though).

In my case the AI is actively detrimental unless I hand-hold it through every single file it should look into, lest it dive into weird ancient parts of the codebase that bear no relevance to the task at hand. Letting the latest and "greatest" agents loose is just a recipe for frustration and disaster, despite lots of smart people trying their hardest to make these infernal tools be of any use at all. The best I've gotten out of it was some light Vue refactoring, but even then, despite AGENTS.md, RULES.md and all the other voodoo people say you should do, it's a crapshoot.

zozbot234|16 days ago

Ask the AI to figure out your code base (or self-contained portions of it, as applicable) and document its findings. Then correct and repeat. Over time, you end up with a scaffold in the form of internal documentation that will guide both humans and AIs in making more productive edits.

wenc|16 days ago

If you vector index your code base, agents can explore it without loading it into context. This is what Cursor and Roo and Kiro and probably others do. Claude Code uses string searches.

What also helps is getting it to generate docs for your code so that it has a map.

This is actually how humans understand a large code base too. We don’t hold a large code base in memory — we navigate it through docs and sampling bits of code.
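To make the contrast between string search and a vector index concrete, here is a toy sketch. It is not how Cursor, Roo, Kiro or Claude Code actually work: the "embedding" is a plain bag-of-words count rather than a learned model, and the file names and contents are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini code base: path -> source text (illustrative only).
FILES = {
    "billing/invoice.py": "def total_due(invoice): # sum line items and apply tax to the invoice",
    "auth/session.py": "def refresh_token(session): # renew the login session token",
    "legacy/report.py": "def old_report(): # ancient quarterly report export",
}

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(files):
    """Embed every file once, up front, so queries never load the sources."""
    return {path: embed(src) for path, src in files.items()}

def vector_search(index, query, k=1):
    """Return the k paths whose vectors are closest to the query vector."""
    q = embed(query)
    return sorted(index, key=lambda p: cosine(q, index[p]), reverse=True)[:k]

def string_search(files, needle):
    """Exact substring match, roughly what a grep-style tool does."""
    return [path for path, src in files.items() if needle in src]

index = build_index(FILES)
print(vector_search(index, "apply tax to invoice line items"))
print(string_search(FILES, "apply tax to invoice"))
print(string_search(FILES, "refresh_token"))
```

The loosely worded query still lands on the billing file through the index, while the exact-substring search only hits when the agent already knows the identifier it is looking for.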

enraged_camel|16 days ago

Around 250k here. The AI does an excellent job finding its way around, fixing complex bugs (and doing it correctly), doing intensive refactors and implementing new features using existing patterns.

christkv|16 days ago

Our codebase is well over 250k, and we have a hierarchy of notes for the modules, so we read only as much as we need for the job, with a base memory that explains how the notes work.
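One way to picture a note hierarchy like this, as a minimal sketch: the directory names, note texts and the `notes_for` helper below are all hypothetical, not christkv's actual setup. The idea is to read the base note plus the notes on the path down to the module being edited, and nothing else.

```python
from pathlib import PurePosixPath

# Hypothetical notes keyed by directory; "." plays the role of the base memory.
NOTES = {
    ".": "Base memory: each directory may carry a note; read from the root down to your module.",
    "services": "Services talk to each other only through the message bus.",
    "services/billing": "Billing owns invoices; money is stored in integer cents.",
    "services/auth": "Auth owns sessions and tokens.",
}

def notes_for(module_path, notes):
    """Collect notes from the root down to module_path, skipping sibling modules."""
    parts = PurePosixPath(module_path).parts
    chain = ["."] + ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return [notes[d] for d in chain if d in notes]

for note in notes_for("services/billing/invoices", NOTES):
    print(note)
```

Working on `services/billing/invoices` pulls in three notes (base, services, billing) and leaves the auth note out of context.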

servercobra|16 days ago

cloc says ours is ~350k LoC and agents are able to implement whole features from well designed requirement docs. But we've been investing in making our code more AI friendly, and things like Devin creating and using DeepWiki helps a lot too.

sarchertech|16 days ago

If you have agents that can implement entire features, why is it only 350k LoC? Each engineer should be cranking out at least 1 feature a week. If each feature is 1500-2000 lines, times 10 engineers, that's up to 20k lines a week.

If the answer is that the AI cranks out code faster than the team can digest and review it and faster than you can spec out the features, what’s the point? I can see completely shifting your workflow, letting skills atrophy, adopting new dependencies, and paying new vendors if it’s boosting your final output 5 or 10x.

But if it's a 20% speedup, is it worth it?
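The back-of-the-envelope numbers in this comment can be worked through as below. The 10 engineers, 1 feature per week and 1500-2000 lines per feature are sarchertech's assumptions, not measured figures, and the sketch counts only added lines (no deletions or rewrites).

```python
def weekly_lines(engineers, features_per_week, lines_per_feature):
    """Lines added per week if every engineer ships at the stated rate."""
    return engineers * features_per_week * lines_per_feature

def weeks_to_reach(total_lines, lines_per_week):
    """Weeks needed to accumulate total_lines at a constant weekly rate."""
    return total_lines / lines_per_week

low = weekly_lines(10, 1, 1500)   # 15000 lines per week
high = weekly_lines(10, 1, 2000)  # 20000 lines per week

print(low, high)
print(weeks_to_reach(350_000, high))           # 17.5 weeks
print(round(weeks_to_reach(350_000, low), 1))  # 23.3 weeks
```

Under those assumptions a team would write the equivalent of the whole 350k LoC code base in roughly 18 to 23 weeks, which is the gap the comment is pointing at.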