
alexchamberlain | 6 months ago

I'm not sure how, and maybe some of the coding agents are doing this, but we need to teach the AI to use abstractions, rather than the whole code base, for context. We as humans don't hold the whole codebase in our heads, and we shouldn't expect the AI to either.


LinXitoW|6 months ago

They already do, or at least Claude Code does. It will search for a method name, then only load a chunk of that file to get the method signature, for example.

It will use the general information you give it to make educated guesses about where things are. If it knows the code is Vue-based and it has to do something with "users", it might search for "src/*/User.vue".

This is also the reason why the quality of your code makes such a large difference. The more consistent the naming of files and classes, the better the AI is at finding them.

sdesol|6 months ago

LLMs (in their current implementation) are probabilistic, so they really need the actual code to predict the most likely next tokens. That said, loading the whole code base can be a problem in itself, since other files may negatively affect the next token.

photon_lines|6 months ago

Sorry -- I keep seeing this argument being used, but I'm not entirely sure how it differs from most human thinking. Most human 'reasoning' is probabilistic as well, and we rely on 'associative' networks to ingest information. LLMs use association in a similar manner -- and not only that, they are capable of figuring out patterns from examples (just like humans are) -- read this paper for context: https://arxiv.org/pdf/2005.14165. In other words, they are capable of grokking patterns from simple data.

I've given various LLMs my requirements and they produced working solutions by simply 1) including all of the requirements in my prompt and 2) asking them to think through and 'reason' through their suggestions -- and the products have always been superior to what most humans have produced.

The 'LLMs are probabilistic predictors' comments keep appearing on threads, though, and I'm not quite sure I understand them. Yes, LLMs don't have 'human context', i.e. the data needed to understand human beings, since they have not directly been fed human experiences -- but for the most part, LLMs are not the simple 'statistical predictors' everyone brands them to be. You can see a thorough write-up I did of what GPT is / was here if you're interested: https://photonlines.substack.com/p/intuitive-and-visual-guid...

nomel|6 months ago

No, it doesn’t, nor do we. It’s why abstractions and documentations exist.

If you know what a function achieves, and you trust it to do that, you don’t need to see/hold its exact implementation in your head.

anthonypasq|6 months ago

The fact that we can't keep the repo in our working memory is a flaw of our brains. I can't see how you could possibly argue that, if you were somehow able to keep the entire codebase in your head, it would be a disadvantage.

SkyBelow|6 months ago

Information tradeoff. Even if you could keep the entire code base in memory, if something else has to be left out of memory, then you have to weigh the value of an abstraction versus whatever other information is lost. Abstractions also apply to the business domain and work the same way there.

You also have time tradeoffs. Like time to access memory and time to process that memory to achieve some outcome.

There is also quality. If keeping the entire code base in memory comes with some chance of confusion, while abstractions allow less chance of confusion, then the tradeoff may still favor abstractions.

Even if we assume a memory that has no limits, can access and process all information at constant speed, and suffers no quality loss, there are still communication limitations to worry about. Energy consumption is yet another.

siwatanejo|6 months ago

I do think AIs are already using abstractions, otherwise you would be submitting all the source code of your dependencies into the context.
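One way to make that concrete: instead of the dependency's full source, an agent can be handed a stub that keeps only signatures and docstring summaries. A rough sketch (my own illustration, not any agent's actual mechanism; `stub_module` is a hypothetical helper):

```python
import ast

def stub_module(source: str) -> str:
    """Reduce a Python module to its 'abstraction': function and class
    signatures plus first docstring lines, with implementations dropped."""
    tree = ast.parse(source)
    lines = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            doc = ast.get_docstring(node)
            summary = f"  # {doc.splitlines()[0]}" if doc else ""
            lines.append(f"def {node.name}({args}): ...{summary}")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}: ...")
    return "\n".join(lines)
```

Feeding the model the stub instead of the body is exactly the "trust the abstraction" move from upthread: the signature and one-line contract go into context, the implementation doesn't.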

TheOtherHobbes|6 months ago

I think they're recognising patterns, which is not the same thing.

Abstractions are stable, they're explicit in their domains, good abstractions cross multiple domains, and they typically come with a symbolic algebra of available operations.

Math is made of abstractions.

Patterns are a weaker form of cognition. They're implicit, heavily context-dependent, and there's no algebra. You have to poke at them crudely in the hope you can make them do something useful.

Using LLMs feels more like the latter than the former.

If LLMs were generating true abstractions they'd be finding meta-descriptions for code and language and making them accessible directly.

AGI - or ASI - may be able to do that some day, but it's not doing that now.

F7F7F7|6 months ago

There are a billion and one repos that claim to help do this. Let us know when you find one.

throwaway314155|6 months ago

/compact in Claude Code is effectively this.

brulard|6 months ago

Compact is a reasonable default way to do that, but quite often it discards important details. It's better to have CC store important details, decisions, and reasoning in a document where they can be reviewed and modified if needed.