(no title)
Majromax | 20 days ago
This is an expected outcome of how LLMs handle large problems. One of the "scaling" results is that the probability of success falls as the problem's size / length / duration grows (leading to headlines like "AI can now automate tasks that take humans [1 hour/etc]").
If the problem is broken down, however, then it's no longer a single problem but a series of sub-problems. If:
* The acceptance criteria are robust, so that success or failure can be reliably and automatically determined by the model itself,

* The specification is correct, in that the full system will work as-designed if the sub-parts are individually correct, and

* The parts are reasonably independent, so that complete components can be treated as a 'black box', without implementation detail polluting the model's context,
... then one can observe a much higher overall success rate by taking repeated high-probability shots (on small problems) rather than long-odds one-shots.
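As a rough back-of-the-envelope sketch of why (made-up numbers, just to show the shape of the effect): model the monolithic task as n dependent steps that each succeed with probability p, versus the same work split into k independent sub-problems, each of which can be retried a few times against an automatic acceptance check.

```python
# Illustrative probability sketch (invented numbers, not measured data):
# a long one-shot task vs. the same work split into retryable sub-problems.

def one_shot(p_step: float, n_steps: int) -> float:
    """Success probability of a monolithic task of n dependent steps."""
    return p_step ** n_steps

def decomposed(p_sub: float, n_subproblems: int, retries: int) -> float:
    """Success probability when each sub-problem has a reliable acceptance
    check and can be independently retried up to `retries` times."""
    p_eventually = 1 - (1 - p_sub) ** retries
    return p_eventually ** n_subproblems

if __name__ == "__main__":
    # One big problem: 20 steps, each 90% likely to go right -> ~12% overall.
    print(f"one-shot:   {one_shot(0.90, 20):.2%}")
    # Same work as 20 sub-problems, 3 attempts each -> ~98% overall.
    print(f"decomposed: {decomposed(0.90, 20, 3):.2%}")
```

The gap rests entirely on the bullets above: the retry loop only helps if the acceptance check is trustworthy and the sub-problems really are independent of each other's internals.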
To be fair, this same basic intuition is also true for humans, but the boundaries are a lot fuzzier because we have genuine long-term memory and a lifetime of experience with conceptual chunking. Nobody is keeping a million-line codebase in their working memory.