top | item 44626516

oceanparkway | 7 months ago

I think the "math" on reliability over steps will play out differently than described here in the long term, because getting new factual input from the real world should improve the reliability of the end state, and we have all seen agentic systems produce that behavior at least sometimes (e.g., a test failure prompts Claude Code to refactor correctly).

Which term in this equation currently compounds faster, and under what circumstances, is a good question. But presenting agentic ability as always-flawed reasoning that makes long-horizon task execution impossible isn't right. Humans are flawed too, and require long, drawn-out multi-step thinking to reach correct answers; interacting with the world outside the mind and getting feedback during task execution typically raises the chance of arriving at the correct answer in the end.

I'd agree that the agentic math isn't great at the moment, but if it's possible to reduce hallucinations, or to raise the strength and frequency of real-world feedback's effect on the model, you could see this playing out differently, perhaps quite soon. There are at least a couple of examples where "we're already there".
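To make the compounding argument concrete, here's a minimal sketch (my own illustration, not from the article under discussion; the function names, the per-step reliability `p`, and the feedback-detection rate `r` are all assumed for the example). Without feedback, per-step reliability compounds multiplicatively over `n` steps; a feedback signal that catches a fraction of failures and allows a retry raises the effective per-step success rate, which compounds just as fast in the other direction:

```python
# Assumed toy model: an agent runs n steps, each succeeding with
# probability p. With feedback, a failed step is detected with
# probability r and retried once.

def success_no_feedback(p: float, n: int) -> float:
    """Chance that all n steps succeed with no external feedback."""
    return p ** n

def success_with_feedback(p: float, n: int, r: float) -> float:
    """Each failed step is caught with probability r and retried once,
    so the effective per-step success rate rises before compounding."""
    p_eff = p + (1 - p) * r * p  # succeed outright, or fail, get caught, retry
    return p_eff ** n

print(success_no_feedback(0.95, 50))         # ≈ 0.077
print(success_with_feedback(0.95, 50, 0.8))  # ≈ 0.55
```

The point of the toy model: a modest feedback mechanism (here, catching 80% of failures) moves a 50-step task from near-certain failure to better-than-even odds, which is why the feedback term can dominate the hallucination term.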
