top | item 47169843

normalocity | 4 days ago

> ... software can, in a very real sense, become self-improving.

This is referring to the software the agent is working on, not the agent.

> This creates a continuous feedback loop.

This is referring to the feedback loop in which the agent effectively compresses learnings from a previous chat session into documentation that it can use to bootstrap future sessions, or sub-agents, more effectively. This isn't about altering the agent. It's about creating a feedback loop between the agent and the software it's working on, to improve the agent's ability to take on the next task or delegate a sub-task to a sub-agent.

> "... the type of self-improvement we’re talking about is far more pragmatic and much less dangerous."

This is a statement about the agent playing a part in maintaining not just the code, but other artifacts around the code. It is not about the agent improving itself or altering itself.

selridge | 3 days ago

I think we have to invent that distinction ourselves, which is notable since the article has MANY opportunities to state it clearly. Instead we are given a picture where the improvement of the agent and the software (docs included here) is a LOOP, and to make the loop plausible we need to imagine a kind of learning in agents that doesn't exist.

That doesn't mean your agent won't improve with a better onboarding regime, but that's a unidirectional process. You can insinuate things into context, but that's not automatically 'learned': it can be lost at compaction, and it will be discarded when the session ends. An agent that is onboarded might write better onboarding docs, that's true! But what's actually happening is "agents are onboarded mindfully with project docs, then write project docs, which are used to onboard." That's a real lift, but it's best expressed as "we should have been writing good docs and tests all along, but that shit was exhausting; now robots do it."

Don't get me wrong, a fractal onboarding regime is the way. It's just...not a self-improving loop unless you allow a contextual latch to stand in for learning.