top | item 46990954

(no title)

miligauss | 17 days ago

It's a more of a black box with claude, at least with this you see the proof strategy and mistakes made by the model when it decomposes the problem. I think instead of Ralph looping you get something that is top-down. If models were smarter and context windows bigger i am sure complex tasks like this one would be simpler, but braking it down into sub agents and having a collective --"we already tried this strategy and it backtracked"-- intelligence is a nice way to scope a limited context window to an independent sub problem.

discuss

raphaelmolly8|17 days ago

The Lean angle here is really interesting: most multi-agent demos dodge hard verification, but tying each agent’s output to makes the feedback loop objective. Curious how you’re handling goal-claim conflicts/duplication when two agents find competing tactic sequences for the same subgoal—do you keep both in memory with some ranking signal (time-to-verify, proof term size, etc.)?

austinbaggio|17 days ago

We use TTL-based claim locks so only one agent works on one goal at a time.

Failed strategies + successful tactics all get written to shared memory, so if a claim expires and a new agent picks it up, it sees everything the previous agent tried.

Ranking is first-verified-wins.

For competing decomposition strategies, we backtrack: if children fail, the goal reopens, and the failed architecture gets recorded so the next attempt avoids it.