
eigenblake | 13 days ago

Biggest limitation I see in this paper: the framing. Any time you have a lot of proprietary knowledge, or you've just worked out the right solution and it isn't readily available from the model's parametric knowledge, that's when you should add a skill. Wrap it in a CLI that's easy to inspect. You don't need to store the skill's whole help text either; the model can inspect it and its subcommands on demand.
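A minimal sketch of that pattern, in Python with argparse. Everything here is hypothetical (the `acme-skill` tool name, its subcommands, and their flags are invented for illustration); the point is that the model only needs to know the entry point exists and can discover usage via `--help` rather than carrying the full help text in context.

```python
# Hypothetical "skill as CLI" sketch: proprietary knowledge wrapped in
# argparse subcommands that an agent can discover by running `--help`.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="acme-skill",  # invented name for a company-internal tool
        description="Query Acme's proprietary deploy system.",
    )
    sub = parser.add_subparsers(dest="command", required=True)

    status = sub.add_parser("status", help="Show deploy status for a service.")
    status.add_argument("service", help="Service name, e.g. billing-api.")

    rollback = sub.add_parser("rollback", help="Roll a service back one release.")
    rollback.add_argument("service", help="Service name to roll back.")
    rollback.add_argument(
        "--dry-run",
        action="store_true",
        help="Print the rollback plan without executing it.",
    )
    return parser


if __name__ == "__main__":
    # The model inspects the skill the same way a human would:
    # `acme-skill --help`, then `acme-skill rollback --help`, etc.
    print(build_parser().format_help())
```

Each subcommand's help string is the only "documentation" the model ever needs to load, and only when it asks for it.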

Reality doesn't frame this as skill versus no skill; often it doesn't give us a choice at all. Either you make a skill for your company's proprietary system, or your model has to figure it out from scratch every time by searching wikis or reading code. Used right, skills are a compression mechanism: instead of the model having to gather all of these files dynamically, it can simply run the skill statically.

To steel-man the paper: it is worth asking whether you should try to code something up first or reach for a skill first. It may well be valid to say try it yourself, and if you can't work it out in 5 minutes, install a skill. But there's a meta point about skills as software, where you deduplicate the effort of solving regressions.

For a reductio ad absurdum: if self-generated skills with no additional context _didn't_ eventually level off in performance, then we could reach AGI by making one big skill that keeps growing and solving harder and harder tasks, including improving the capability of its own skill-builder skill, all without embedding any signals from the environment or needing to interface with the real world at all.
