tshadley | 1 year ago

I always get the feeling he's subconsciously inserting a "magical" step here with the reference to "synthesis", invoking a kind of subtle dualism where human intelligence is just different from, and mysteriously better than, hardware intelligence.

Combining programs should be straightforward for DNNs: ordering, mixing, and matching concepts by coordinates and arithmetic in a learned high-dimensional embedding space. Inference-time combination is harder, since the model is working with tokens and has to stay coherent over a growing CoT (chain of thought) with many twists, turns, and dead ends, but with enough passes it can still do well.
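To make "coordinates and arithmetic" concrete, here is a minimal sketch of combining concepts by vector arithmetic. The four-dimensional vectors are hand-made for illustration (axes: royalty, male, female, fruit), not taken from any trained model, and the helper names `combine` and `nearest` are hypothetical.

```python
import math

# Hand-made toy embeddings (an assumption, not learned weights).
# Axes: royalty, male, female, fruit.
EMBEDDINGS = {
    "king":  [1.0, 1.0, 0.0, 0.0],
    "queen": [1.0, 0.0, 1.0, 0.0],
    "man":   [0.0, 1.0, 0.0, 0.0],
    "woman": [0.0, 0.0, 1.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}


def combine(plus, minus):
    """Add the vectors for `plus` and subtract the vectors for `minus`."""
    dim = len(next(iter(EMBEDDINGS.values())))
    out = [0.0] * dim
    for word in plus:
        out = [o + x for o, x in zip(out, EMBEDDINGS[word])]
    for word in minus:
        out = [o - x for o, x in zip(out, EMBEDDINGS[word])]
    return out


def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(vec, exclude=()):
    """Return the vocabulary word whose embedding is most similar to `vec`."""
    candidates = [w for w in EMBEDDINGS if w not in exclude]
    return max(candidates, key=lambda w: cosine(vec, EMBEDDINGS[w]))


query = combine(plus=["king", "woman"], minus=["man"])
print(nearest(query, exclude={"king", "woman", "man"}))
```

In a real network the vectors are learned and the arithmetic is only approximate, so treat this as an illustration of the idea rather than a claim about any particular model.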

The logical next step is test-time training on the growing CoT, using reinforcement fine-tuning to compress and organize the chain of thought into parameter space, if we can come up with loss functions for "little progress, a lot of progress, no progress". Then spend more inference time with a better understanding of the problem; rinse and repeat.
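A toy sketch of that loop, under heavy assumptions: a single float stands in for the model's parameters, "inference" is just probing one step either side of the current estimate, and the three-level signal is a hypothetical `progress_reward` with arbitrary thresholds. None of this is real reinforcement fine-tuning; it only shows the shape of "explore, grade the progress, fold it into the parameters, repeat".

```python
def progress_reward(prev_score, new_score, small=0.01, large=0.2):
    """Three-level signal: 0.0 = no progress, 0.5 = a little, 1.0 = a lot.

    The thresholds are arbitrary illustrative values.
    """
    gain = new_score - prev_score
    if gain >= large:
        return 1.0
    if gain >= small:
        return 0.5
    return 0.0


def solve(target, rounds=20, step=1.0):
    """Alternate a toy 'inference' pass with a toy 'test-time training' update."""
    estimate = 0.0  # stand-in for model parameters

    def score_of(x):
        # Higher is better; 0.0 means the problem is solved.
        return -abs(target - x)

    score = score_of(estimate)
    for _ in range(rounds):
        # "Inference": explore around the current understanding
        # (a stand-in for a chain of thought with dead ends).
        candidates = [estimate - step, estimate + step]
        best = max(candidates, key=score_of)

        # Grade how much progress the exploration made.
        reward = progress_reward(score, score_of(best))

        # "Test-time training": move the parameters toward what was
        # found, scaled by the reward; no progress means no update.
        estimate += reward * (best - estimate)
        score = score_of(estimate)
    return estimate


print(solve(5.0))
```

The hard part the comment points at is exactly what this toy skips: here the score is known exactly, whereas a real system would need a learned or hand-designed estimate of progress.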
