(no title)
tshadley | 1 year ago
Combining programs should be straightforward for DNNs, ordering, mixing, matching concepts by coordinates and arithmetic in learned high-dimensional embedded-space. Inference-time combination is harder since the model is working with tokens and has to keep coherence over a growing CoT with many twists, turns and dead-ends, but with enough passes can still do well.
The logical next step to improvement is test-time training on the growing CoT, using reinforcement-fine-tuning to compress and organize the chain-of-thought into parameter-space--if we can come up with loss functions for "little progress, a lot of progress, no progress". Then more inference-time with a better understanding of the problem, rinse and repeat.
No comments yet.