oldcai | 1 year ago
That's a great conversation about Reflection 70B, but do you still have doubts about whether it's hype or a game-changer? The linked page leads to a playground for Reflection Llama 70B.
magicalhippo | 1 year ago
In the Physics of Language Models talk [1], he shows how an LLM trained to backtrack can give much better answers.
However, he also points out that backtracking has to be included in the initial training; you can't add it to a non-backtrack-trained model by fine-tuning it later.
So it seems like the way to go when training new models, but it has limited applicability to models that have already been trained.
[1]: https://www.youtube.com/watch?v=yBL7J0kgldU
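To make the idea concrete: "training to backtrack" here means the training data itself contains deliberate wrong steps, each immediately followed by a special retraction token and the corrected step, so the model learns to notice and undo its own mistakes during generation. The sketch below is a toy illustration of how such data could be constructed; the [BACK] token name and the inject_backtracks helper are assumptions for illustration, not the talk's actual code.

    import random

    # Hypothetical special token the model would learn to emit to retract
    # the step it just produced (name assumed for illustration).
    BACK = "[BACK]"

    def inject_backtracks(steps, wrong_steps, p=0.5):
        """Build one training sequence in which some reasoning steps are
        deliberately wrong and immediately followed by the retraction
        token plus the correct step. The model trained on such sequences
        can learn to emit [BACK] and retry when it detects an error."""
        seq = []
        for correct, wrong in zip(steps, wrong_steps):
            if random.random() < p:
                seq.extend([wrong, BACK, correct])  # mistake, retraction, fix
            else:
                seq.append(correct)                 # clean step, no backtrack
        return " ".join(seq)

    # Toy example: an arithmetic chain with errors injected at random.
    random.seed(0)
    steps = ["2+3=5", "5*4=20", "20-7=13"]
    wrong = ["2+3=6", "5*4=24", "20-7=14"]
    print(inject_backtracks(steps, wrong))
    # e.g. "2+3=6 [BACK] 2+3=5 5*4=20 20-7=14 [BACK] 20-7=13"

This also makes the comment's caveat intuitive: emitting and acting on a retraction token is a behavior the model has to see throughout pretraining; a model that never saw such sequences has no mechanism for it, which is why bolting it on with later fine-tuning reportedly doesn't work.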