phreeza | 3 days ago
But this misses exactly the gap the OP seems to have: going from a next-token predictor (a language model in the classical sense) to an instruction-finetuned, RLHF-ed and "harnessed" tool.
js8 | 3 days ago
The book has a sequel: https://www.manning.com/books/build-a-reasoning-model-from-s...
It will give you an answer to the extent anybody can.