top | item 37848342

jshmrsn | 2 years ago

An idea I often hear in talks about LLMs is that training on larger (assuming constant quality) and more varied data leads to the emergence of greater generalization and reasoning (if I may use that word) across task categories. While a model's general quality correlates somewhat predictably with the amount of training, the amount of training at which specific generalization and reasoning capabilities emerge is much less predictable.
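To make the contrast concrete, here is a minimal sketch (purely illustrative constants and functions, not fit to any real model): overall loss follows a smooth power law in training compute, while accuracy on a hypothetical specific task is a sharp function of that loss, so it looks flat and then jumps "emergently" even though the underlying loss curve is perfectly predictable.

```python
import math

def loss(compute, a=10.0, alpha=0.3):
    # Hypothetical smooth power-law scaling: overall loss falls
    # predictably as training compute grows (illustrative constants).
    return a * compute ** -alpha

def task_accuracy(compute, threshold=1.0, sharpness=20.0):
    # Hypothetical "emergent" capability: accuracy on one specific task
    # stays near zero until loss drops below a threshold, then jumps.
    return 1.0 / (1.0 + math.exp(sharpness * (loss(compute) - threshold)))

for c in [1e2, 1e4, 1e6, 1e8]:
    print(f"compute={c:.0e}  loss={loss(c):.3f}  task_acc={task_accuracy(c):.3f}")
```

Between the first two compute levels the loss only falls from about 2.5 to 0.6, yet the task accuracy flips from near 0 to near 1; where that flip happens depends on the (unknown) threshold, which is the unpredictable part.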

No comments yet.