DougN7|8 days ago
Maybe I’m too naive, but I can never tell when something is written by AI. If it works by predicting the next most likely token, doesn’t that mean it has encountered the patterns you’re picking out in lots and lots of text written by humans? Please educate me if I’m wrong.
JCharante|7 days ago
In pre-training data, yes.
There are also post-training datasets, which are used to adjust the weights to conform to human preference. These datasets are created by groups of thousands of people all following a 40-page guide, and these guides include examples. People over-index on these examples, so sentences with these structures are overrepresented in the datasets used for post-training.
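A toy sketch of that skew, not how real post-training works: actual post-training updates weights by gradient descent, while this just counts bigrams. The corpora and the `next_token_probs` helper are made up for illustration. It only shows that a handful of near-identical samples can change which continuation looks "most likely".

```python
from collections import Counter

def next_token_probs(corpus, context):
    """Estimate P(next token | context) from raw bigram counts (toy model)."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            if prev == context:
                counts[nxt] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tok: n / total for tok, n in counts.items()}

# Hypothetical "pre-training" text: varied human phrasing.
pretraining = [
    "it is not a bug",
    "it is not a feature",
    "it is not just a bug",
    "it is not the answer",
]

# Hypothetical "post-training" samples: many writers copying one guide example.
post_training = ["it is not just a tool"] * 6

before = next_token_probs(pretraining, "not")
after = next_token_probs(pretraining + post_training, "not")

print("before:", before)
print("after: ", after)
```

In this toy data, "just" goes from a minority continuation after "not" to the dominant one once the copied samples are added, even though every individual sentence is something a human could plausibly write.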
unknown|8 days ago
[deleted]
jinushaun|7 days ago
matwood|7 days ago