top | item 36255433

abluecloud | 2 years ago

The thing is, along with all the large subreddits, it's these niche subreddits that have helped train all these LLMs to do the things they can do.

If Reddit thinks that its content is king, then closing the subreddits that help generate that content is not ideal for them.

visarga|2 years ago

Do we know for sure which LLMs have used Reddit comments in training? I want to know if my comment history is in the corpus.

CamelCaseName|2 years ago

Yes, absolutely. Sam Altman has come out and said it, although specifically he said that social media wasn't of any particular importance for training data.

This can also be seen when you mention davidjl, a user who was super into r/counting. There was a thread about that yesterday, I believe.

OpenAI thanks you for your Reddit contributions.