reset-password | 2 years ago

LLMs already have problems with fact vs fiction. I don't see how Reddit of all places has "valuable data" in that regard.

uptownfunk|2 years ago

I think the value is in the examples it provides of language.

nekoashide|2 years ago

Top upvoted comments can filter out the useless information and then it can be trained on actual data and refined.

Arrath|2 years ago

Except when top-voted comments are hivemind-approved 'funny' quips/responses, or replies to exercises in creative writing, like half the posts in relationshipadvice, iwantthemanager, nuclear/pettyrevenge, etc.

aydyn|2 years ago

Is this a joke that I'm missing? Top reddit posts are frequently trash filled with misinformation.

minimaxir|2 years ago

Many popular LLMs' training sets already include a large amount of Reddit comment data, which is (usually) cited in their respective papers.

surgical_fire|2 years ago

Reddit also has a problem with fact vs fiction.