
I wonder if the 70% vs 80% "probably" problem comes from cultural differences between anglophone countries. The human datasets that were available were mostly American, with some from Western Europe/NATO. Notably missing is India, which, simply by population, I'd expect to account for a significant chunk of the English-language writing available on the open internet (and thus fed into LLM training sets).

The other phenomenon I would love to test is whether the act of surveying people affected their declared odds. Not sure how to get good numbers on that, but I could see the LLM vs surveyed-human discrepancy arising from people using "probably" differently in their everyday writing than when asked point-blank what "probably" means.
