There has been some good research published on how RLHF, i.e. aligning models to human preferences, easily introduces mode collapse and bias. For example, given a prompt like "Choose a random number", the base pretrained model can give relatively random answers, but after fine-tuning to produce responses humans like, it becomes heavily biased towards answers like "7" or "42".
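A rough way to put a number on this, assuming you have collected a batch of replies to that prompt from each model: compare the Shannon entropy and the share of the most common answer. The helper names and both sample lists below are made up for illustration; they are not measured model outputs.

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Shannon entropy (in bits) of the empirical distribution of samples."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def top_share(samples):
    """Fraction of samples taken up by the single most common answer."""
    counts = Counter(samples)
    return counts.most_common(1)[0][1] / len(samples)

# Hypothetical data: a spread-out set of answers vs. a collapsed one.
uniform_like = list(range(1, 11)) * 10          # each of 1..10, ten times
collapsed = [7] * 70 + [3] * 20 + [42] * 10     # mostly "7", some "42"

for name, data in [("uniform_like", uniform_like), ("collapsed", collapsed)]:
    print(f"{name}: entropy={entropy_bits(data):.2f} bits, "
          f"top answer share={top_share(data):.0%}")
```

Lower entropy and a higher top-answer share would both point towards collapse, though a real comparison would need many samples per model at a fixed temperature.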
robwwilliams|11 months ago
https://en.wikipedia.org/wiki/42_(number)
sedatk|11 months ago
moffkalast|11 months ago
antihipocrat|11 months ago
One of my first teachers told me that a computer won't ever output anything wrong; it will produce a result according to the instructions it was given.
LLMs follow this principle as well. It's just that when we assess the quality of their output, we compare it to the deterministic alternative, and that isn't really a valid comparison.
absolutelastone|11 months ago
aidos|11 months ago
p1necone|11 months ago
5 is exactly halfway, that's not random enough either, that's out.
2, 4, 6, 8 are even and even numbers are round and friendly and comfortable, those are out too.
9 feels too close to the boundary, it's out.
That leaves 3 and 7, and 7 is more than 3 so it's got more room for randomness in it right?
Therefore 7 is the most random number between 1 and 10.
da_chicken|11 months ago
People tend to avoid extremes, too. If you ask for a number between 1 and 10, people tend to pick something in the middle. Somehow, the values at the ends of the range seem less likely.
Additionally, people tend to avoid numbers that are in other ranges. Ask for a number from 1 to 100, and it just feels wrong to pick a number between 1 and 10. They asked for a number between 1 and 100. Not this much smaller range. You don't want to give them a number they can't use. There must be a reason they said 100. I wonder if the human RNG would improve if we started asking for numbers between 21 and 114.
Ethee|11 months ago
d4mi3n|11 months ago
d0liver|11 months ago
mynameismon|11 months ago
Shorel|11 months ago
thechao|11 months ago
https://xkcd.com/221/