tehsauce | 11 months ago

There has been some good research published on how RLHF, i.e., aligning a model to human preferences, easily introduces mode collapse and bias. For example, given a prompt like "Choose a random number", a base pretrained model can give relatively random answers, but after fine-tuning to produce responses humans like, it becomes heavily biased toward answers like "7" or "42".
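One rough way to quantify that kind of collapse is to compare the Shannon entropy of the answers against the uniform case. This is only a sketch: the two sample lists below are invented for illustration and are not real model output, and `entropy_bits` is a hypothetical helper name.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy, in bits, of the empirical distribution of `samples`."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up answer sets for "choose a random number" (assumptions, not measurements).
uniform_like = list(range(1, 11)) * 10        # each of 1..10 appears 10 times
collapsed = [7] * 80 + [3] * 10 + [42] * 10   # heavily concentrated on 7

print("uniform-like:", entropy_bits(uniform_like))
print("collapsed:   ", entropy_bits(collapsed))
```

A perfectly uniform pick from 10 options has log2(10) bits of entropy; the more the answers pile up on a few favorites, the closer the figure drops toward zero.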

robwwilliams|11 months ago

I assume 42 is a joke from deep history and The Hitchhiker’s Guide. Pretty amusing to read the Wikipedia entry:

https://en.wikipedia.org/wiki/42_(number)

sedatk|11 months ago

Douglas Adams picked 42 randomly though. :)

moffkalast|11 months ago

It's very funny that people hold the autoregressive nature of LLMs against them, while being far more hardline autoregressive themselves. It's just not consciously obvious.

antihipocrat|11 months ago

I wonder whether we hold LLMs to a different standard because we have a long-term, reinforced expectation that a computer produces an exact result?

One of my first teachers told me that a computer won't ever output anything wrong; it will produce a result according to the instructions it was given.

LLMs follow this principle as well. It's just that when we assess the quality of their output, we incorrectly compare it to the deterministic alternative, and that isn't really a valid comparison.

absolutelastone|11 months ago

I think people tend to just not understand what autoregressive methods are capable of doing generally (i.e., basically anything an alternative method can do), and worse they sort of mentally view it as equivalent to a context length of 1.

aidos|11 months ago

Why is that? Whenever I’m giving examples I almost always use 7, something ending in a 7, or something in the 70s.

p1necone|11 months ago

1 and 10 are on the boundary, that's not random so those are out.

5 is exactly halfway, that's not random enough either, that's out.

2, 4, 6, 8 are even and even numbers are round and friendly and comfortable, those are out too.

9 feels too close to the boundary, it's out.

That leaves 3 and 7, and 7 is more than 3 so it's got more room for randomness in it right?

Therefore 7 is the most random number between 1 and 10.
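For fun, the elimination above can be written out step by step. This is just a literal sketch of the reasoning for the range 1 to 10; the function name is made up.

```python
def most_random_feeling_number():
    """Replay the elimination steps above for the range 1..10."""
    candidates = set(range(1, 11))
    candidates -= {1, 10}                                   # on the boundary
    candidates -= {5}                                       # exactly halfway
    candidates = {n for n in candidates if n % 2 != 0}      # evens are too friendly
    candidates -= {9}                                       # too close to the boundary
    # Only 3 and 7 remain; take the larger one ("more room for randomness").
    return max(candidates)


print(most_random_feeling_number())
```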

da_chicken|11 months ago

The theory I've heard is that the more prime a number is, the more random it feels. 13 feels more awkward and weird, and it doesn't come up naturally as often as 2 or 3 do in everyday life. It's rare, so it must be more random! I'll give you the most random number I can think of!

People tend to avoid extremes, too. If you ask for a number between 1 and 10, people tend to pick something in the middle. Somehow, the values at the ends of the range seem less likely.

Additionally, people tend to avoid numbers that are in other ranges. Ask for a number from 1 to 100, and it just feels wrong to pick a number between 1 and 10. They asked for a number between 1 and 100. Not this much smaller range. You don't want to give them a number they can't use. There must be a reason they said 100. I wonder if the human RNG would improve if we started asking for numbers between 21 and 114.

Ethee|11 months ago

Veritasium actually made a video on this concept about a year ago: https://www.youtube.com/watch?v=d6iQrh2TK98

d4mi3n|11 months ago

My guess is that we bias towards numbers with cultural or personal significance. 7 is lucky in Western cultures and is religiously significant (see https://en.wikipedia.org/wiki/7#Culture). 42 is culturally significant in science fiction, though that's a lot more recent. There are probably other examples, but I imagine the mean converges on numbers with multiple cultural touchpoints.

d0liver|11 months ago

I like prime numbers. Non-primes always feel like they're about to fall apart on me.

mynameismon|11 months ago

Can you share any links about this?

Shorel|11 months ago

They choose 37 =)

thechao|11 months ago

Which is weird, because I thought we'd all agreed that the random number was 4?

https://xkcd.com/221/