saghul | 1 year ago

Kinda surprised the 8B gets this wrong: "what's heavier a kilo of steel or two kilos of feathers?" GPT-3.5 gets it wrong too. The 70B model does get it right, so does GPT-4.

pennomi|1 year ago

My pet question is “Which weighs more, 1000cm^3 of styrofoam or 1cm^3 of tungsten?”

Most LLMs go through the calculation and find that the styrofoam is heavier, then confidently announce that the tungsten weighs more. Strange, considering they'll say something very nearly like “The styrofoam weighs 50 g and the tungsten weighs 19.3 g, therefore the tungsten is heavier.”
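For reference, the arithmetic behind those two figures can be sketched in a few lines. The styrofoam density of 0.05 g/cm^3 is an assumption back-derived from the 50 g quoted above (real foam densities vary); 19.3 g/cm^3 is the commonly cited density of tungsten. The helper name `mass_g` is just illustrative.

```python
# Densities in g/cm^3.
# Assumption: styrofoam value is inferred from the 50 g figure in the comment.
STYROFOAM_DENSITY = 0.05
TUNGSTEN_DENSITY = 19.3


def mass_g(volume_cm3: float, density_g_per_cm3: float) -> float:
    """Return the mass in grams of a given volume of material."""
    return volume_cm3 * density_g_per_cm3


styrofoam = mass_g(1000, STYROFOAM_DENSITY)  # 1000 cm^3 of styrofoam
tungsten = mass_g(1, TUNGSTEN_DENSITY)       # 1 cm^3 of tungsten

# Compare the two masses directly; the larger mass is the heavier item.
heavier = "styrofoam" if styrofoam > tungsten else "tungsten"
print(f"styrofoam: {styrofoam:.1f} g, tungsten: {tungsten:.1f} g, heavier: {heavier}")
```

Under those assumed densities the comparison comes out in favor of the styrofoam, which is the step the models get backwards.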

fransje26|1 year ago

That's not how it responded to my query.

> What's heavier? 1 kg of lead or 2 kg of feathers?

That's a classic trick question!

The answer is: 2 kg of feathers.

Why? Because 2 kg is heavier than 1 kg, regardless of the material. The density of the material doesn't matter in this case, only the weight. So, 2 kg of feathers would weigh more than 1 kg of lead.

fransje26|1 year ago

I stand corrected. I was inadvertently on the 70b model.

wongarsu|1 year ago

That seems to be the general experience. Maybe 8B parameters are just too few to achieve higher-level reasoning.

brrrrrm|1 year ago

Maybe depth rather than parameter count.