top | item 44614269

z7 | 7 months ago

Some previous predictions:

In 2021 Paul Christiano wrote he would update from 30% to "50% chance of hard takeoff" if we saw an IMO gold by 2025.

He thought there was an 8% chance of this happening.

Eliezer Yudkowsky said "at least 16%".

Source:

https://www.lesswrong.com/posts/sWLLdG6DWJEy3CH7n/imo-challe...


sigmoid10|7 months ago

While I usually enjoy seeing these discussions, I think they really push the limits of the usefulness of Bayesian statistics. If one dude says the chance of an outcome is 8% and another says it's 16%, and the outcome does occur, they were both pretty wrong, even though the one who guessed a few percent higher might seem to have had the better belief system. Now, if one of them had said 90% while the other said 8% or 16%, then we should pay close attention to what they are saying.

AlphaAndOmega0|7 months ago

The person who guessed 16% would have a lower Brier score (lower is better) and someone who estimated 100%, beyond being correct, would have the lowest possible value.
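A minimal sketch of that calculation (the 0.08/0.16/1.00 forecasts are the ones discussed above; the event is scored as having occurred):

```python
# Brier score for a single binary event: (forecast - outcome)^2, lower is better.
def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and the 0/1 outcome."""
    return (forecast - outcome) ** 2

# The three forecasts discussed above, scored against the event occurring (outcome = 1):
for p in (0.08, 0.16, 1.00):
    print(f"forecast {p:.2f} -> Brier {brier_score(p, 1):.4f}")
# forecast 0.08 -> Brier 0.8464
# forecast 0.16 -> Brier 0.7056
# forecast 1.00 -> Brier 0.0000
```

So 16% scores better than 8%, and a confident 100% would have scored a perfect 0.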

zeroonetwothree|7 months ago

A 16% or even 8% event happening is quite common so really it tells us nothing and doesn’t mean either one was pretty wrong.

grillitoazul|7 months ago

From a mathematical point of view there are two factors: (1) the prior predictive capability of the human agents and (2) the acceleration of the predicted event. Examining the result under such a model, we conclude:

The greater the prior predictive power of the human agents, the greater the posterior acceleration of progress in LLMs (math capability).

Here we are assuming that the increase in training data is not the main explanatory factor.

This example is the germ of a general framework for assessing acceleration in LLM progress, and I think applying it to many data points could give us valuable information.

tunesmith|7 months ago

The whole point is to make many such predictions and experience many outcomes. The goal is for your 70% predictions to be correct 70% of the time. We all have a gap between how confident we are and how often we're correct. Calibration, which can be measured by making many predictions, is about reducing that gap.
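A rough sketch of what measuring calibration looks like (the forecasts and outcomes here are made-up illustrative data):

```python
# Bucket many (stated probability, outcome) pairs and compare each bucket's
# stated confidence with its empirical hit rate.
from collections import defaultdict

def calibration(preds):
    """preds: list of (stated_probability, outcome 0/1) pairs."""
    buckets = defaultdict(list)
    for p, outcome in preds:
        buckets[round(p, 1)].append(outcome)  # bucket to the nearest 10%
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}

# A well-calibrated forecaster's 0.7 bucket should hit ~70% of the time.
sample = [(0.7, 1), (0.7, 1), (0.7, 0), (0.7, 1), (0.3, 0), (0.3, 1), (0.3, 0)]
print(calibration(sample))  # {0.3: 0.333..., 0.7: 0.75}
```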

fxwin|7 months ago

If I predict that my next die roll will be a 5 with 16% certainty and I do indeed roll a 5, was my prediction wrong?

davidclark|7 months ago

The correctness of 8%, 16%, and 90% are all equally unknown since we only have one timeline, no?

exegeist|7 months ago

Impressive prediction, especially pre-ChatGPT. Compare to Gary Marcus 3 months ago: https://garymarcus.substack.com/p/reports-of-llms-mastering-...

We may certainly hope Eliezer's other predictions don't prove so well-calibrated.

rafaelero|7 months ago

Gary Marcus is so systematically and overconfidently wrong that I wonder why we keep talking about this clown.

causal|7 months ago

These numbers feel kind of meaningless without any work showing how he got to 16%

dcre|7 months ago

I do think Gary Marcus says a lot of wrong stuff about LLMs but I don’t see anything too egregious in that post. He’s just describing the results they got a few months ago.

shuckles|7 months ago

My understanding is that Eliezer more or less thinks it's over for humans.

andrepd|7 months ago

Context? Who are these people and what are these numbers and why shouldn't I assume they're pulled from thin air?

sailingparrot|7 months ago

> why shouldn't I assume they're pulled from thin air?

You definitely should assume they are. They are rationalists; the modus operandi is to pull stuff out of thin air and slap a single-digit-precision percentage prediction on it to make it seem grounded in science and well thought out.

c1ccccc1|7 months ago

You should basically assume they are pulled from thin air. (Or more precisely, from the brain and world model of the people making the prediction.)

The point of giving such estimates is mostly an exercise in getting better at understanding the world, and a way to keep yourself honest by making predictions in advance. If someone else consistently gives higher probabilities to events that ended up happening than you did, then that's an indication that there's space for you to improve your prediction ability. (The quantitative way to compare these things is to see who has lower log loss [1].)

[1] https://en.wikipedia.org/wiki/Cross-entropy
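The log-loss comparison mentioned above, applied to the two forecasts in question (scored against the event occurring):

```python
import math

def log_loss(forecast: float, outcome: int) -> float:
    """Negative log probability assigned to the realized outcome; lower is better."""
    p = forecast if outcome == 1 else 1 - forecast
    return -math.log(p)

for p in (0.08, 0.16):
    print(f"forecast {p:.2f} -> log loss {log_loss(p, 1):.3f}")
# forecast 0.08 -> log loss 2.526
# forecast 0.16 -> log loss 1.833
```

The 16% forecaster takes a smaller penalty, but both are heavily penalized relative to someone who had said 90%.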

ohdeargodno|7 months ago

>Who are these people

Clowns, mostly. Yudkowsky in particular, whose only job today seems to be making awful predictions and letting lesswrong eat it up when one out of a hundred ends up coming true, solidifying his position as AI-will-destroy-the-world messiah. They make money from these outlandish takes, and more money when you keep talking about them.

It's kind of like listening to the local drunkard at the bar who once in a while ends up predicting which team is going to win in football in between drunken and nonsensical rants, except that for some reason posting the predictions on the internet makes him a celebrity, instead of just a drunk curiosity.

meindnoch|7 months ago

>Who are these people

Be glad you don't know anything about them. Seriously.

empiricus|7 months ago

16% is just a way of saying one in six chances

Xenoamorphous|7 months ago

Or just “twice as likely as the guy who said 8%”.

Workaccount2|7 months ago

One of the most worrying trends in AI has been how consistently the experts have overestimated timelines.

On the other hand, I think human hubris naturally makes us dramatically overestimate how special brains are.

UltraSane|7 months ago

Those percentages are completely meaningless. No better than astrology.

sailingparrot|7 months ago

Off topic, but am I the only one who gets triggered every time I see a rationalist quantify their prediction of the future with single-digit precision? It's like their magic way of trying to get everyone to forget that they reached their conclusion in a completely hand-wavy way, just like every other human being. But instead of saying "low confidence" or "high confidence" like the rest of us normies, they will tell you they think there is a 16.27% chance, because they really, really want you to be aware that they know Bayes' theorem.

tedsanders|7 months ago

Interestingly, this is actually a question that's been looked at empirically!

Take a look at this paper: https://scholar.harvard.edu/files/rzeckhauser/files/value_of...

They took high-precision forecasts from a forecasting tournament and rounded them to coarser buckets (nearest 5%, nearest 10%, nearest 33%), to see if the precision was actually conveying any real information. What they found is that if you rounded the forecasts of expert forecasters, Brier scores got consistently worse, suggesting that expert forecast precision at the 5% level is still conveying useful, if noisy, information. They also found that less expert forecasters took less of a hit from rounding their forecasts, which makes sense.

It's a really interesting paper, and they recommend that foreign policy analysts try to increase precision rather than retreating to lumpy buckets like "likely" or "unlikely".

Based on this, it seems totally reasonable for a rationalist to make guesses with single digit precision, and I don't think it's really worth criticizing.
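The paper's rounding test can be sketched on synthetic data (these are not the tournament forecasts; the simulated forecaster here is perfectly calibrated by construction):

```python
# Round each forecast to a coarser grid and check whether the mean Brier
# score degrades. Synthetic data: each event's true probability is uniform
# on [0, 1] and the forecaster reports it exactly.
import random

def brier(p, outcome):
    return (p - outcome) ** 2

def round_to(p, step):
    return round(p / step) * step

def rounded_brier(events, step):
    return sum(brier(round_to(p, step), y) for p, y in events) / len(events)

random.seed(0)
events = [(p, 1 if random.random() < p else 0)
          for p in [random.random() for _ in range(10_000)]]

for step in (0.01, 0.05, 0.10, 0.33):
    print(f"rounded to {step:.2f}: mean Brier {rounded_brier(events, step):.4f}")
```

For a calibrated forecaster, rounding adds roughly step²/12 of squared error on average, so the coarsest bucket (0.33) visibly hurts while rounding to 0.01 barely matters, which matches the paper's direction of effect.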

c1ccccc1|7 months ago

Would you also get triggered if you saw people make a bet at, say, $24 : $87 odds? Would you shout: "No! That's too precise, you should bet $20 : $90!"? For that matter, should all prices in the stock market be multiples of $1, (since, after all, fluctuations of greater than $1 are very common)?

If the variance (uncertainty) in a number is large, the correct thing to do is to also report the variance, not to round the mean to a whole number.

Also, in log odds, the difference between 5% and 10% is about the same as the difference between 40% and 60%. So using an intermediate value like 8% is less crazy than you'd think.
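A quick check of that log-odds claim:

```python
import math

def log_odds(p: float) -> float:
    """Convert a probability to log odds: log(p / (1 - p))."""
    return math.log(p / (1 - p))

print(log_odds(0.10) - log_odds(0.05))  # ≈ 0.75
print(log_odds(0.60) - log_odds(0.40))  # ≈ 0.81
```

The two gaps are indeed about the same size on the log-odds scale.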

People writing comments in their own little forum where they happen not to use sig-figs to communicate uncertainty is probably not a sinister attempt to convince "everyone" that their predictions are somehow scientific. For one thing, I doubt most people are dumb enough to be convinced by that, even if it were the goal. For another, the expected audience for these comments was not "everyone", it was specifically people who are likely to interpret those probabilities in a Bayesian way (i.e. as subjective probabilities).

danlitt|7 months ago

No, you are right, this hyper-numericalism is just astrology for nerds.

mewpmewp2|7 months ago

If you take it with a grain of salt, it's better than nothing. In life, the best way to express an opinion is sometimes to quantify it based on intuition. To make decisions, you can compile several experts' intuitive estimates and take the median or similar. In some cases it's more straightforward and rote: in the military, if you have to make distance-based decisions, you might ask 8 of your soldiers to each name the distance they think it is and take the median.
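The median-of-estimates idea in miniature (the guesses here are made-up numbers):

```python
# Pool several independent intuitive guesses and use the median as the
# working value; the median is robust to one or two wild outliers.
from statistics import median

guesses = [420, 380, 500, 450, 410, 600, 390, 440]  # eight distance guesses, meters
print(median(guesses))  # 430.0
```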

baxtr|7 months ago

No you’re definitely not the only one… 10% is ok, 5% maybe, 1% is useless.

And since we’re at it: why not give confidence intervals too?

meindnoch|7 months ago

>Off topic, but am I the only one getting triggered every time I see a rationalist

The rest of the sentence is not necessary. No, you're not the only one.

jere|7 months ago

You could look at 16% as roughly equivalent to a dice roll (1 in 6) or, you know, the odds you lose a round of Russian roulette. That's my charitable interpretation at least. Otherwise it does sound silly.

Veedrac|7 months ago

There is no honor in hiding behind euphemisms. Rationalists say ‘low confidence’ and ‘high confidence’ all the time, just not when they're making an actual bet and need to directly compare credences. And the 16.27% mockery is completely dishonest. They used less than a single significant figure.