killthebuddha | 1 year ago

One thing he said I think was a profound understatement, and that's that "more reasoning is more unpredictable". I think we should be thinking about reasoning as in some sense exactly the same thing as unpredictability. Or, more specifically, useful reasoning is by definition unpredictable. This framing is important when it comes to, e.g., alignment.

mike_hearn|1 year ago

Wouldn't it be the reverse? The word unreasonable is often used as a synonym for volatile, unpredictable, even dangerous. That's because "reason" is viewed as highly predictable. Two people who rationally reason from the same set of known facts would be expected to arrive at similar conclusions.

I think what Ilya is trying to get at here is more like: someone very smart can seem "unpredictable" to someone who is not smart, because the latter can't easily reason at the same speed or quality as the former. It's not that reason itself is unpredictable, it's that if you can reason quickly enough you might reach conclusions nobody saw coming in advance, even if they make sense.

killthebuddha|1 year ago

Your second paragraph is basically what I'm saying but with the extension that we only actually care about reasoning when we're in these kinds of asymmetric situations. But the asymmetry isn't about the other reasoner, it's about the problem. By definition we only have to reason through something if we can't predict (don't know) the answer.

I think it's important for us to all understand that if we build a machine to do valuable reasoning, we cannot know a priori what it will tell us or what it will do.

bflesch|1 year ago

They only arrive at the same conclusion if they both have the same goal.

One could be about maximising wealth while respecting other human beings, the other could be about maximising wealth without respecting other human beings.

Both could be presented with the same facts and be 100% logical, yet arrive at different conclusions.

liuliu|1 year ago

I think what many of the replies to you here are missing is that the word he uses is "unpredictable". It is not "surprising", "unverifiable" or "unreasonable".

"Prediction" in this particular talk is associated with "intuition": what a human can do in 0.1 seconds. And the most powerful reasoning model will, by definition, arrive at an "unintuitive" answer, because if the answer were intuitive, the model would arrive at it much sooner, without a long chain of "reasoning". (I also want to make the distinction that "reasoning" here is not the same as "proof" in the mathematical sense. In mathematics, an intuitive conclusion can require an extraordinary proof.)

billyzs|1 year ago

To me the chess AI example he used was perhaps not the most apt. Human players may not be able to reason over as long a horizon as an AI and therefore find some of the AI's moves perplexing, but they can be more or less sure that a chess AI is optimizing for the same goal under the same set of rules as they are. With Reasoners, alignment is not a given. They may be reasoning under an entirely different set of rules and cost functions. On more open-ended questions, when Reasoners produce something that humans don't understand, we can't easily say whether it's a stroke of genius or misaligned thoughts.

bondarchuk|1 year ago

Not necessarily true when you think about e.g. finding vs. verifying a solution (in terms of time complexity).
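To make the finding vs. verifying gap concrete, here is a minimal sketch using subset sum. It is only an illustration: the function names `find` and `verify` and the example numbers are made up for this sketch, and the search is plain brute force rather than anything clever.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Check a proposed subset. Cheap: polynomial in the input size."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False  # candidate uses a value that is not available
        pool.remove(x)    # each input number may be used at most once
    return sum(candidate) == target


def find(numbers, target):
    """Brute-force search. Expensive: up to 2**len(numbers) subsets."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None  # no subset sums to the target


if __name__ == "__main__":
    numbers = [3, 34, 4, 12, 5, 2]
    answer = find(numbers, 9)
    print(answer, verify(numbers, 9, answer))
```

Checking a handed-over answer takes a few steps, while producing one may mean walking through exponentially many candidates.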

killthebuddha|1 year ago

IMO verifying a solution is a great example of how reasoning is unpredictable. To say "I need to verify this solution" is to say "I do not know whether the solution is correct or not" or "I cannot predict whether the solution is correct or not without reasoning about it first".

bmitc|1 year ago

Are you sure that's what he was referring to? In other words, don't you think he meant that getting more reasoning out of models is an unpredictable process, rather than that reasoning itself is unpredictable?

narrator|1 year ago

Reasoning by analogy is more predictable because it is by definition more derivative of existing ideas. Reasoning from first principles, though, can create whole new intellectual worlds by replacing the underpinnings of ideas such that they grow in completely new directions.