
AlphaGo shows its true strength in 3rd victory against Lee Sedol

271 points | luu | 10 years ago | gogameguru.com

222 comments

[+] sdenton4|10 years ago|reply
It would be interesting to know AlphaGo's estimated probability of winning as the game progressed; presumably we can see directly how worried it was at any given point in the game. And thus get another sense of whether it ever really felt threatened by Sedol.

At this point, the match is won, but games 4 and 5 will commence. The question shifts from whether AlphaGo is better than humanity's best, to whether humanity can even have a chance of beating AlphaGo in a single game. And so far, it sounds like the answer is likely to be a resounding no.
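AlphaGo's value network does output a per-position win estimate, so this is plausible in principle. A toy sketch of logging that estimate after every move, with made-up stand-ins for the real internals:

```python
# Toy sketch (not AlphaGo's real API): log an engine's estimated win
# probability after every move, the way a value network would report it.

def track_win_probability(moves, evaluate):
    """Return the engine's estimated P(win) after each move in sequence."""
    history = []
    played = []                      # toy game state: just the moves so far
    for move in moves:
        played.append(move)
        history.append(evaluate(played))
    return history

# Dummy evaluator standing in for a value network: confidence grows
# with each move played, capped at 1.0.
def toy_eval(played):
    return min(0.5 + 0.1 * len(played), 1.0)
```

Plotting that history over a game would show exactly how "worried" the engine was at any point.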

[+] atrudeau|10 years ago|reply
I felt that Lee sometimes thought globally, especially when he had time on the clock, but his mind could only take so much thinking, and the clock was catching up with him, so for a great many moves he thought locally.

AlphaGo doesn't tire and thinks globally on every move.

We saw that time and time again. Lee playing locally and AlphaGo surprising everyone with an unexpected move somewhere else on the board.

Obviously it is better to be globally optimal than locally optimal.

[+] robotresearcher|10 years ago|reply
For reference, AlphaGo is not globally optimal. Go is much too big for that.

(edit: also there'd be no point playing a game against it!)

[+] kotach|10 years ago|reply
AlphaGo is approximating global optimality by finding local optimality. Local optimality is already computationally very hard, but that is exactly what AlphaGo is doing.

The rollouts it performs, evaluating every probable move, are a search process aimed at local optimality: you have the current state of the board and you are trying to find the decision that minimizes your future regret. That is by definition local optimality.

Global optimality would be finding a sequence of moves that wins you a game.
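Concretely, a rollout here just means playing random continuations to the end and averaging the results. A minimal toy sketch (nothing AlphaGo-specific; real MCTS adds a search tree and a selection policy on top of this):

```python
# Minimal Monte Carlo rollout sketch on a toy game: pick the move whose
# playouts from the current state win most often. The playout function,
# move generator, and move applier are all supplied by the caller.

def rollout_value(state, play_out_game, n=100):
    """Estimate P(win) from `state` by averaging n playouts to the end."""
    return sum(play_out_game(state) for _ in range(n)) / n

def best_move(state, legal_moves, apply_move, play_out_game, n=100):
    """Greedy one-ply choice: the move with the best rollout estimate."""
    return max(legal_moves(state),
               key=lambda m: rollout_value(apply_move(state, m),
                                           play_out_game, n))
```

This is "local" in exactly the sense above: it optimizes the next decision from the current state, not a whole winning sequence.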

[+] highwind|10 years ago|reply
Doesn't the principle of optimality state that if a solution is optimal, then each of its subproblems must be optimal as well?
[+] leeoniya|10 years ago|reply
> Obviously it is better to be globally optimal than locally optimal.

the implication of having machines dictate our behavior [WRT environment, each other] to optimize the survival of humans far into the future seems a bit closer to reality.

[+] partycoder|10 years ago|reply
Skeptics said this milestone would not be reached within the decade, within our lifetimes, or ever. It happened yesterday.

It took Lee Sedol decades of training to reach this level, and his ability to pass on his skills is limited. Now that a computer has reached it, the state of the neural network behind it can be serialized and run on an unlimited number of machines, yielding millions of systems more proficient at Go than the best player in the world.

People have said that reaching the cognitive level of the human brain requires matching its computational power. But if you take out all the parasympathetic and motor boilerplate, what is actually left for mental tasks is much less than that; most of that power is not even recruited for higher-level mental tasks. That lowers the bar for strong AI.

Then, a strong AI can be immortal, never physically deteriorating from age. It can multiply without limit, communicate at a rate equivalent to writing millions of books per second, transfer all its knowledge in seconds, and recursively improve itself. These advantages lower the bar for strong AI even further.
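The serialization point is concrete: a trained network's parameters are just data, so copying the player is copying a file. A toy sketch with made-up weights:

```python
import copy

# Toy "trained" network: once training is done, the parameters are
# plain data that can be saved, shipped, and duplicated at will.
weights = {"layer1": [0.12, -0.8], "layer2": [1.5]}

def clone_player(weights, n):
    """Stamp out n independent copies of the same trained player."""
    return [copy.deepcopy(weights) for _ in range(n)]
```

Each copy plays at exactly the strength of the original, which is the asymmetry with human expertise the comment describes.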

[+] andrewflnr|10 years ago|reply
What I found especially chilling was how the AI took its time when it thought it was ahead, and played exactly as aggressively as needed to win. It seems like we're seeing the full benefit of detaching logic from human emotional drives like pride, loss of nerve, etc. Which at first sight looks awesome, but in a way it's deeply terrifying, because if such a machine is ever given power in the real world, we would really like it to have some regard for human emotion. More than anything else I've seen, this makes me feel in my gut like I'm seeing my replacement.
[+] aprescott|10 years ago|reply
> the AI took its time when it thought it was ahead, and played exactly as aggressively as needed to win

Don't human professional players do exactly this? They care about playing the best move and winning. The difference is that AlphaGo is likely much more accurate at determining what "95% probability of winning" means in terms of gameplay. A human has a harder time judging the eventual outcome of a game, and so plays more "aggressively" (favouring point margins to account for variance in the estimate, etc.) than AlphaGo might.

[+] jtolmar|10 years ago|reply
On the perhaps more reassuring side, it's possible for there to be an AI that takes human emotion into account but does not have any of its own. Given power in the real world, it would do a better job of making people happy than a human with all their biases. It wouldn't assume that other people must like the things it likes, for example.

(Of course, making an AI that understands emotion isn't easy. And happiness by itself is an insufficient goal if we're building ourselves a benevolent overlord.)

[+] guelo|10 years ago|reply
I don't understand how this is considered fair. AlphaGo has been trained on a database that includes every recorded game Sedol has ever played while Sedol is seeing AlphaGo's play style for the first time. Sedol should have been allowed to play against AlphaGo for a few months before the match so he could study its style.
[+] clarle|10 years ago|reply
This is more than fair.

Go AIs weren't expected to reach this level for at least another 10 years.

Before AlphaGo, Zen and Crazy Stone (the previous Go AIs) could only play against top-level professionals with a significant 4-5 stone starting handicap, and this was less than 3 years ago. A 4-5 stone handicap is basically taking control of half the board before the game has even started.

It really shows how the neural network approach made a huge difference in such a short time.

[+] krackers|10 years ago|reply
One member of the deepmind team responded to this very question during the interview at the beginning of part 3.

He said that the training data set is much, much larger than the number of Lee Sedol games. They are like a drop in the ocean, not enough to significantly influence the resulting policy network.

[+] hyperpape|10 years ago|reply
I don't think this is unfair, but I think other people are replying to a suggestion that AlphaGo might be optimized to beat Lee that I don't see in the parent comment.

What does seem right is that whatever strategic insights are in Lee's play are reflected in current games--his and the younger generation who came up in his shadow. Whatever strategic novelties shape AlphaGo, they are totally new to Lee.

I don't think that would make the difference: 1) there's no trick to be learned, and 2) the same thing happens with human players to some extent. When Lee Changho appeared on the international Go scene, his style was misunderstood and underestimated, even though his games were public.

However, it is true that there is a real asymmetry--AlphaGo may not know Lee from other players, but it has had the opportunity to "study" the best games of the current players, and no one outside of DeepMind has had an opportunity to study its games.

[+] _yosefk|10 years ago|reply
As happened in chess, machines will beat humans at Go no matter how much knowledge of their inner workings is provided to the human. (I watched this story unfold in chess, along with the various hopes about how humans were still somehow better. $1K says Go is exactly the same story. You can't beat a machine in a formal universe with a defined goal.)
[+] asdfologist|10 years ago|reply
It's actually entirely possible that if the program were unsupervised, i.e. had to learn Go "from scratch" without relying on any human games, it would be even stronger than it is now.
[+] plank|10 years ago|reply
So what? I could watch every game Roger Federer ever played in any tennis tournament and still lose every set to love. It used to be said that computers were only better at combinatorics, and that where 'intuition' played a strong part there was still hope for us humans... Well, I guess I will have to start playing Calvinball...
[+] 21|10 years ago|reply
Lee Sedol didn't complain about this. In fact, even a few days before the match, Lee Sedol was predicting a 5-0 win for himself.

If someone is clearly better than you, you can study their style all you want; it won't make a difference. You probably wouldn't even understand their style.

[+] yeukhon|10 years ago|reply
I did not downvote you, FYI.

But I think a computer as a training buddy is indeed a good way to improve Lee's skill. I don't know how Lee feels right now; he's defeated, and there must be enormous pressure on him, but I think he will appreciate the challenge, because he needs more challenge! He has played against top players all over the world for the 20+ years since earning his 1-dan rank. Also, I think the computer plays against itself and tries many random moves. Unlike human players, one would expect a computer to play unconventionally, since it can look so far ahead at the probability of winning.

[+] eagsalazar2|10 years ago|reply
I'm curious if AlphaGo is simply winning by virtue of having more computational capability (and therefore can never be defeated consistently by humans) or if in its training it actually discovered and is now deploying interesting new tactics that humans, studying these games, can uncover and therefore use to defeat AlphaGo in the future.
[+] return0|10 years ago|reply
AlphaGo was first trained on human data only, to create an initial "predict the next move" network. Afterwards, they let the network play against itself to learn how to win. So it's quite possible that it has tactics of its own.
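A toy sketch of that two-stage recipe, with made-up names (the real pipeline uses gradient updates on deep networks, not table lookups):

```python
# Sketch of the two-stage recipe described above, at toy scale:
# stage 1 imitates human moves, stage 2 reinforces whatever wins.

def train_supervised(policy, human_games, update):
    """Stage 1: nudge the policy toward the human move in each position."""
    for position, human_move in human_games:
        update(policy, position, human_move, reward=1.0)
    return policy

def train_self_play(policy, play_game, update, episodes=10):
    """Stage 2: self-play; reinforce the winner's moves, punish the loser's."""
    for _ in range(episodes):
        trajectory, won = play_game(policy)       # list of (position, move)
        for position, move in trajectory:
            update(policy, position, move, reward=1.0 if won else -1.0)
    return policy
```

Stage 2 is where moves no human would play can emerge, since the only signal is winning.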
[+] hodwik|10 years ago|reply
Sedol said that it was doing tactics he had never seen before -- like the now infamous move 37 that caused Sedol to get up and walk out of the room.
[+] the_af|10 years ago|reply
I understand very little of Go beyond its basic rules and having played only a couple of matches with a friend who knows as little as I do...

I reviewed the match, and it seems that very quickly White (AlphaGo) went aggressive, and Black was trying to contain it. I suppose a human player can commit serious mistakes by playing too aggressively from the start, right? But the summary of the game speculates move 31 might have been the losing move (maybe implying the game until that moment wasn't going so bad for Lee Sedol?), while to my untrained eyes it looks as if White was constantly on the offensive from the start, and Black was playing almost exclusively to contain it.

Or am I reading this wrong?

[+] autarch|10 years ago|reply
I think that most high-level Go players (including AlphaGo) play moves that serve multiple purposes simultaneously. The ideal move reinforces your own position while forcing your opponent to defend their own.

White certainly played plenty of moves that required black to respond, but black also did the same. For example, the sequence beginning at 77 is black getting a foothold inside an area that white might've hoped to claim later. Then at move 115 black attempts an invasion into a very strong white area, ultimately failing to escape or live, leading to resignation.

That all said, I wouldn't say that either player was particularly aggressive. I've watched plenty of human games where one player or the other very intentionally picks an all-out fight. If you're really good at reading ahead locally and you think your opponent is not, this makes sense. This is especially true if your large-scale game play is not as good as your tactical play.

This is also common in high handicap games. If white can't slowly eke out an advantage early on, the only remaining tactic might be to go for a huge fight and see what happens.

[+] partycoder|10 years ago|reply
In game 3, when Lee Sedol played on the left side, a capture race started. AlphaGo let its dragon look vulnerable to capture; Lee Sedol attacked it, and AlphaGo defended just enough for it not to die, while building a huge territory at the bottom of the board. Once AlphaGo's dragon lived, Lee Sedol realized AlphaGo was ahead on score and needed to invade AlphaGo's largest territory at the bottom. The invasion was not feasible, and Lee Sedol fell back on a long-shot strategy involving kos. AlphaGo finally refuted that idea, leaving him with no feasible ko threats and forcing him to resign.
[+] mchahn|10 years ago|reply
It was fascinating that the author experienced a weird bit of nausea when thinking about the implications of AI. I'm not sure how I feel about it yet. It does scare me a bit.
[+] ktRolster|10 years ago|reply
I saw this tweet earlier:

In 13 years, computers progressed from beating Chess to mastering Go. I hope that in 13 more they will learn to not steal focus while I type

[+] agumonkey|10 years ago|reply
Think about it: we're constantly pushing toward our own replacement.
[+] raverbashing|10 years ago|reply
This is different from DeepBlue and Garry Kasparov

Chess may have a big search space, but it is "well behaved". Moves might be out of the ordinary, but not too much (weird moves that still help to win the match are rare)

AlphaGo might also have learned from past matches, but that doesn't give all the answers

The black-box aspect, the fact that you can't understand what it is thinking is curious, to say the least

[+] seanwilson|10 years ago|reply
> The black-box aspect, the fact that you can't understand what it is thinking is curious, to say the least

How true is this? Presumably you could ask AlphaGo to give you a list of the top 10 best moves it considered and perhaps, given a certain move, what it sees the board looking like 10 moves ahead.

If you're watching the commentary, one of the commentators (Michael Redmond) will try to explain the outcome of certain moves and this would seem to be equivalent although I imagine he could explain this in terms of the endgame better.
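Extracting such a list is straightforward in principle, since a policy network outputs a probability for each legal move. A toy sketch, with a plain dict standing in for the network's output:

```python
# Toy sketch: a policy network's output is a probability per legal move,
# so "the top 10 moves it considered" is just a sort over that output.
# `policy_probs` here is a made-up stand-in for a real network's output.

def top_moves(policy_probs, k=10):
    """Return the k highest-probability moves, best first."""
    return sorted(policy_probs, key=policy_probs.get, reverse=True)[:k]
```

What this doesn't give you is *why* those moves score highly, which is where the black-box concern actually bites.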

[+] asdfologist|10 years ago|reply
I don't know if that's true. "Computer moves", referring to weird moves that computers play that often turn out to be very good, tend to get mentioned a lot in chess. And you might hear about them less and less, because many of them have been absorbed into the human chess repertoire.
[+] smitherfield|10 years ago|reply
I just thought of an interesting (and somewhat more "real-world") task for AI research: Can an AI outperform a human at play-calling in [American] football?
[+] Harimwakairi|10 years ago|reply
Yahoo's fantasy football software creates millions of machine-written summaries of their fantasy football matches after each week of play. They're several paragraphs long and point out interesting things about the particulars of the teams' performance.

I would imagine that given the "All-22" camera footage from a football game, it would be possible to construct a piece of software to monitor what was going on (visually identify players by name/number, determine who has the ball, etc.) and then convert that model to a stream of English text.

Making it nuanced and entertaining, on the other hand, would be a challenge. :)

[+] cortesoft|10 years ago|reply
The problem is that it's REALLY hard to determine who "wins" at play calling.
[+] bsder|10 years ago|reply
Some of the coaches already do this. Their "card" includes plays that say "In situation X, the probabilities say to call play Y."

The problem is small sample size. You only have the plays as called in the games to rely on. You can't "replay" the game with a different set of plays and see what the result would be.

This was a problem even in Moneyball. It's just that On-Base Percentage has a really good correlation to runs and therefore wins.

[+] rcthompson|10 years ago|reply
Considering that plays are often called or changed by quarterbacks on the field right before the snap based on what the QB sees from the opposing team, I think it would be difficult to provide all that input to a computer AI.
[+] clickok|10 years ago|reply
How strong is AlphaGo?

DeepMind doesn't generally comment on these sorts of things, but I think they knew it would be strong, just not how strong.

It is hard to accurately gauge the playing strength of such a program for the following reasons:

* Go is not solved, it doesn't really even admit a good heuristic for close games. In chess, given enough time, we can do a pretty good job of searching the possible outcomes from a given position, and even if we can't evaluate all the possible endgames, evaluation functions exist that give us an idea of which side is ahead when we terminate the search. Go has a much higher branching factor (more moves available at each turn, making the search more expensive) and short of a catastrophic blunder, it's hard to quantify who is ahead at each point in time[1]. So we can't (in general) quantify optimal play, and therefore cannot quantify how much AlphaGo (or anyone else) deviates from it.

* One known aspect of programs using Monte Carlo Tree Search is that they play to win, and are willing to sacrifice margin of victory to maintain or increase their odds of winning. According to some people I've talked to, this can be suboptimal, but there are methods of addressing this[2]. Note that you can't just change the objective function from "winning" to "win by as much as possible" without potentially reducing AlphaGo's strength.

* The value function learned by a deep net is hard to interpret, partly because it encodes information about the possible futures arising from a position, and partly because it involves a tremendous number of calculations to compute. We do not know what it is representing at the intermediate levels-- techniques that aim to visualize or cluster unit activations can provide a bit of insight, but there's always the possibility that we're interpreting the patterns incorrectly because we're trying to fit it into the framework of "what would a human think?". Further, the representation is somewhat monolithic-- we can't tweak the value of one thing without changing the values of others. In chess engines, we might modify the material value of, say, a knight, without affecting the value of a rook. In a convolutional net, if we adjust the value of one position, it will tend to affect the value of many others.

* We can attempt to quantify its strength by comparing it to other programs, but AlphaGo has already crushed other programs (99% win rate), so all that it tells us is that it is stronger than those programs.

Essentially, without the ability to perform a significant amount of searching, or gauge strength via margin-of-victory, or to examine the program from other angles, it's hard to gauge just how good the program is.

The only thing we can do is throw skilful opponents at it, and see if it fails eventually[3]. In its current incarnation, though, it seems like its playing strength is just going to be "stronger than you".

---

1. Hence the need for Deep Reinforcement Learning-- we learn how valuable each position is based on the results of the positions that it can lead to.

2. By modifying the search to maintain a given margin once it has been attained-- but I do not work at DeepMind, nor am I an expert on MCTS like some of my colleagues, so I don't know if this could adversely affect AlphaGo's overall strength.

3. We might be able to get a better idea by having extremely strong players play against weaker versions of AlphaGo (ones with less computing power) and then sort of telescope upwards once a good baseline has been set, but it remains to be seen if the current level of invincibility is due to it not having been around long enough for its weaknesses to become apparent[4].

4. How would an AI researcher play against it? I've talked to a few people, and the answers have been: (a) play the game to the conclusion, don't resign; (b) attempt to take the game off-the-rails into parts of the state-space it hasn't really explored before (although this is risky because the human player is more likely to make mistakes in these cases as well), and (c) let it get ahead somewhat so that it "relaxes" (see the points about margin), allowing you to catch up and overtake it.
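A toy illustration of the margin-vs-win-probability point above (numbers made up): the two objectives can rank the same candidate moves differently.

```python
# Made-up numbers: (win probability, expected margin in points) per move.
candidates = {
    "safe_move":  (0.95, 1.5),    # nearly certain win, tiny margin
    "sharp_move": (0.70, 20.0),   # riskier, but crushing when it works
}

# MCTS-style objective: maximize the probability of winning at all.
by_win_prob = max(candidates, key=lambda m: candidates[m][0])

# Margin-maximizing objective: win by as much as possible.
by_margin = max(candidates, key=lambda m: candidates[m][1])
```

This is why margin of victory is a poor proxy for the strength of a program that was never optimizing for it.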

[+] mingodad|10 years ago|reply
Out of curiosity, how much electric power is AlphaGo burning while playing? Does anyone know?
[+] hutzlibu|10 years ago|reply
Much more than the "stupid" human used, I guess. So for now, we are at least much more energy-efficient...
[+] botw|10 years ago|reply
The actual match is largely affected by psychology at game time, and it is somewhat unfair to the human player. AlphaGo and its team know all about each human player and their preferences, but the human player knows little about AlphaGo. It is also the AlphaGo team who decides whom AlphaGo plays, not the other way round. Psychologically, the setup greatly favors AlphaGo right now. That is a difference from the match between IBM's Deep Blue and the chess world champion nearly 20 years ago. Any player (human or AI) should be allowed to enter the game, with or without a fee.
[+] naveen99|10 years ago|reply
Seems like just yesterday (actually seven months ago), I was saying dbn's aren't good at go yet in an argument: https://news.ycombinator.com/item?id=10028908

I guess it's time to play with dbn's again. At least I have accumulated a few more Titan X cards in the meantime.

[+] petegrif|10 years ago|reply
Holy crap. It's a total beatdown.
[+] 71817188|10 years ago|reply
Did Lee Sedol have a reasonable idea of AlphaGo's recent strength? To put it more harshly, did DeepMind misrepresent the true strength of the program?

I find it very hard to find information about this online.

[+] hyperpallium|10 years ago|reply
Huh, Go doesn't require intelligence after all.