I consider AlphaGo - The Movie [1] to be a timeless classic that will never feel outdated. In my opinion, it surpasses even Hollywood productions, despite being based on true events and filmed live with real people. I'm ranking it as #2, though, because I still believe Steve Jobs' 2007 iPhone presentation [2] is the greatest live tech event ever captured on film. Hearing the crowd scream at the tricks and techniques for using the phone (e.g. slide to unlock, pinch to zoom, and flick scrolling) really does trigger some haptic feedback in my heart, because we are now so used to these interactions that were pure magic back then.
Agreed, I've watched the AlphaGo movie, I think, three times by now. It elicits strong emotions in me, which I very rarely get from a movie or story.
I think it's due to the subject matter and people being very relatable to me. And it's real, filmed while it happened, instead of some made-up or retold story.
Well, Bridge remains unconquered, although it is unclear whether it is because of disinterest or incapability. As I have highlighted before, the day a computer false-cards will be the day.
(False-carding - playing a certain card with the primary intention of deceiving the opponents and forcing an error)
Does Bridge deal cards from a randomized deck? Because that's most likely the issue. I'm facing similar problems trying to build something that plays Magic: The Gathering-like games reasonably competently. The combinatorial explosion, plus dealing with bluffing and hidden information, is a really tough nut to crack. My current guess is that you need something like Monte Carlo reinforcement learning to do it.
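One common workaround for hidden information (a sketch only; the function names here are illustrative, not from any real engine): sample many "determinizations", i.e. random deals of the unseen cards consistent with what you can observe, evaluate each as a perfect-information position, and average.

```python
import random

def determinized_eval(my_hand, seen, full_deck, evaluate, n_samples=200):
    """Estimate a line of play under hidden information by sampling
    'determinizations': random deals of the cards we cannot see."""
    unseen = [c for c in full_deck if c not in seen and c not in my_hand]
    total = 0.0
    for _ in range(n_samples):
        deal = unseen[:]
        random.shuffle(deal)              # one possible world
        total += evaluate(my_hand, deal)  # perfect-information evaluation
    return total / n_samples
```

The known weakness of this trick is exactly the bluffing case: each sampled world is played with open cards, so the search never sees value in deception itself.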
Forcing an error is an especially hard case because in machine-vs-machine matches both sides would be aware that something could force an error and therefore wouldn't fall for it.
There are a lot of parallels between rule-based games like Go and rule-based formal systems like ZFC. It’s interesting that the same techniques used for AlphaGo have not worked nearly as well for finding proofs of famous open problems that we suspect are both 1) decidable within ZFC and 2) have a “reasonable” minimal proof length.
What aspect of efficiently exploring the combinatorial explosion in possibilities of iterated rule-based systems is the human brain still currently doing much better than machines?
I happen to have recently written up a longer history of Go AI. If you're wondering about what is special about Go in particular or what generalizes to other problems, give it a read.
Coincidentally, I just watched the hour long documentary that DeepMind made about the match [1]. It talks a lot about the two moves - though not really in detail.
To a non-go player like myself, both moves 37 and 78 seemed completely arbitrary. I mean, much of the video talks about how it's impossible to calculate all the future moves like in chess, yet move 37 of a possible ~300 move game is called out as genius, and move 78 is a God Hand.
For the layman like myself, it seemed a bit inconsistent.
The thing that made me smile was how history repeated itself. Sedol predicted a 5-0 win against the program. Kasparov was pretty cocky as well in the 1990s. You'd think someone would have warned him! "Hey Sedol. Cool your jets, these guys wouldn't be spending so much money just to embarrass themselves."
DeepMind was definitely way more polite than IBM, so that was good to see. The Deep Blue team were sorta jerks to Garry.
> I mean, much of the video talks about how it's impossible to calculate all the future moves like in chess, yet move 37 of a possible ~300 move game is called out as genius, and move 78 is a God Hand.
Every move is a choice among ~300 possibilities, and you need to calculate far ahead to know whether it's a good move or not, so the number of positions you have to explore is far greater than it seems.
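A back-of-envelope illustration (using the usual textbook averages of ~35 legal moves per turn in chess and ~250 in Go; exact figures vary by position):

```python
# The tree of possibilities grows as branching ** depth, so Go's larger
# branching factor compounds into an astronomically bigger search space.
def tree_size(branching, depth):
    return branching ** depth

print(tree_size(35, 10))    # chess, 10 plies: ~2.8e15
print(tree_size(250, 10))   # Go, 10 plies:    ~9.5e23
```

Ten plies in, the Go tree is already hundreds of millions of times larger, which is why brute-force lookahead alone was never going to work.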
If you think of the Go board as a battlefield and the stones as troops, you may get a sense of it. You're trying to secure large areas of the board. Do you spread out your forces at the risk of spreading them too thinly, or do you build solid walls with them at the risk of definitively securing only a small area?
In between these two extremes is the dance where the elegance happens. Large, seemingly secure areas get split into two. Multiple, separate battles grow and merge into larger ones. A single, well placed stone earlier in the game could prove pivotal as a battle creeps towards it.
I don't know why you'd expect to be able to read the board as a non-player. If I watch a surgeon at work a lot of the individual motions are inscrutable to me. It's like looking at sheet music if you don't play an instrument. You just don't have the mental schema to see what is interesting about even the most interesting parts.
I was watching this game with my go club and we all instantly saw the significance of 37, it was audible in the room. 78 felt tangibly different, some of us immediately read it as a clear misplay, some were taking longer to come to any conclusion, just puzzled. Our most experienced player, at the time 5 dan, gasped when he got it. But it still took him time to even intuit what it was doing. Now that it is well understood, moves of that type are common even in intermediate level play. Changed the game forever.
Interesting that Lee Sedol losing at Go was the big opening act in the modern AI wave, but it ended up coming from a completely different technology that has effectively faded into the background.
They used deep neural networks, reinforcement learning, and Monte Carlo tree search. All except the MCTS are critical components of modern LLMs. MCTS is a form of planning which you can argue has parallels to "reasoning" models, although that's pretty tenuous I admit.
Modern post-training uses RL and immense amounts of synthetic data to iteratively bootstrap better performance. If you squint, this is extremely similar to the AlphaZero approach of iteratively training with RL on data generated through self-play.
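The loop being described can be sketched in a few lines (schematic only; `play_game` and `train` are stand-ins, not real APIs):

```python
def self_play_iteration(policy, play_game, train, n_games=1000):
    """One round of the bootstrap loop: generate games with the current
    policy, then fit an improved policy to that data.
    play_game(policy) -> list of (state, move, outcome) examples.
    train(policy, data) -> improved policy."""
    data = []
    for _ in range(n_games):
        data.extend(play_game(policy))  # self-play generates training data
    return train(policy, data)          # learn from your own games

# Repeating this loop is the AlphaZero recipe; the LLM analogue swaps
# self-play games for sampled model outputs scored against a reward signal.
```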
redbell|10 months ago
______________________________
1. https://www.youtube.com/watch?v=WXuK6gekU1Y
2. https://www.youtube.com/watch?v=VQKMoT-6XSg
gsf_emergency|10 months ago
https://xcancel.com/polynoamial?lang=en
https://arxiv.org/abs/2301.09159
(To be fair, re (card) games: I'm also only interested in seeing Cyborg-on-Cyborg action. Lee vs a-G almost qualified :)
tmtvl|10 months ago
They're ordered by date from newest to oldest, so it's the 3rd and 4th games v Lee Sedol from the top down.
mastazi|10 months ago
If by that you mean reinforcement learning, that's not the case; e.g. see https://arxiv.org/abs/2501.12948