rcheu|7 years ago
That said, I'm not sure I agree that it was winning mainly due to better decision making. For context, I've been ranked in the top 0.1% of players and beaten pros in Starcraft 2, and also work as a machine learning engineer.
The stalker micro in particular looked to be beyond what's physically possible, especially in the game against MaNa where they were fighting in many places at once on the map. Human players have attempted the mass stalker strategy against immortals before, but haven't been able to make it work. The decisions in these fights aren't "interesting"--human players know what they're supposed to do, but can't physically perform the actions to do it.
While the agents have similar APM to SC2 pros, their actions are probably far more efficient and accurate, so I don't think the APM limit alone is enough. For example, human players have difficulty macroing while they attack because it takes valuable time to switch context, but the AI didn't appear to suffer from that and was extremely aggressive in many games.
gamegoblin|7 years ago
I think a far more interesting limitation would be to cap APM at 150 or so, or to artificially limit action precision with some sort of virtual mouse that reduced accuracy as APM increased.
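To make the APM cap concrete, here is a minimal sketch of a sliding-window limiter. The `ApmLimiter` class and its method names are hypothetical, made up for illustration; this is not how AlphaStar or any real bot API enforces limits.

```python
from collections import deque


class ApmLimiter:
    """Hypothetical cap: at most `max_apm` actions in any 60-second window."""

    def __init__(self, max_apm: int = 150, window_s: float = 60.0):
        self.max_actions = max_apm
        self.window_s = window_s
        self._times = deque()  # timestamps of accepted actions, oldest first

    def try_act(self, now_s: float) -> bool:
        # Forget actions that have fallen out of the window.
        while self._times and now_s - self._times[0] >= self.window_s:
            self._times.popleft()
        if len(self._times) >= self.max_actions:
            return False  # over the cap: the agent has to wait
        self._times.append(now_s)
        return True
```

Note that a 60-second window still allows the whole budget to be spent in one burst during a fight, so a cap aimed at battle micro would probably need a much shorter window as well.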
wnevets|7 years ago
IIRC OpenAI limits the reaction time to ~200ms when playing Dota 2. AI employing better strategies than humans will always be more interesting than AI that can out-click humans.
a_wild_dandan|7 years ago
The skepticism in this thread is absolutely justified but I think it's important to note the lengths to which DeepMind has gone to address and assuage the fears of superhuman mechanical skills being employed in these games.
taneq|7 years ago
simmanian|7 years ago
I like the idea of having action noise that's linearly related to APM
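A rough sketch of what that could look like, assuming Gaussian click error whose standard deviation grows linearly with the current APM. The function names and the constants (`base_std_px`, `px_per_apm`) are invented for illustration and not tuned against real human data.

```python
import random


def click_noise_std(apm: float, base_std_px: float = 1.0, px_per_apm: float = 0.02) -> float:
    """Standard deviation of click error in pixels, linear in APM (assumed model)."""
    return base_std_px + px_per_apm * apm


def noisy_click(target: tuple, apm: float, rng: random.Random) -> tuple:
    """Perturb an intended (x, y) click with APM-dependent Gaussian noise."""
    std = click_noise_std(apm)
    x, y = target
    return (x + rng.gauss(0.0, std), y + rng.gauss(0.0, std))


# Usage: the same intended click gets less precise as the agent speeds up.
rng = random.Random(0)
slow = noisy_click((100.0, 100.0), apm=100, rng=rng)
fast = noisy_click((100.0, 100.0), apm=1000, rng=rng)
```

With these made-up constants the spread at 1000 APM is 21 px versus 3 px at 100 APM, so precise blink micro would cost the agent speed.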
pesmhey|7 years ago
sayno3|7 years ago
[deleted]
pmontra|7 years ago
cm2012|7 years ago
CydeWeys|7 years ago
The ceiling here is going to be incredibly high, much higher than the level of play that people are capable of, even when restricted to a single window.
Jyaif|7 years ago
Cookingboy|7 years ago
"The AlphaStar league was run for 14 days, using 16 TPUs for each agent. During training, each agent experienced up to 200 years of real-time StarCraft play."
MaNa probably played less than 2-3 years of Starcraft in his whole life (by that I mean 24hr x 365d x 3), and was learning with a much less focused/rigorous methodology.
derefr|7 years ago
Humans don't have to learn to process, recognize, and classify objects in visual sense-data, for example. We can do that from the moment we're born, because we already have hundreds of precisely-tuned "layers" lying around in our brains for doing just that. We just need to transfer-learn the relevant classes.
nopinsight|7 years ago
This gives them reserves when they are attacked and some workers are killed. They can also ramp up mining at a new base quickly by moving the extra workers there.
Apparently the benefits outweigh the costs for these workers for AlphaStar. It will be interesting to see if some pros decide to adopt the technique and if it improves human performance as well.
Disclaimer: I do not have much Starcraft experience.
jammygit|7 years ago
Let's say you make 4 extra at a cost of 200 minerals and then lose 4 workers to harassment. You are out 200 minerals in both cases, but the workers in the prebuilt case will mine an extra... 100 minerals? (40 + 30 + 20 + 10).
This doesn't take chronoboost into account though. I don't know, the gain is marginal, and the opportunity cost is having a smaller army (2 zealots for example)
Please correct my numbers if I've made a mistake; I forget build times and haven't played since HotS.
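The 40 + 30 + 20 + 10 estimate can be written out as a small sketch. It assumes the lost workers are rebuilt one at a time and that one worker mines about 10 minerals during one worker build period; that rate is implied by the numbers above rather than measured, and chronoboost is ignored.

```python
def extra_minerals_from_prebuilt(workers_lost: int, minerals_per_build_period: int = 10) -> int:
    """Minerals the prebuilt case gains while the other player rebuilds.

    Assumes workers are rebuilt sequentially, so the rebuilding player is
    short `workers_lost` workers for the first build period, one fewer for
    the next, and so on.
    """
    total = 0
    for missing in range(workers_lost, 0, -1):
        total += missing * minerals_per_build_period
    return total


# 4 workers lost: 40 + 30 + 20 + 10 = 100 minerals
print(extra_minerals_from_prebuilt(4))
```

So under these assumptions the 200 minerals spent up front buy back roughly 100 minerals of mining time after the harassment, which matches the "marginal gain" reading.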
TulliusCicero|7 years ago
proc0|7 years ago
rkagerer|7 years ago
I'd love to watch the results of constraining the AI so instead of seeing the whole map at once it has to pan around the same way a human would to get updated information on each battle. Counting those "info-gathering" window pans against the actions tally might yield slightly fairer APM metrics. (EDIT: Turns out they built a new agent for game 11 to do just that)
One of my biggest beefs with strategy games of this genre started around the time sprites went 3D and the player viewports got smaller (presumably to showcase all the cosmetic detail, and because it became harder to distinguish between visuals when zoomed out farther). I always feel too constrained in modern games, like I can't see enough of the map at once. In my opinion that "full size viewport" gives a multi-tasking edge to the engine that the player doesn't share (beyond the human cognitive overhead from context switching you already pointed out).
On the other hand, I find it fascinating that our AIs have become strong enough at our games that we're having to handicap them to avoid players crying foul that they're not fair.
fandango|7 years ago
sciyoshi|7 years ago
andreyk|7 years ago
hughzhang|7 years ago
kolinko|7 years ago
celeritascelery|7 years ago
freeflight|7 years ago
olliej|7 years ago
methodover|7 years ago
In StarCraft 2, the game IS the interface. That is to say, the developers have constructed the game in such a way as to be difficult to control; and human mastery of the interface is a large percentage of the game. Strategy in the game is important, of course -- but this is not chess, where human beings are not limited by the interface of the game. In StarCraft, you are intentionally given a limited interface to monitor and control a gigantic game while under incredibly tight time controls.
And I should also note that Blizzard is extremely reluctant to add features that make it easier to control the game. I have a friend who works on the StarCraft 2 team. We talked at length about this one feature that he designed and proposed for the team to make a specific aspect of the game friendlier towards players. It was turned down for exactly the reasoning above -- the game is the interface. Making the game easier to control disrupts the entire experience; a StarCraft 2 that is easier to control is no longer StarCraft 2.
notSupplied|7 years ago
I would say yes, because StarCraft was very clearly balanced for human players. We already saw some indication that when played with super-human micro, mass blink stalkers is a stronger strategy than when humans are in control. Without the active intervention of game balancing, RTS metas tend to devolve into "mass one or two units", which is what happened to every Command & Conquer game (and why SC is a respected eSport while C&C is not).
I suspect this will happen when you have agents playing with parameters that don't match what the game was balanced for. The strategic landscape will shrivel up and the game will cease to captivate us.
stared|7 years ago
sytelus|7 years ago
ygra|7 years ago
kibibu|7 years ago
Thaxll|7 years ago
knicholes|7 years ago
hughzhang|7 years ago
unknown|7 years ago
[deleted]
javier2|7 years ago
It was also extremely active with the stalkers, deciding to split them in three and not let MaNa cross the map with his immortals.
throwawaymath|7 years ago
What's that hireability like?
sidusknight|7 years ago
ajuc|7 years ago
Damn I really need to watch these games :)
throwaway415415|7 years ago
porky|7 years ago
pesmhey|7 years ago
cjbprime|7 years ago
ehsankia|7 years ago
Wasn't the APM closer to half that of the pros?
https://storage.googleapis.com/deepmind-live-cms/images/SCII...
arcticfox|7 years ago
During the fights, in the critical moments when MaNa would top out at ~600 humanly inaccurate APM (that is 10 inputs per second), the AI would jump to over 1000. We don't know exactly what it was doing, but it was presumably pixel-precise. Meanwhile the physical inertia of the mouse is a challenge for humans at that speed - imagine trying to click five totally different places with perfect precision in a single second.
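For scale, the APM figures convert to per-second rates like this. It is plain arithmetic; the helper names are made up for the example.

```python
def actions_per_second(apm: float) -> float:
    """Convert actions per minute to actions per second."""
    return apm / 60.0


def ms_between_actions(apm: float) -> float:
    """Average gap between consecutive actions, in milliseconds."""
    return 60_000.0 / apm


print(actions_per_second(600))   # 10.0 actions per second
print(ms_between_actions(600))   # 100.0 ms per action
print(ms_between_actions(1000))  # 60.0 ms per action
```

At 1000 APM that leaves about 60 ms per action on average, which is well under the ~200 ms human reaction time mentioned elsewhere in the thread.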
mactrey|7 years ago
biohazardpb4|7 years ago