Sorry, but this is extremely shoddy work. M:tG will never be "solved" like
this.
>> Using data from drafts carried out by humans, I trained a neural network to
predict what card the humans would take out of each pack. It reached 60%
accuracy at this task.
Going by what's in the linked notebook, the model was evaluated on its ability
to match the decks in its training set card-for-card.
Without any attempt to represent game semantics in the model, the fact that
the model sometimes "predicts" different picks than the actual picks in the
dataset tells us nothing. It probably means the model has some variance that
causes it to make "mistakes" in its attempt to exactly reproduce its dataset.
It certainly doesn't say that the model can draft a good M:tG deck, let alone
in any set other than Guilds of Ravnica.
>> The model definitely understands the concept of color. In MTG there are 5
colors, and any given draft deck will likely only play cards from 2 or 3 of
those colors. So if you’ve already taken a blue card, you should be more
likely to take blue cards in future picks. We didn’t tell the model about
this, and we also didn’t tell it which cards were which color. But it learned
anyway, by observing which cards were often drafted in combination with each
other.
This is a breathtakingly brash misinterpretation of the evidence. The model's
representation of an M:tG card is its index in the Guilds of Ravnica card set.
It has no representation of any card characteristic, including colour. If it
had learned to represent "the concept of colour" in M:tG in this way, it
wouldn't be a neural net, it would be a magick spell.
The author suggests that the model "understands" colour because it drafts
decks of specific colours. Well, its dataset consists of decks with cards of
specific colours. It learned to reproduce those decks. It didn't learn
anything about why those decks pick particular cards, or what particular
cards are. All it has is a list of numbers that it has to learn to put
together in specific ways.
This is as far from "understanding the concept of colour", or anything, as can
be.
There are many more "holes" in the article's logic, which just go to show that
you can train a neural net, but you can't do much with it unless you
understand what you're doing.
Apologies to the author for the harsh critique, if he's reading this.
>> Using data from drafts carried out by humans, I trained a neural network to predict what card the humans would take out of each pack. It reached 60% accuracy at this task. And in the 40% when the human and the model differ, the model’s predicted choice is often better than what the human picked.
How the model's pick is "better than what the human picked" is never made clear, but since accuracy is measured by the model's ability to match its training set, I assume that's also what is meant by "better": the model was better than a human at memorising and reproducing the decks it saw during training.
Well, you'd never evaluate a human's deckbuilding skills by how well they can reproduce a deck they've seen before. Given the same deck archetype, 10 humans will probably make 10 different card choices, for reasons of their own. It's like trying to evaluate how people style their hair by measuring how similar their hair looks to some examples of particular hair styles. It's a concrete measure, but it's also entirely meaningless.
This effort really suffers in terms of evaluation, and so we have learned nothing about how good the model is, which is a shame.
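For concreteness, the "accuracy" in question is presumably just top-1 agreement with the human's pick, which would be computed roughly like this (my guess at the metric, not the notebook's actual code):

```python
# Hypothetical sketch of the agreement metric under discussion: the fraction
# of picks where the model's top choice matches the human's actual pick.
import numpy as np

human_picks = np.array([3, 7, 7, 1, 4, 9, 2, 2, 5, 0])  # made-up pick indices
model_picks = np.array([3, 7, 6, 1, 4, 9, 2, 8, 5, 0])

accuracy = (model_picks == human_picks).mean()
print(accuracy)  # 0.8
```

A model can score highly on this while drafting terribly; agreement with humans and deck quality are different axes, which is the point above.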
> The author suggests that the model "understands" colour because it drafts decks of specific colours. Well, its dataset consists of decks with cards of specific colours. It learned to reproduce those decks. It didn't learn anything about why those decks pick particular cards, or what particular cards are. All it has is a list of numbers that it has to learn to put together in specific ways.
> This is as far from "understanding the concept of colour", or anything, as can be.
It is very arguably bad feature engineering - if you have the information readily available, don't make the network infer it - but I think the description is fair.
Word2vec uses a similar model. It starts out knowing nothing about each word except an arbitrary numeric index, and learns everything else by predicting words that appear next to each other. By the end of the training it clearly has internal representations of concepts like "color", "verb", "gender", etc.
The same concept should apply here - by observing what cards are used in similar decks, with enough training data it should eventually associate concepts like card type, color and mana costs to each card.
In this case there isn't enough training data for that kind of resolution, but it has learned that blue cards go with blue cards and red cards with red cards, and there is no hard line between that and the concept of color.
Sure this isn't going to "solve" MtG, and I don't think it is a particularly good approach for the problem statement, but I think the idea is workable, and the network could already contain a proto-concept of "color" that would be refined with more training.
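A toy version of that claim, with synthetic decks and cards that are nothing but indices (the "colors" exist only in the data-generating process and are never shown to the learner; co-occurrence counts stand in for learned embeddings):

```python
# Cards are arbitrary indices 0-9; cards 0-4 are "blue" and 5-9 are "red",
# but only the deck generator knows that. Similarity of co-occurrence rows
# recovers the color split without any color feature.
import numpy as np

rng = np.random.default_rng(0)
decks = []
for _ in range(200):
    color = rng.integers(2)            # each synthetic deck drafts one color
    pool = np.arange(5) + 5 * color
    decks.append(rng.choice(pool, size=4, replace=False))

cooc = np.zeros((10, 10))
for deck in decks:
    for a in deck:
        for b in deck:
            if a != b:
                cooc[a, b] += 1

rows = cooc / np.linalg.norm(cooc, axis=1, keepdims=True)
sim = rows @ rows.T                    # cosine similarity between cards

within = sim[0, 1]                     # two "blue" cards
across = sim[0, 5]                     # a "blue" and a "red" card
print(within > across)                 # True: same-color cards cluster
```

This is the word2vec story in miniature: structure in the co-occurrence data is enough, no explicit features required.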
This is a lot of anger over a choice of vocabulary. The author certainly didn’t mean that the model had a deep understanding of the nuances of the concept of color - just that it had identified clusters that in real life correspond to the color of cards.
Interesting that there is no mention of the new Arena MTG game and the fact that there you draft against bots, as opposed to humans as on Magic the Gathering Online. With each new set that has come out, they have had to adjust the bots because they would allow a player to always draft a particularly powerful deck. Ryan Spain, who was involved in the development of Arena, said on the Limited Resources podcast that the bots are essentially trained to make a few first picks and then "stay in their lanes". He said that each bot has a preference and attempts to fulfill it. In contrast, human players show more variance in their draft picks.
There is also more nuance in draft. For example, if you see that a particular color is contested (i.e. other people are drafting it), you might grab a particularly powerful card in that color just to prevent them from obtaining it.
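The "stay in your lane" behaviour described above is simple enough to sketch; the card names, ratings, and bias factor below are all invented for illustration:

```python
# A minimal lane-sticking draft bot: pick the highest-rated card, but weight
# cards heavily toward colors the bot has already taken.
def lane_bot_pick(pack, taken, bias=3.0):
    """pack and taken are lists of (name, color, rating) tuples."""
    my_colors = {color for _, color, _ in taken}
    def score(card):
        _, color, rating = card
        return rating * (bias if color in my_colors else 1.0)
    return max(pack, key=score)

taken = [("Vedalken Mesmerist", "U", 2.1)]
pack = [("Goblin Cratermaker", "R", 3.5), ("Dimir Informant", "U", 1.8)]
print(lane_bot_pick(pack, taken))  # stays in blue despite the red card's higher rating
```

A bot like this is predictable by design, which is exactly why players learn to exploit it.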
This is cool stuff. Have you checked out mtgtop8.com? Some code for it from back in Kaladesh: https://github.com/ivodopyanov/MTGML. I asked the dev a couple of years ago and he said he had a newer version, but it wasn't on github.
Thinking about your jellybean guessing (wisdom of the crowds) analogy, I don't want an unweighted average of a Ben Stark and a random player. Years ago mtgo elo ratings were public - seems like that could be useful. With something like the deckmaster twitch extension you could (theoretically) grab drafts from stream replays of great players. Especially with the MPL players required to stream there must be some great data on video.
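The rating-weighted version of that crowd average could look something like this; the names, ratings, and weighting scheme are all made up:

```python
# Weight each drafter's vote by an Elo-style rating instead of counting
# everyone equally, so one pro's pick outweighs several random players.
import numpy as np

picks   = ["Card A", "Card B", "Card A", "Card C"]
ratings = np.array([2100.0, 1500.0, 1900.0, 1450.0])  # hypothetical Elo

weights = np.exp((ratings - ratings.mean()) / 400)    # Elo-scaled weights
scores = {}
for pick, w in zip(picks, weights):
    scores[pick] = scores.get(pick, 0.0) + w

best = max(scores, key=scores.get)
print(best)  # Card A, carried by the two highest-rated drafters
```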
Given the information that this AI now has and the ability it's displayed, should WotC/Hasbro consider actively banning players from using AI to build standard-format Constructed decks? After all, if the author actually is correct and draftbot here picks better than humans, wouldn't it stand to reason that it could build optimal decks if given a "draft" which included multiple iterations of the program working in tandem over an entire set?
I'm not super well versed in MtG (I've played, but never competitively), but a similar thing is common in Dominion. Any discussion about strategy in a particular game is usually settled by running a simulator pitting two strategies against each other a few thousand times. Eventually, these simulators have made their way into play testing potential new cards.
At the end of the day, these are games sufficiently complex that no single strategy will be optimal. Dominion has something on the order of 10^15 possible game configurations. Simulations are just a tool available to prepare and test potential strategies. I suspect MtG would land in a similar space if an "optimal deck builder" surfaced. You still have to play the game and beat your opponent.
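Those Dominion simulators are essentially Monte Carlo comparisons like the sketch below, minus a real rules engine; the two "strategies" here are purely illustrative:

```python
# Pit two toy strategies against each other many times and compare win rates:
# a steady point engine versus a swingy one, racing to a fixed point target.
import random

def steady(rng):          # small but consistent gains each turn
    return rng.uniform(2, 4)

def swingy(rng):          # explosive some turns, dead on others
    return rng.uniform(0, 8)

def play(strat_a, strat_b, rng, target=30):
    a = b = 0.0
    while True:
        a += strat_a(rng)
        if a >= target:
            return "A"
        b += strat_b(rng)
        if b >= target:
            return "B"

rng = random.Random(42)
wins = sum(play(steady, swingy, rng) == "A" for _ in range(5000))
print(wins / 5000)  # steady's win rate over 5000 simulated games
```

Replace the toy strategies with a rules engine and real buy/pick policies and you have the simulators being described.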
Constructed players have already (for a very long time) used the collective data of other players; this practice has many names but is most commonly called “net-decking”. It turns out that the best players and the best deck builders have a lot of overlap, though not complete, and it’s common for multiple high-ranking players to play the same decks. A key skill to winning tournaments is figuring out which decks the other top players are going to play, and how to give yourself the best chance against them.
In other words, MtG is a game where deck-construction and deck-playing are both important and distinct skills, and high-level tournaments generally assume a high level of both, so a helper bot for the former is unlikely to give an advantage (because it’s compared to “the best efforts of a lot of people over a lot of time”).
The author evaluated his model's "accuracy" by looking at how good it was at reproducing the decks in its training set card-for-card.
That is no way to evaluate the quality of an M:tG deck. For instance, it can never tell us anything about "sleeper" decks, or about the value to an existing deck of new cards that are added to a format as sets rotate and so on.
All that the accuracy metric used by the author can do is tell us how good the model is at representing the past. I am of the firm belief that WotC will be laughing into their tea cups at the thought of banning something as pointless as this. In M:tG the past is about as valuable as a hat made of ice in the tropics.
Edit: For some added context. The way M:tG metagames work is that at the start of a season, there are some "decks to beat" that are usually the most obvious ones in the format. As the format progresses, players often find strategies to beat the decks to beat, initially known as "rogue" or sleepers. These can't be predicted by representing the current decks to beat. Some level of understanding of the game and what's important in a format in terms of strategy etc, is required.
Famous example: "The Solution" by Zvi Mowshowitz [1], which dominated Pro Tour Tokyo 2001, an Invasion Block Constructed event. Mowshowitz noticed that the dominant aggro decks' clocks (sources of damage) were predominantly red, so he stuffed a deck with anti-red cards, shutting down those aggro decks.
That requires way, way more than modelling the current metagame at any point in time.
> This seems to me to be similar to poker, and considered a game of luck
It's like half-luck, half-skill. Deck construction is largely skill-based (both directly and metatextually), but decks are randomized during shuffling, which is obviously luck-based.
When they say “Constructed” is expensive because you have to acquire the cards ahead of time... why don’t tournaments allow you to play with cards where you’ve just scribbled with Sharpie whatever card you want it to act as?
YeGoblynQueenne|7 years ago
YeGoblynQueenne|7 years ago
tveita|7 years ago
itsdrewmiller|7 years ago
j_m_b|7 years ago
_raoulcousins|7 years ago
zackwitten|7 years ago
_raoulcousins|7 years ago
Endy|7 years ago
phamilton|7 years ago
c256|7 years ago
YeGoblynQueenne|7 years ago
_____________________
[1] http://www.starcitygames.com/magic/misc/22483_Innovations_Th...
yazr|7 years ago
This seems to me to be similar to poker, which is considered a game of luck. But I am not familiar enough to say.
I also found this https://boardgamegeek.com/article/29969194#29969194
maxsilver|7 years ago
Yes, routinely: https://magic.wizards.com/en/articles/mythic-invitational (although the prize money can vary wildly, and most events are nowhere near that high).
duado|7 years ago
praptak|7 years ago
ThePadawan|7 years ago
zokier|7 years ago
https://mtg.gamepedia.com/Proxy
But of course the whole concept of MTG is that it's a collectible card game; buying and owning cards is a core part of it.
maxsilver|7 years ago
It's like saying, "why do movie theaters pay so much for films? Why don't they just download the film via BitTorrent like everyone else?"
BentFranklin|7 years ago
InsertHome|7 years ago
bunkydoo|7 years ago
[deleted]