habitue|5 years ago
This doesn't seem to be correct. The article talks about reinforcement learning agents optimizing a communication game to trade off color-description accuracy against effort.
The words they use aren't real words, they're partitions of the color space, and the researchers found that the partitions the agents came up with to win the game were similar to human partitioning of the color space.
Now, did the design of the game and the reward function smuggle in human notions of reasonableness that made the outcome a foregone conclusion? Maybe that's a more reasonable criticism, I don't know.
readflaggedcomm|5 years ago
The study relies on "colors" defined by human perception, which could be interpreted as a form of training, when all inputs are restricted to that definition.
The efficiency/complexity insight doesn't rely on that data, but the human-like output produced by human-like color data combined with communications limitations does rely on it, and that's what the article is all about.
ksm1717|5 years ago
AI intended to replicate human behavior replicates human behavior
Edit: It’s like domesticating wolves for the purpose of training them to run around an obstacle course at Westminster to showcase what it would be like if they were still wolves
zellyn|5 years ago
A quick read of the article leads me to believe that the AIs were inventing language tokens of their own. What makes you think it was trained on human data?
readflaggedcomm|5 years ago
Seems so, but that's less important than the title suggests. The paper is far beyond me (I can't even figure out what the "IB plane" is) but the key insight is how complexity tracks with communications efficiency when their models make their own communication methods between each other.
The article summarizing it underplays that part.
iaw|5 years ago
gnu8|5 years ago