erroneousfunk | 8 years ago
First, that the image corpuses used for machine learning have a strong gender bias, perhaps more than exists in the "real world." More images of men than women, more images of men working on computers, more images of women in kitchens, etc.
These images are sourced from http://imsitu.org/ which is sourced from http://image-net.org/, which (after some digging) looks like it gets most of its images from Flickr, stock photography sites, and random corporate sites. Are these representations of the "real world"? I would argue not. Professional photography, stock photography, and photos taken for use in an unknown future context, and/or to appeal to the widest audience, tend to err on the side of being "universally applicable," emphasizing the "common idea of a thing" rather than how the thing actually is, with all its variations. An image of a man in a kitchen may be perceived as more controversial and less universally usable than an image of a woman in a kitchen. So if you want to take a photo with as many potential uses as possible, you'd tend to fall back on established social norms MORE often than they actually occur.
Second, machine learning tends to emphasize small differences when it has nothing else to go on, or is improperly trained. If you have a dataset featuring people in kitchens where 75% of the time the person in a photo is a woman, you could get an algorithm that is 75% accurate simply by saying "the person in the photo is a woman" every single time. While the dataset reflects 75% women, the algorithm reflects 100% women. It emphasizes small differences in order to gain accuracy.
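The 75%-vs-100% point can be sketched in a few lines. This is a toy illustration (the labels and the 75/25 split are just the hypothetical numbers from the paragraph above, not real data): a "classifier" that ignores its input and always predicts the majority class gets 75% accuracy while predicting "woman" 100% of the time.

```python
# Hypothetical toy labels: 75% "woman", 25% "man", per the 75/25 split above.
labels = ["woman"] * 75 + ["man"] * 25

# A "classifier" that ignores the image entirely and always predicts the majority class.
predictions = ["woman" for _ in labels]

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"accuracy: {accuracy:.0%}")  # 75% accurate...
print(f"predicted 'woman': {predictions.count('woman') / len(predictions):.0%}")  # ...but 100% "woman"
```

Any real model that optimizes accuracy on such a dataset is pulled toward this degenerate baseline, which is exactly the "emphasizing small differences" failure mode.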
This isn't just hypothetical. More than once, I've worked on a categorization/labeling dataset that turned out to have no actual underlying pattern, yet after many hours I wound up with a best-fit algorithm that, say, predicted the dataset correctly 85.166667% of the time... only to realize that my dataset was spectacularly unbalanced and exactly (EXACTLY) 85.166667% of it fell into a single category. It's amazing how the model just snaps to that number once you start layering machine learning algorithms, and the real problem turns out to be that there's no real pattern in the data (something data scientists don't often like to admit).
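A cheap sanity check for this trap is to compare your model's accuracy against the majority-class frequency before celebrating. A minimal sketch (the `majority_baseline` helper and the 5110/6000 split are illustrative inventions, chosen only because 5110/6000 happens to be 85.166667%):

```python
from collections import Counter

def majority_baseline(labels):
    """Fraction of the dataset in the most common class -- the accuracy a
    constant predictor would achieve. Illustrative helper, not a real library call."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Toy labels echoing the scenario above: one class utterly dominates.
labels = ["A"] * 5110 + ["B"] * 890  # 5110 / 6000 = 85.166667%

print(f"majority baseline: {majority_baseline(labels):.6%}")
# If your model's accuracy lands exactly on this number, it may have learned nothing.
```

If the fitted model's accuracy matches this baseline to six decimal places, the most likely explanation is that it has collapsed to the constant predictor, not that it found a pattern.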
Third, sometimes the algorithms just get it wrong in ways that seem minor and rare from a data science perspective, but have large social consequences. Like improperly labeling a black couple as gorillas. It might actually be the case that the algorithm was improperly trained because it lacked photos of black people and photos of gorillas and didn't have much to go on (an example of the first issue) but I don't know enough about the situation to say for sure.
And fourth, of course, is that these patterns DO exist to some degree in the "real world," and this is a point that's been hammered on over and over again on Hacker News. The problem is that machine learning is a sort of big leveler that finds these patterns wherever it can and applies them universally (and often while emphasizing the differences for the reasons stated above). I mean, that's the point of it, after all! But knowing this fact, I think it makes sense to be careful where and how it's used.