Accurate, automatic descriptions of snapshots of people's lives? This is surely sending shock waves down the data mining community. Additionally, Facebook are saying this was built as an aid for blind people, but this is surely just a cover for being able to take targeted ads to the next level.
> This is surely sending shock waves down the data mining community.
Not really. From a technical point of view there is nothing impressive about this, except that they spent the time and money to collect all the training data. Also, the fact that it says "this image may contain:" tells you something about how accurate this actually is.
If that were the case, wouldn't they choose to simply not announce anything?
I'm not ruling out that they might use it for targeting eventually, but if this was solely done as a cover it would be the equivalent of a terrorist entering an airport shouting out "My suitcase is just really heavy, I do not have a bomb in it at all!"
This has been possible for, at a minimum, three years. There's an effort gap between building an ad targeting version and building a blind enabling version.
The only new thing here is that Facebook released it to the public.
I always find Facebook's example feed to be funny since it's a completely unrealistic depiction of what their site actually is for most users.
Just quickly looking at the top few posts in my feed, I see someone celebrating their two-year friendship with someone I don't know, one person sharing a link to a new airplane, four people sharing videos, one person liking a sponsored video, and finally one person updating their profile picture.
I wish I could see actual updates from people instead of being kept abreast as to what piece of third party content they've liked at some point in time, what third party content they're sharing, etc.
How is this unrealistic? This is exactly the kind of stuff my feed shows. The people you follow probably aren't posting any updates (that you like to see)... and sharing content is an update, that's what that person decided to post.
> I wish I could see actual updates from people instead of being kept abreast as to what piece of third party content they've liked at some point in time, what third party content they're sharing, etc.
This nicely captures the problem. Facebook (and probably most social media systems - I'm looking at you LinkedIn) are primarily interested in keeping you up to date with Facebook via the medium that is your human relationships. So Facebook wants you to know what your friends did on Facebook today so that you might do that same Facebook thing. This increases Facebook interactions which in turn become more information to propagate to others on Facebook. In the limit there is no need for Facebook because all anyone is ever doing is Facebook.
Last July, I led a project in support of a federal agency to analyze current business processes and identify weaknesses in the agency's Section 508 office. My work focused primarily on externally accessible internet sites, and one of the most common 508 violations that we encountered was the lack of ALT-text on images. This agency utilized a number of automated scanning tools and processes, but lacked any ability to efficiently remediate these errors. While we never talked about it beyond a conceptual level, a coworker and I discussed something along the lines of what Facebook has accomplished here through the use of Google's Neural Networks. Very cool to see this advancement come to life.
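For anyone curious what the scanning half of that looks like, here is a minimal sketch of a missing-ALT check using only the Python standard library. The class and function names are made up for illustration, and this is not how any particular 508 scanning tool works; it only flags `<img>` tags that have no `alt` attribute at all.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An empty alt="" is legitimate for decorative images, so only a
        # completely absent attribute is flagged here.
        if "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no src)"))


def find_images_missing_alt(html):
    """Return the src values of images in `html` that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


page = '<p><img src="a.png" alt="Logo"><img src="b.png"><img src="c.png" alt=""></p>'
print(find_images_missing_alt(page))
```

Finding the violations is the easy part. Remediation, meaning writing a useful description for each flagged image, is where automatic captioning would come in.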
This kind of system/algorithm also makes it possible to assign a certain semantic component to images (with a grain of salt, of course), which might enable further developments that weren't considered possible yet.
Sadly, it also brings another complete set of cases to the oh-so-annoying "but Facebook/Google/Twitter/Amazon does it!" clichés that we'll now have to deal with...
I'm waiting for the day this is just a library you pass an image to and it returns an array. No, not a SaaS. Then on my own pump.io, diaspora, redMatrix etc. it just works. My data, my images, my network. I'm not against the tech at all though. Neat
There are already pre-trained networks out there. TensorFlow comes with an example command line tool that you can pass any image and it will tell you what is in the image.
The classes that it can detect are from ImageNet, so that might be limiting.
This might be asking for too much, but why not use more of the image metadata instead of relying only on these computer vision techniques?
If I were blind, I really wouldn't care that this is an image of "two people, smiling". Facebook has facial recognition, tagging, and locations. It would be much more valuable to me to say "Peter and Laura smiling at Channel Islands State Park."
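A rough sketch of that idea, assuming the tagged names and the location are already available as plain strings. The function and parameter names are hypothetical and nothing here reflects how Facebook actually builds its descriptions; it just prefers tags and place over generic labels and falls back to the hedged "may contain" wording otherwise.

```python
def join_names(names):
    """Join names as 'A', 'A and B', or 'A, B and C'."""
    names = list(names)
    if len(names) <= 1:
        return "".join(names)
    return ", ".join(names[:-1]) + " and " + names[-1]


def describe_photo(detected, tagged=(), place=None):
    """Build alt text, preferring tags and location over generic labels.

    detected: generic classifier labels, e.g. ["two people", "smiling"]
    tagged:   names of tagged people (hypothetical metadata field)
    place:    location name (hypothetical metadata field)
    """
    if not tagged:
        # No tags available: fall back to the hedged generic wording.
        return "Image may contain: " + ", ".join(detected) + "."
    text = join_names(tagged)
    # Names already say who is in the picture, so drop head-count labels.
    extras = [label for label in detected
              if "people" not in label and "person" not in label]
    if extras:
        text += " " + ", ".join(extras)
    if place:
        text += " at " + place
    return text + "."


print(describe_photo(["two people", "smiling"],
                     tagged=["Peter", "Laura"],
                     place="Channel Islands State Park"))
```

With the tags and place supplied this prints "Peter and Laura smiling at Channel Islands State Park."; without them it falls back to "Image may contain: two people, smiling."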
I'd like to compare Facebook's image tagging with the Google Cloud Vision API (https://cloud.google.com/vision/). I think it would be interesting to see which one is more accurate or verbose.
I suppose it'll be like YouTube's automatic subtitles for audio. It'll do a bad, but passable, job: at least the blind and visually impaired have some idea of what the image contains.
The techniques already exist[0] for some sophisticated trolling, though it may be hard to achieve in practice without direct access to the classifier being used.
Ah, the Twitter app for Android (beta version) recently added a feature that lets you add a description to pictures you upload, for visually impaired people.
[+] [-] hacker_9|10 years ago|reply
[+] [-] ma2rten|10 years ago|reply
[+] [-] coroutines|10 years ago|reply
[+] [-] hanspeter|10 years ago|reply
[+] [-] zappo2938|10 years ago|reply
[+] [-] leoalves|10 years ago|reply
The problem I see here is that now they are giving this data to spammers.
[+] [-] DonHopkins|10 years ago|reply
[+] [-] iaw|10 years ago|reply
[+] [-] speedyapoc|10 years ago|reply
[+] [-] manigandham|10 years ago|reply
Your feed is what you make it.
[+] [-] Toenex|10 years ago|reply
[+] [-] _qbjt|10 years ago|reply
[+] [-] seanalltogether|10 years ago|reply
[+] [-] ospfer|10 years ago|reply
[+] [-] dr_zoidberg|10 years ago|reply
[+] [-] verusfossa|10 years ago|reply
[+] [-] ma2rten|10 years ago|reply
[+] [-] skrjon|10 years ago|reply
https://research.facebook.com/blog/how-blind-people-interact...
Including a link to the publication that was written on the technology here.
https://research.facebook.com/publications/how-blind-people-...
I think it's exciting and an honest attempt to make people's lives better.
[+] [-] sidcool|10 years ago|reply
[+] [-] bla2|10 years ago|reply
[+] [-] shogun21|10 years ago|reply
[+] [-] visarga|10 years ago|reply
Demo: http://googleresearch.blogspot.ro/2014/11/a-picture-is-worth...
[+] [-] chippy|10 years ago|reply
[+] [-] TazeTSchnitzel|10 years ago|reply
[+] [-] whatever_dude|10 years ago|reply
[+] [-] visarga|10 years ago|reply
[+] [-] SimeVidas|10 years ago|reply
[+] [-] buro9|10 years ago|reply
But it's way too expensive.
All I wanted was keywords for alt-text, dimensions for placeholder, and the dominant colour for placeholder background.
https://cloud.google.com/vision/
The price for that would be $7.50 per 1,000 images for the first million images.
I have some 60,000 images on the site I run and don't happen to have $450 in loose change lying around (the whole site costs less than that to run each month).
I guess I don't care about alt tags that much.
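The arithmetic behind that $450 figure, as a quick sketch. It assumes a flat per-1,000 price for the whole batch, which holds here because 60,000 images is well inside the first-million tier quoted above; the function name is made up for illustration.

```python
PRICE_PER_1000_USD = 7.50  # first-million-images tier quoted in the comment


def labelling_cost_usd(image_count, price_per_1000=PRICE_PER_1000_USD):
    """Cost of labelling `image_count` images at a flat per-1,000 price."""
    return image_count / 1000 * price_per_1000


print(labelling_cost_usd(60_000))  # 450.0
```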
[+] [-] fudged71|10 years ago|reply
[+] [-] dflock|10 years ago|reply
[+] [-] cphoover|10 years ago|reply
[+] [-] tlrobinson|10 years ago|reply
Obvious next step: build this into the OS/browser/screen reader.
[+] [-] Spearchucker|10 years ago|reply
[+] [-] Crespyl|10 years ago|reply
[0] http://karpathy.github.io/2015/03/30/breaking-convnets/
[+] [-] nickysielicki|10 years ago|reply
[+] [-] odinduty|10 years ago|reply
[+] [-] d33|10 years ago|reply
It's impressive that people don't really connect the dots and see that as a huge threat to their freedom.