
Images to Text – Toronto Deep Learning Demos

75 points| benanne | 11 years ago |deeplearning.cs.toronto.edu | reply

17 comments

[+] JacobEdelman|11 years ago|reply
Looks amazing. The fact that it just returns the error "Cannot connect to server of image2text models" makes me very sad.
[+] YoukaiCountry|11 years ago|reply
So far I keep getting the error "Cannot connect to server of image2text models"

Anyone having any luck?

[+] bootynuke|11 years ago|reply
I think it must be getting slammed; I was able to get a couple of descriptions out of it, but that was balanced by probably 2 times as many instances of the above error.
[+] finin|11 years ago|reply
http://www.skunkieacres.com/images/rabbit_box.jpg

A picture of a rabbit in a wooden box => "a cat looking into a bin full of apples"

Mistaking a rabbit for a cat is not too bad. A bin is like a box, I suppose. I'm not sure where the apples came from.

[+] thomasahle|11 years ago|reply
Perhaps it's been trained with pictures of apples in boxes...
[+] tly_alex|11 years ago|reply
Rekognition released a similar image-to-text API, and it's much more reliable than this. At least the demo works smoothly and responds fast. https://rekognition.com/demo/concept
[+] teraflop|11 years ago|reply
Even leaving aside the reliability issue (which can be chalked up to the fact that this one is a demo of a non-commercial project that got overloaded), you're comparing two entirely different things.

Check out the "static demo" pages, e.g. http://www.cs.toronto.edu/~nitish/nips2014demo/results/79133...

For this image, the University of Toronto software generates sentences like "a cow is standing in the grass by a car", whereas Rekognition only produces a ranked list of categories. ("sports_car", "car_wheel", etc.)

EDIT: this is an even better example: http://www.cs.toronto.edu/~nitish/nips2014demo/results/89407... I'm cherry-picking the cases where the algorithm does well, of course. But even if it's unreliable, the fact that this works at all is impressive.

[+] CardinalAgnelo|11 years ago|reply
The demo is clearly designed for the small community of machine learning researchers to play around with, to better evaluate the papers behind it. They aren't selling a product and probably have a hard time justifying using a lot of computing resources to host the demo. Furthermore, the models are probably optimized for result quality, not speed.
[+] CardinalAgnelo|11 years ago|reply
Doesn't look to be designed for a lot of traffic; be gentle.
[+] misiti3780|11 years ago|reply
Very cool:

Comment: If you click on source code right now, it gives me two JavaScript alerts that are trying to print out JSON objects.

[+] vonnik|11 years ago|reply
I'm curious to hear how much this is read as a sign of strong AI.
[+] cmyr|11 years ago|reply
My brief survey suggests that their training sample did not include very much hardcore pornography.

"a man and a girl are learning to play with a small pool", while poetic, is a stretch in this case.

[+] JacobEdelman|11 years ago|reply
Already, after 1 hour of this being posted on HN... Reminders abound of how evolution only made us good tool makers to help us reproduce more.