
Path-breaking Papers About Image Classification

69 points | parths | 8 years ago | blog.paralleldots.com

29 comments

[+] falcolas|8 years ago|reply
Some really cool information, but this concluding bit annoyed me:

> By Moore’s law, we will reach computing power of human brain by 2025 and all of the humanity by 2050.

Their graph does show exponential growth, but the data points cut off at the year 2000. Not surprising, given that Moore's law has reached its end in the last decade. ML improvements now depend upon better algorithms to make them more parallel, and the economies of scale which make more parallel computation units available. I don't think we're anywhere near that exponential graph, however, and we'll keep getting further from it.

Perhaps quantum computing will become a widespread reality and blow the field open, but I'm not holding my breath that it will happen in the next few decades.

[+] scj|8 years ago|reply
> Their graph does show exponential growth, but the data points cut off at the year 2000. Not surprising, given that Moore's law has reached its end in the last decade.

I think the graph was originally produced for Ray Kurzweil's 1999 book "The Age of Spiritual Machines".

[+] londons_explore|8 years ago|reply
Moore's law is still alive and well; you just have to move over to parallel architectures like GPUs.

Considering machine learning is all on GPUs and TPUs now, I think this is still a fair assessment.

[+] donovanr|8 years ago|reply
specious extrapolations aside, the plot itself is deceptive -- exponential growth in a log-linear plot should be a straight line, not one accelerating upward super-exponentially.
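The point about log-linear plots can be checked numerically: a minimal sketch (with a made-up doubling series, not the article's data) showing that a pure exponential has a constant slope on semilog axes, while a super-exponential curve's slope keeps increasing.

```python
import numpy as np

# Hypothetical exponential series: y = 2**t (doubling each step).
t = np.arange(10)
y = 2.0 ** t

# On a log-linear (semilog-y) plot, log(y) against t is a straight line:
# the slope is log(2) everywhere.
slopes = np.diff(np.log(y))
print(np.allclose(slopes, np.log(2)))  # True: pure exponential -> straight line

# A curve that bends upward on the same axes is super-exponential,
# e.g. y = 2**(t**2): its log-slope keeps growing.
y_super = 2.0 ** (t ** 2)
super_slopes = np.diff(np.log(y_super))
print(np.all(np.diff(super_slopes) > 0))  # True: slope itself increases
```

So a plot that curves upward on semilog axes is claiming more than exponential growth.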
[+] kayoone|8 years ago|reply
GPUs still improve a lot year over year, so I think Moore's law will hold true for at least a few more years.
[+] AndrewOMartin|8 years ago|reply
The caption for the top graph appears a bit out of whack.

It states "exponential decline in top 5 error rate", the decline looks more like diminishing returns to me, especially if you push the 2017 data point out to where it should be (they've omitted 2016).

It's nice that the error rate is low, but the caption appears to oversell it.

This graph reminds me of a very closely related one I saw in a talk a few years ago [1]. It was showing decline in voice recognition error rates over time, with a highlighted band for "human performance".

The speaker, Roger Moore (the academic, not the actor, and not the Moore with the law), pointed out that this line, while encouraging, hid two important points.

1) For linear improvement, exponentially more training data was needed. 2) The results gave no insight into how living beings solve the same task.

These aren't necessarily fatal flaws, but they're worth remembering.

[1] https://www.youtube.com/watch?v=iYbVsvxd3bE
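Point 1 can be illustrated with a toy model: if error follows a power law in training-set size, err(N) = a * N**(-b) (a common empirical finding; the constants here are entirely made up), then walking the error down in equal linear steps demands an ever-larger multiplicative increase in data.

```python
import numpy as np

# Hypothetical power-law scaling of error with dataset size N.
a, b = 1.0, 0.5

def err(n):
    return a * n ** (-b)

def data_needed(target_err):
    # Invert err(N) = target: N = (a / target) ** (1 / b)
    return (a / target_err) ** (1.0 / b)

# Reduce the error linearly: 0.10 -> 0.08 -> 0.06 -> 0.04 ...
targets = [0.10, 0.08, 0.06, 0.04]
sizes = [data_needed(e) for e in targets]

# ... and the required data grows by a larger factor at every step.
ratios = [sizes[i + 1] / sizes[i] for i in range(len(sizes) - 1)]
print(sizes)   # [100.0, 156.25, 277.8, 625.0]
print(ratios)  # each ratio larger than the last
```

Under these assumed constants, going from 10% to 4% error multiplies the data requirement by more than six.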

[+] zitterbewegung|8 years ago|reply
A more accurate idea of what a computer sees is that ML models figure out which parts of the signal to throw away and which to pay attention to. This is why you can slightly perturb an image so that humans still see a picture of two hot dogs, while an ML model is confused into seeing two different things (a hot dog and an eggplant).
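The perturbation trick can be sketched in a few lines. This is a minimal FGSM-style example with a toy logistic "model" (the weights and input are made up, not from any real classifier): nudging the input along the sign of the loss gradient flips the prediction even though each component moves only slightly.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=100)           # toy model weights
x = w / np.linalg.norm(w) * 0.05   # input the model classifies as positive
y = 1.0                            # true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(w @ x)                 # original prediction (> 0.5)

# For logistic loss, the gradient w.r.t. the input is (p - y) * w.
grad_x = (p - y) * w

eps = 0.02                         # small per-pixel budget
x_adv = x + eps * np.sign(grad_x)  # FGSM step
p_adv = sigmoid(w @ x_adv)

print(p > 0.5, p_adv > 0.5)        # prediction flips: True False
```

The adversarial input differs from the original by at most 0.02 per component, yet the model's decision reverses; real attacks on image classifiers work the same way, just with gradients computed through a deep network.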
[+] kushankpoddar|8 years ago|reply
Sometimes I wonder why the top-5 image classification task is so difficult. If you give me 5 chances to look at an image and correctly classify it from ~1000 ImageNet classes, I can surely do better than a 5-10% error rate.

Also, now that the top-5 error rate has been brought down considerably, what is the next benchmark for the research community to beat? A new dataset, or top-1 error rate on ImageNet?

[+] parths|8 years ago|reply
A large majority of human errors come from fine-grained categories (such as correctly distinguishing two similar cat species) and class unawareness. I would recommend this article by Andrej Karpathy, where he talks about what he learned from competing against GoogLeNet: http://karpathy.github.io/2014/09/02/what-i-learned-from-com...
[+] T_D_K|8 years ago|reply
Does anyone have insight into why they're still doing top-5? It seems to me that the error rates have dropped low enough that they could move on to top-3 or even single-guess challenges. Is there data showing how these same models perform on such tasks? Though I suppose, if I were motivated, all the tools needed to find out for myself are available.
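Computing top-k error from saved model outputs is indeed a short exercise. A minimal sketch (toy logits and labels, not real benchmark data):

```python
import numpy as np

def top_k_error(logits, labels, k):
    # Indices of the k highest-scoring classes per example.
    top_k = np.argsort(logits, axis=1)[:, -k:]
    correct = np.any(top_k == labels[:, None], axis=1)
    return 1.0 - correct.mean()

# Toy data: 4 examples, 10 classes.
logits = np.array([
    [0.1, 0.9, 0.0, 0.2, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0],   # label 1 is the top guess
    [0.5, 0.1, 0.4, 0.45, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # label 2 is ranked 3rd
    [0.9, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],   # label 1 is ranked last
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],   # label 9 is the top guess
])
labels = np.array([1, 2, 1, 9])

print(top_k_error(logits, labels, 1))  # 0.5: two of four right on the first guess
print(top_k_error(logits, labels, 5))  # 0.25: only the third example still missed
```

Swap in a model's real logits and the same function reports top-1, top-3, or top-5 error for any k.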
[+] pulkitkumar1995|8 years ago|reply
Is DenseNet the one that won the best paper award at CVPR this year?

And which framework would you recommend to code these in?

[+] parths|8 years ago|reply
Yes! Facebook's DenseNet won the best paper award at CVPR this year. I would recommend the PyTorch framework for coding these, as it extends the numpy/scipy ecosystem and is simpler to use.
[+] mongodude|8 years ago|reply
The squeeze-and-excitation network by momenta.ai has been a watershed moment for Chinese AI prowess, and I'll watch for such Chinese startups to dominate the AI landscape for a while. What amuses me is why Google hasn't participated in the last couple of ImageNet competitions.
[+] tanilama|8 years ago|reply
ImageNet as a competition has been losing its importance since 2016. No idea as widely effective and inspiring as ResNet has come out of it since that year. I feel people just over-engineer their network structures to claim state of the art with marginal gains.

Google has since brought out Neural Architecture Search, which can design networks automatically and which I think is way ahead of the rest of the competitors here.

[+] muktabh|8 years ago|reply
Google has its own huge internal datasets for image classification. You can find a mention of them in Chollet's Xception paper. That may be why they are not really interested in working on ImageNet.