top | item 23143160

Ask HN: Is deep learning obsession in college ill founded?

43 points| muazzam | 5 years ago

Background:

I'm a CS junior (about to become a senior), and in our last year people choose a capstone project that they work on for the entire year. For some years now (say, since 2017), deep learning projects have completely dominated the other projects, both in number and in the awards they go on to win. I understand the principles behind it, even find it cool, but the whole inscrutable nature of it is problematic to me.

36 comments

[+] ageitgey|5 years ago|reply
Deep learning is what is cool in CS right now. It lets you do new things that you couldn't do before. Based on that, it's going to be over-represented in projects by undergrads looking to do "cool" projects and show off their new-found skills.

But that isn't really a problem. In most cases, the projects you do as an undergrad don't affect your professional life in any way after you get your first job. Very, very few undergrad projects turn into real projects that anyone uses after the student graduates.

So don't worry too much about it. Ten years ago, every senior project was an app. Twenty years ago, every senior project was a website. It's just a sign of the times and doesn't matter in the long run.

[+] rvz|5 years ago|reply
> I understand the principles behind it, even find it cool, but the whole inscrutable nature of it is problematic to me.

Spot on.

On top of that, we have 'AI' models being fooled by adversarial attacks that involve changing a single pixel. As long as these issues are not tackled or researched well enough, we'll pretty much be heading into another AI winter, and the hype cycle will go through its trough-of-disillusionment phase. The uninspectable, black-box nature of such deep-learning systems is why highly regulated industries where lives are at stake, such as healthcare and other safety-critical fields, label deep-learning solutions as unsafe.
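To make the fragility concrete, here's a toy sketch (completely contrived numbers, a linear classifier rather than a real deep net): when a model's decision leans heavily on one input direction, a single "pixel" change along that direction flips the prediction.

```python
import numpy as np

# Toy linear classifier: predicts class 1 if w . x > 0.
# One input dimension dominates the weights, so a single-coordinate
# perturbation along it is enough to flip the decision.
w = np.array([0.1, 0.1, 5.0])

x = np.array([1.0, 1.0, 0.0])   # score 0.2 -> class 1
x_adv = x.copy()
x_adv[2] = -0.1                 # change one "pixel"; score -0.3 -> class -1

print(np.sign(w @ x), np.sign(w @ x_adv))  # 1.0 -1.0
```

Real single-pixel attacks on deep nets exploit the same kind of sensitivity, just found by search rather than by inspection.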

Sure, all you see right now are other students and startups 'applying' deep learning everywhere, but they are hardly advancing the field the way DeepMind and OpenAI are. In terms of learning, it's a good thing to study as a college student, but creating an AI startup now requires using Google's, Amazon's, or Microsoft's data centers for training, which is clearly not sustainable anyway.

Security related projects and research are always where it's at.

[+] visarga|5 years ago|reply
> As long as these issues are not tackled or researched well enough, we'll pretty much be heading into another AI winter, and the hype cycle will go through its trough-of-disillusionment phase.

It's not all or nothing, as you present it. ML models can be useful even if they are imperfect - and we should not forget humans aren't perfect either. For example, a model could cut in half the time needed to enter an invoice into a database. It's imperfect, yet useful.

A model need not run alone without any safeguards. It can have plain old programmed rules to validate its outputs, or keep a human in the loop.
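A minimal sketch of that idea (the invoice fields, confidence threshold, and function names are all hypothetical): plain rules gate the model's output, and anything that fails the rules goes to a human reviewer instead of straight into the database.

```python
# Hypothetical guard around an invoice-extraction model's predictions.
# Plain validation rules auto-accept good outputs; failures are queued
# for human review (human in the loop).

def validate_invoice(pred):
    """Return True only if the prediction passes simple sanity rules."""
    if pred.get("total") is None or pred["total"] < 0:
        return False
    if pred.get("confidence", 0.0) < 0.9:  # assumed confidence cutoff
        return False
    return len(pred.get("invoice_id", "")) > 0

def process(pred, human_queue):
    """Auto-accept predictions that pass the rules; queue the rest."""
    if validate_invoice(pred):
        return pred
    human_queue.append(pred)
    return None
```

Even a crude gate like this turns an imperfect model into a net time-saver: the easy 80% flows through automatically, and people only touch the suspicious cases.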

> Sure, all you see right now are other students and startups 'applying' deep learning everywhere, but they are hardly advancing the field the way DeepMind and OpenAI are.

On the contrary, I would say that what DeepMind and OpenAI are doing is largely irrelevant for industry. There is a huge number of domains where no ML model has been created, and that is because there are so few people who can make them. The low hanging fruit hasn't been picked yet. It's like electricity at the beginning of the 20th century. The work these students and startups are doing is the good, useful work. You don't need DeepMind grade models to solve most real problems.

> creating an AI startup now requires using Google's, Amazon's, or Microsoft's data centers for training

You can train most useful models on a single machine today. Some, like Logistic Regression, train in seconds or minutes. Others take an hour, or a day. Some heavy ones take a week to train. If you don't do hyper-parameter search or cutting edge research you only need a few runs to get a working model. It's data tagging that usually takes months or years.
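For example (using scikit-learn; the synthetic dataset and sizes are arbitrary), a logistic regression on tens of thousands of rows trains in well under a second on an ordinary laptop - no data center required:

```python
# Logistic regression on a synthetic 50k-row dataset: trains in well
# under a second on a single machine.
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=50_000, n_features=20, random_state=0)

start = time.perf_counter()
clf = LogisticRegression(max_iter=1000).fit(X, y)
elapsed = time.perf_counter() - start

print(f"trained in {elapsed:.2f}s, train accuracy {clf.score(X, y):.3f}")
```

The expensive part of such a project is almost always collecting and labeling the data, not the fit itself.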

[+] _bxg1|5 years ago|reply
Personally I think deep-learning is a bubble, and it will soon collapse to its natural place in computer science. Which is not to say that it's a fad that will disappear, only that it will retreat to being just a regular tool among the many tools we have for solving different kinds of problems. Its inscrutable nature is definitely problematic for some use-cases, and not so problematic for others.
[+] sin7|5 years ago|reply
I've been doing the data thing for a while. During one of my defenses of R, someone brought up that R was a black box: if you programmed in R, you were a user who just filled in the correct function arguments and it spat out the answer. And that was when my thoughts on machine learning changed.

The vast majority of us are users. We massage the data into a certain shape, then feed it through a machine that someone else created. We can change the parameters. We can change the data. But few of us are ever going to look into the code of a random forest function.

I've switched tracks and started doing web development. Playing with the hyperparameters in machine learning is no different from changing the feel of a drop-down by tweaking the colors, fonts, and other things to fit a certain aesthetic.

I could be wrong, but I have yet to meet anyone calling themselves a data scientist who has done anything beyond using packages created by others. I think that opens the field up to becoming just another tool, no different than Excel.

[+] visarga|5 years ago|reply
While I agree with your sentiment regarding ML engineers - they are just another kind of dev, and that's where it will go - I think DL is not just another tool from the software toolbox. It's more of a paradigm changer, like the printing press, the engine, electricity, communication, and computing. It tends to eat the world.
[+] tuatoru|5 years ago|reply
> Its inscrutable nature is definitely problematic for some use-cases, and not so problematic for others.

It's a problem wherever reliable operation is required, where analytic tractability (explanation) is required, or where resources available for data labeling are limited.

Its niche appears quite small, unless and until solid mathematical foundations are developed for it.

[+] uoaei|5 years ago|reply
NNs are "computer science" only insofar as numerical algorithms are. Which is to say, beyond the question of big-O, it's all math.
[+] uoaei|5 years ago|reply
Not ill-founded so much as jumping the gun.

To understand why neural networks work, you will have to understand how a whole host of smaller, simpler ML models work in excruciating detail. Multiple linear regression, logistic regression, etc. What they mean, how they work, what's really going on "inside", what the underlying probabilistic model represents, etc.

Neural networks are great because it takes basically all of those smaller ideas and concatenates them into a super flexible statistical machine. It's really cool to see the "in->out" but it's even cooler once you have a good grasp on what's going on in the intermediate steps.
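One way to see the connection (a toy numpy sketch, not production code): a single sigmoid unit is exactly logistic regression, and a network just stacks many of them and feeds units into units.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """A single 'neuron': weighted sum + sigmoid. This is exactly the
    logistic regression model p(y=1|x) = sigmoid(w . x + b)."""
    return sigmoid(x @ w + b)

def two_layer_net(x, W1, b1, w2, b2):
    """Each hidden unit is a small logistic regression over the inputs;
    the output unit is a logistic regression over the hidden values."""
    hidden = sigmoid(x @ W1 + b1)
    return neuron(hidden, w2, b2)
```

Seen this way, understanding what a logistic regression's weights and probabilistic output mean carries directly over to understanding each unit of the bigger machine.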

In my experience, almost no one working with neural networks has those details down. This goes 100-fold for non-research roles. They learned the Keras API and are happy stacking layers, and as long as the output looks nice they push to production. For most cases empirical validation is probably enough, because NNs can usually achieve some incremental improvement just by virtue of having so many damn degrees of freedom. But to get a well-performing, well-founded model, you need to know the ins and outs.

[+] deuslovult|5 years ago|reply
I'm an ML engineer, and I agree with you: deep learning is by far the most common approach for new problems in informatics.

Imo deep learning is so popular because it "works". For a classification problem, if you try both a linear baseline and a deep learning model, and you do a reasonable job of hyperparameter tuning and experimental design, the deep learning model will likely outperform the simpler one. This holds true across many problem spaces.

I think the issue is that modern DL frameworks make it a little too easy to get pretty good performance on new problems. Other techniques generally require more background knowledge to make reasonable modeling assumptions, and still frequently perform worse than a naively applied DL approach.

I think DL will remain, in practice and education, a very popular tool. But it is essential to learn traditional statistical inference and other background to appropriately contextualize DL models so it isn't just some form of black magic.

[+] mattkrause|5 years ago|reply
A lot of those comparisons strike me as shaky.

It's easy to beat a naive logistic regression model with a good neural network, but the gap often closes once you start trying to tune the logistic model too. (And it's not like the neural networks aren't tuned either--architecture search, data augmentation, etc).
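Tuning the simple model is cheap to do, too. A sketch with scikit-learn (the synthetic dataset and parameter grid are arbitrary): a cross-validated grid search over the regularization strength is the logistic-regression analogue of the architecture search done for neural nets.

```python
# A "tuned" simple baseline: grid search over the regularization
# strength C of a logistic regression, with 5-fold cross-validation.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

A fair comparison gives both sides this kind of tuning budget; many headline gaps shrink once it does.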

Recent review on medical data: https://www.sciencedirect.com/science/article/abs/pii/S08954...

[+] poulsbohemian|5 years ago|reply
22 years ago when I was in your shoes, distributed systems were the topic of the day, and all of us were going to be building systems with CORBA and DCOM... so guess what my project and paper were about? That's right, things I never touched in my career, but darn it if they didn't help me get my first job because they were hot topics of the moment.

So, pick something in "AI" that is the hotness of the moment, learn what you can, do your best, and then get on with life and career.

[+] md2020|5 years ago|reply
My situation is the same as yours: CS junior heading into my capstone project next semester, and my opinion is a resounding yes. The deep learning obsession is almost certainly a hype bubble. I have observed the same here at my university; the "But what if we did it with deep learning?" projects are almost reaching meme status.

It's rather disheartening as someone who actually is interested in AGI, but I've been driven away from wanting to pursue the field, since the current research seems lacking in ambition and substance. My previous summer internship had me reading a lot of deep learning papers on arXiv, and the vast majority of them seem to tweak a single parameter in a DNN, achieve a 0.3% increase in score on an arbitrary benchmark, and call it a meaningful result. I'd personally like to see more people doing the kind of work DeepMind does, which seems to actually achieve breakthroughs informed by knowledge from neuroscience, but I have a feeling we won't see that anytime soon, since DeepMind gets its pick of the best researchers in the world.

I'm just an undergrad though, and would love to hear the opinions of more knowledgeable people! Specifically, I'd like to hear arguments against the sentiment that "deep learning right now is pretty much alchemy". How is the work in deep learning helping us understand the nature of intelligence, rather than just helping Facebook and Amazon better target advertisements and product recommendations?
[+] visarga|5 years ago|reply
> How is the work in deep learning helping us understand the nature of intelligence

A neural network's performance on a problem is a benchmark of the problem's real difficulty. It gives us insight, a new perspective.

In millennia of deliberation, what have philosophers discovered about the nature of intelligence? And then... a neural net beats us at all board games, another can solve differential equations, another can translate, another can see, and so on. Have we really learned nothing from these inventions?

Another advantage of DL is that it frames the problem of intelligence in mathematical concepts and rigorous evaluation.

I, for one, have reconsidered all my spiritual beliefs after learning about the agent-environment-reward model of reinforcement learning. It's a new way of framing the agent and life, so parsimonious and powerful. And it does not require a soul, or a god, or anything outside our real environment, and yet it can explain so much.

The whole machine learning paradigm is another powerful concept through which we can understand how we might function. Previously you might wonder how emotion, thought, sensation, imagination and will relate to each other. Now we can understand how they might be implemented and wired together, and what principles support their function.

[+] Irishsteve|5 years ago|reply
It's the current trendy technology (for a lot of good reasons), so it's only normal that students gravitate towards it.

It's not ideal - but if it weren't DL, it would be another topic / application.

[+] en4bz|5 years ago|reply
> During the gold rush it's a good time to be in the pick and shovel business
[+] makapuf|5 years ago|reply
Sure, nvidia is doing pretty well indeed.
[+] AnimalMuppet|5 years ago|reply
For your purposes, it doesn't matter. You can do your capstone in the current hotness, or not, as you please. If you do it in deep learning and in five years deep learning is passe, it won't matter. You'll have your degree and be four years into your career. Or you'll be four years into grad school. You'll be fine. (In this scenario, if your grad school is in deep learning, you won't be fine. Think harder about that choice than about your capstone.)

Given all that you've said, a capstone that tries to dent the "inscrutable nature" of deep learning might be an interesting choice.

[+] omarhaneef|5 years ago|reply
I think deep learning works better than simple linear regression because we have already succeeded with simple linear regression wherever we can, while we have only just started to get going with "deep" learning. And the best part is, as new computers come out, the deeper we learn.

I will point out that the real win is with new data sources, and simple linear regressions may still work there.

[+] burfog|5 years ago|reply
Possible projects that might keep you occupied for the entire year:

Make a Rust front-end for the GNU Compiler Collection.

Emulate something.

Write a hypervisor.

[+] muazzam|5 years ago|reply
Thanks, it helps. I'm actively looking for project ideas.
[+] sys_64738|5 years ago|reply
"Deep learning" is the latest buzzword to get all the dollars nowadays. In a past decade corporations used to even fund Second Life for use at work.
[+] Kenji|5 years ago|reply
The most important thing is this: You pick a problem you want to solve, you pick the tools you want to solve that problem with, and one of the tools in your bag of tools is deep learning which you may end up using. Do it in that order. Do not pick deep learning and then try to solve everything with deep learning, that's putting the cart before the horse. That's all I can say about things like deep learning, blockchain, etc. Let the problems lead you to the solutions, not the other way around.