So I am a machine learning researcher who moved to a FAANG company as a research scientist after graduation. My salary is 10x my grad student stipend, and that does not even account for the free food, the healthcare, and other perks. However, I have not adjusted my lifestyle, so it does not feel real.
The thing is, even though I have 1000x the resources compared to university, that does not really make me happier about the work itself. It makes some things easier and other things harder.
No, what I really feel is that at work I am not actually treated like a servant any more but like a person. I don't have to work weekends and nights any more. I can take vacations and won't be flooded with emails every.single.day during holidays. I don't have extra unpaid responsibilities that I have zero recourse against.
The thing that I just cannot stop wondering about is why, knowing the perks of industry, advisers still treated us like this. Even though they were ostensibly trying to keep us, it felt more like they were squeezing out as much as they could.
For anyone considering where to go to graduate school...
One's PhD experience is highly dependent on the university one attends and especially the professor one works under. Each lab has a different culture and each professor treats students differently. It also matters a lot whether you are on a fellowship or not (someone funded directly from a professor's grant is easier to bully/exploit).
Just like you should speak to employees at a company before joining it to get an idea of culture and work/life balance, prospective graduate students should speak to current graduate students to get an idea of life at a given lab. Most graduate students are openly aware of which professors are known for treating students like indentured servants and which are known for being hands off and generous with research funds. If you are trying to pick a university/lab, definitely go to visits and speak to 4th/5th year grad students (preferably over a beer or two). Typically by the end of their PhD, graduate students are willing to tell you the truth about the different labs/professors.
Also, remember that it is often totally OK to switch labs within 1-2 years. Yes, it may set you back somewhat on your progress, but it can be much better than being miserable for 5 years.
> No, what I really feel is that at work I am not actually treated like a servant any more but like a person.
My experience in FAANG research, outside of the >100x pay multiplier, has been very different! In grad school we could work from anywhere at any time; there was no requirement to be glued to a desk 14 hours a day, no having to respond to emails on Friday nights or weekends or risk a bad performance review from your manager, no dystopian open office space, unlimited conference travel flexibility, etc. I actually kind of miss grad school, despite having made 30K a year as a PhD student.
Hello. I'd like to share one slightly tangential anecdote and observation regarding this.
I have a friend who was in the Physics PhD program at Caltech. An absolute genius of a kid, surrounded by other people who are incredibly smart. My friend was always a very ambitious person and wanted to join Wall Street as a quant after completing his PhD, because he was interested in maximizing his income and found the problems in finance/markets more compelling than those in academia.
When he intimated this to people in the department, they looked at him as if he had suddenly grown tentacles, because it was unbelievable to them that anyone would want to do something other than academia. This is a stark contrast to friends I have at places like Stanford, where, quite frankly, no one cares.
This doesn't touch on any alleged bad behavior or stress or pressure that people experience while in grad school, but I believe that the cultural forces at institutions strongly shape how people feel. That isn't a profound or even revealing observation, but I just thought people might like a quick human anecdote to relate to while going through this.
"The thing that I just cannot stop wondering is, why, knowing the perks of industry, advisers still treated us like this."
You can drop "knowing the perks of industry". There's no excuse for doing that under any circumstances, even if there is no industry alternative. I see this in many fields where grad students work in labs and are funded by grants. I wish universities would clean it up. There's simply no excuse for it. There's nothing about being in grad school that makes this behavior okay.
> The thing that I just cannot stop wondering about is why, knowing the perks of industry, advisers still treated us like this.
I think you just had a bad advisor (or advisors)! They're not all like that. Maybe the group you were in was under a lot of funding pressure or something; that can cause bad behavior.
> The thing that I just cannot stop wondering about is why, knowing the perks of industry, advisers still treated us like this. Even though they were ostensibly trying to keep us, it felt more like they were squeezing out as much as they could.
But people in FAANG do work nights and weekends and it's totally not unusual to be assigned sudden bitchwork by your manager and have no recourse against it.
As a note, it is possible to be in a Ph.D. program and not be an employee of the university, at least in the US. Being on a grad student stipend is usually a choice. Advisors treat you like that because you allow yourself to be treated like that.
Yes, I have a PhD. Yes, I paid my own graduate school tuition. Yes, I had a successful career in ML.
I'm not an expert, but I've read highly-cited ML papers where the researchers barely bothered with hyperparameter search, much less throwing a few million dollars at the problem. You can still get an interesting proof of concept without big money.
And low-resource computing is more theoretically and practically interesting. I've heard experts complain about some experiments: "they didn't really discover anything, they threw compute at the problem until they got some nice PR." This was coming from M'FAANG people too, so it's not just resentment.
Are you sure they barely bothered, or did they just not mention it? I have heard stories of lots of "cutting edge" ML research actually just being the result of extremely fine hyperparameter tuning.
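For reference, the kind of basic search being contrasted with "extremely fine" tuning is cheap to set up. Below is a minimal random-search sketch using scikit-learn; the model, search space, and dataset are illustrative placeholders, not anything from a specific paper.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

# Randomly sample 10 configurations and cross-validate each one.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200, 400],
        "max_depth": [4, 8, 16, None],
        "max_features": ["sqrt", "log2", None],
    },
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```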
But I do AI and I work at a university. Developing AI might be too expensive if you're going for larger architectures and eking out an extra percent or two on a Kaggle-like problem. Most advances in machine learning to come, though, are in fields where there has been little activity. I'm currently in the process of making a career out of using very basic machine learning methods and applying them to physical science problems, because 95% of the people in the field don't know how (a problem of tenure). This opens up lots of opportunities for funding, though: the NSF etc. will literally just throw money at you if you say AI and that you'll apply it to any problem.
May I ask at what stage of your career you are? I am a (confused) grad student working on “ML for physical sciences” and would really appreciate some advice on career directions.
Companies spend a lot of money on AI because they have a lot of money and don't know what to do with it. Companies lack creativity and an appetite for riskier and more creative ideas. That appetite is what universities must supply instead of trying to ape companies. The human brain doesn't use a billion dollars in compute power; figure out what it is doing.
Sort of by definition, it can never be too costly to be creative. Only too timid. And too unimaginative.
+1 on calling this BS, even though I think it is only partly BS:
While it is true that training very large language models is very expensive, pre-trained models + transfer learning allow interesting NLP work on a budget. For many types of deep learning, a single computer with a fast, large-memory GPU is enough.
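As one concrete illustration of that kind of budget setup, here is a minimal sketch: frozen pre-trained embeddings feeding a simple classifier, which runs comfortably on a single GPU or even a CPU. It assumes the Hugging Face transformers and scikit-learn libraries; the model name and the toy data are placeholders.

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

texts = ["great movie", "terrible plot", "loved it", "boring and slow"]
labels = [1, 0, 1, 0]  # toy sentiment labels

# Frozen pre-trained encoder: no fine-tuning, so no large-scale training cost.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
encoder.eval()

with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state   # [batch, seq_len, hidden_dim]
    features = hidden.mean(dim=1).numpy()         # mean-pool tokens into one vector per text

# Cheap downstream model trained on the frozen features.
clf = LogisticRegression(max_iter=1000).fit(features, labels)
print(clf.predict(features))
```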
It is easy to underappreciate the importance of having a lot of human time to think, be creative, and try things out. I admit that new model architecture research is helped by AutoML (AdaNet, etc.), and being able to run many experiments in parallel becomes important.
Teams that make breakthroughs can provide lots of human time, in addition to compute resources.
There is another cost besides compute that favors companies: being able to pay very large salaries for top tier researchers, much more than what universities can pay.
To me the end goal of what I have been working on since the 1980s is flexible general AI, and I don’t think we will get there with deep learning as it is now. I am in my 60s and I hope to see much more progress in my lifetime, but I expect we will need to catch several more “waves” of new technology like DL before we get there.
> The human brain doesn't use a billion dollars in compute power, figure out what it is doing.
This may not be true, if we’re talking about computers reaching general intelligence parity with the human brain.
Latest estimates place the computational capacity of the human brain at somewhere between 10^15 and 10^28 FLOPS [1]. The world's fastest supercomputer [2] reaches a peak of 2 * 10^17 FLOPS, and it cost $325 million [3].
Realistically reaching 10^28 FLOPS today is simply not possible: if we project linearly from the above, the dollar cost would be $16 quintillion (1.625 * 10^19 dollars).
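That figure is just a linear extrapolation of the supercomputer's dollars-per-FLOPS; a quick sanity check of the arithmetic in Python, using only the numbers quoted above:

```python
# Linear extrapolation from the cited figures (illustrative only, not a real cost model).
supercomputer_flops = 2e17           # peak FLOPS of the fastest supercomputer cited
supercomputer_cost = 325e6           # its reported cost in dollars
brain_low, brain_high = 1e15, 1e28   # estimated range for the human brain, per [1]

dollars_per_flops = supercomputer_cost / supercomputer_flops
print(f"low estimate:  ${dollars_per_flops * brain_low:,.0f}")   # ~$1.6 million
print(f"high estimate: {dollars_per_flops * brain_high:.3e}")    # ~1.6e19 dollars, i.e. ~$16 quintillion
```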
So, when it comes to trying to replicate human intelligence in today’s machines, we can only hope the 10^15 FLOPS estimates are more accurate than the 10^28 FLOPS ones — but until we do replicate human level general intelligence, it’s very difficult to prove which projection will be correct (an error bar spanning 13 orders of magnitude is not a very precise estimate).
P.S. Of course, if Moore's law continues for a few more decades, even 10^28 FLOPS will be commonplace and cheap. Personally, I am very excited for such a future, because then achieving AGI will not be contingent on having millions or billions of dollars. Rather, it will depend on a few creative/innovative leaps in algorithm design — which could come from anyone, anywhere.

[1] https://aiimpacts.org/brain-performance-in-flops/
[2] https://en.m.wikipedia.org/wiki/TOP500#TOP_500
[3] https://en.m.wikipedia.org/wiki/Summit_(supercomputer)
I also think this is kinda ridiculous. If anything I feel like the big consumer tech companies are disadvantaged because they need to deploy something that can scale to a billion people. They can only spend pennies per customer because the margins are so low. Sure they will have a research team for marketing purposes but when it comes to deployments they aren't doing anything too fancy.
Companies with higher profit margins per customer are doing much more novel work, from what I have seen.
All this is to say, I don't see universities getting shut out anytime soon. The necessary compute to contribute is pretty cheap and most universities either have a free cluster for students or are operating with large grants to pay for compute (or both).
> The human brain doesn't use a billion dollars in compute power, figure out what it is doing
Phahahaha, this gave me a good laugh :) To find out how your brain works, guess what you are going to use: the brain itself. It's like trying to cut a knife with itself, or trying to use a scale to weigh itself.
My conclusion after reading many papers in the Natural Language Processing field (which is now all about machine learning) is that company papers generally focus on tweaking pipelines until they have increased the accuracy score. Once they have, they quickly publish the result and leave the analysis of their results to others. (BERT [1] is a prime example of this.) However, I do not agree that companies lack creativity. If you look at all the research currently being undertaken at the big tech companies, you will be amazed by its scope. (I found this out by coming up with 'new' ideas during my thesis, only to find out that some researcher at a big tech company was already working on them.)
[1] Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
There is a far more obvious problem than the cost of computing - the cost of labor. If you are a deep learning researcher, you can join any of these companies and multiply your salary 5x.
Compute constraints are relatively straightforward engineering and science problems to solve. The lack of talent, though, seems like the bigger story.
As an ML researcher without a master's or PhD, that's not necessarily true. Companies reach out to me and say things like, "we are really excited to talk to you about opportunities, especially given your 4 years of production ML model work and research."
Then right before interviews I hear, "well, we like you, but we didn't realize you didn't have a PhD. We have a really awesome software engineering / machine learning engineer opening that'd be great for you. That's what we'll do the interviews for."
Cool. Probably happens 2/3 of the time, honestly.
Just ask me about my patents, research, and projects...
Anyway, the point being: you can't 5x with a B.S. in C.S. That's why there's a "lack" of talent.
Hasn't that been the case for decades though? Maybe not those exact numbers, but from what I understand, STEM profs could always go into the private sector/government and make more money. AI profs were probably the exception a few decades ago; they're just catching up now that it has gotten good enough. EDIT: By 'good enough' I mean that it's progressed to the point where it can actually be used for economic gain in a company/product, where before it was just a money pit.
Long gone are the days when your patron king or queen would fund you handsomely just for doing math.
Sadly, we as a country don't invest in CS research like we used to. Companies used to pay to have teams of developers go through training. Now, in some places, you are lucky if they even pay for training you do at home.
It may be a sign that CS is becoming a mature field. Aeronautical engineering departments don't build experimental aircraft. Chemical engineering departments don't build experimental refineries. If you want to do those things you work for Lockheed, Boeing, Exxon, etc.
This is demonstrably false; high-end research universities do exactly these things. Stanford has a high-end fab. Caltech students build experimental aircraft. Universities build nuclear reactors for research. I didn't find any examples, but I'm certain that universities in Texas have small refineries.
I hit this wall in my own deep-learning-based automation business; I am now forced to rely on transfer learning most of the time. My Tesla/Titan RTX-based in-house "server" is no longer capable of training the latest and greatest models in reasonable time, the cloud is out of the question due to costs, and (automated) parameter tuning with distributed training takes ages. I can still charge a lot for customized solutions, though I see the writing on the wall that it might not last long (2-4 years) and I'd have to switch businesses, as there will be only a handful of companies able to train anything better than the current SOTA.
I asked the same question after several AI talks given by large companies (Google, Nvidia, etc.).
The general answer is that it's still possible to try out things on a single GPU or several servers, and many gains come from good features and smart network designs. On the other hand, squeezing out the last 5% does require more data and budget.
Personally, I think you can still do a lot with a moderate budget and smart people. But I would love to hear other opinions.
Look into the modern NLP models: BERT and its many derivatives, RoBERTa, XLNet. Training any of these requires roughly a terabyte of data and generally takes days on multiple TPUs. You often can't even fine-tune on a single GPU without some clever tricks.
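One of the standard "clever tricks" alluded to here is gradient accumulation: run small micro-batches and only step the optimizer every N of them, which emulates a larger batch within a single GPU's memory. A minimal self-contained sketch (the tiny model and random data are stand-ins for a large transformer and a real corpus):

```python
import torch
from torch import nn

# Stand-ins for a large pre-trained model and a real dataset.
model = nn.Linear(128, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
micro_batches = [(torch.randn(4, 128), torch.randint(0, 2, (4,))) for _ in range(32)]

accum_steps = 8  # effective batch = 8 micro-batches, without holding them all in memory
optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches):
    loss = criterion(model(x), y) / accum_steps  # scale so accumulated gradients average correctly
    loss.backward()                              # gradients add up across micro-batches
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```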
Have AI techniques actually changed in the last 20 years, or is there just more data, better networking, better sensors, and faster compute resources now?
By my survey of the land, there haven't been any leaps in the AI approach. It's just that it's easier to tie real-world data together and operate on it.
For a university, what changes when you teach? This sounds like researchers feeling they can't churn out papers that are more like industry reports than advances in ideas.
There hasn't been any kind of paradigm shift, but there have been a bunch of real, albeit incremental, improvements: better optimization algorithms, interesting and novel neural network layers.
Some of this is even motivated by mathematical theory, even if you can't prove anything in the setting of large, complex models on real-world data.
The quote from Hinton is something like, neural networks needed a 1000x improvement from the 90s, and the hardware got 100x better while the algorithms and models got 10x better.
Indeed. The academic theory for NNs has been there since before the 90s, and is solidly grounded in a mathematical framework. Whatever new techniques arose after 2010 are a) empirical results obtained by semi-trial-and-error and b) unexplainable mathematically.
While it's true that training a new DL model requires lots of computational power, I personally feel that the kind of activity mentioned in the article is more an "application" of ML than "research". I personally think universities should move in the direction of "pure" research instead.
For example, coming up with a new DL model with improved image recognition accuracy means it has to be trained on millions of samples from scratch, which requires a lot of time and money. But I'd argue that such a thing is more an "application" of DL than "research". Let me explain why. Companies like FAANG have the incentive to do this because they have tens or hundreds of immediate practical use cases once the model is complete; hence I call such activity "application" of ML rather than "research", because there is a clear monetary incentive for completing it. What incentive does a university have to create state-of-the-art image recognition other than publication? The problem is that publication can't directly produce the resources needed to sustain the research (i.e., money).
I think ML research in universities should move in the direction of "pure" research. For example, instead of DL, are there other fundamentally different ways of leveraging current state-of-the-art hardware to do machine learning? Think of how people moved from approaches such as SVMs to neural networks. The neural network was originally a "pure" research project. At the moment of its creation, it wasn't taking off because hardware couldn't keep up with its computational demands, but fast forward 10-15 years and it became the state of the art. University ML research should "live in the future" instead of focusing on what's being hyped at the moment.
The article presents the rising costs couched within the theme of all-too-powerful tech companies like Google and Facebook, which is really irrelevant: these costs are not high because of those companies, they are high because the research itself is incredibly resource intensive, and it would be so whether or not large tech companies were also engaged in it. In fact, with Google and its development of specialized chips for this purpose, AI research is probably getting cheaper due to their involvement.
Next, this research will probably continue to get cheaper. The cost to do the Dota 2 research 5 years ago would have been much higher, and will probably be even less expensive 5 years from now.
Also, I think there's plenty of room for novel and useful work at the bottom end, where millions of dollars in compute resources are not essential. Cracking AI Dota is certainly interesting, but it's hardly the only game in town, and developing optimized AI techniques specifically for resource-sparse environments would be a worthy project.
Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be within the reach of any University department, and at least Nvidia also seems happy to sponsor University setups (after all new research creates demand from the industry).
Sure, this doesn't compete with Google's data-centers. But that's assuming Universities are for some reason competing against private industry. That's not how any other engineering discipline works, so it's a bit odd to just assume without discussion.
AI is now a data problem, and a bit of an optimisation problem; both should or could be solved at the commercial end of research. Universities should be more focused on what's next. Not saying this is not "the next", but since the ground-level ideas of AI are 50-60 years old if not more, current ground-level research should build the theoretical platform for the technology that is going to come in 20-50 years.
You need all four quadrants: risk capital, industry, academics, and that elusive X-factor. The hard work of AI/ML theory, such as issues around generalizability and ethics, is still done around whiteboards and academic conferences.
A more useful metric may be the proportion of proprietary versus open discovery. I don't know if I can point to a single example where researchers have not rushed to put their latest breakthroughs on OpenReview or Arxiv. Even knowledge of a technique, without the underlying models or data, is enough to influence the field.
Academic free inquiry and intellectual curiosity look very different from product-focused, solutions-oriented corporate R&D. A good working example is something like Google AI's lab in Palmer Square, right on the Princeton campus. Researchers can still teach and enjoy an academic schedule. I think it was Eric Weinstein who said something to the effect that if you were a johnny-come-lately to the AI party, your best bet would be to just buy the entire Math Department at IAS! In practice, it's probably easier to purchase Greenland ;)
I don't quite understand the issue here. I thought the main reason for the many recent breakthroughs in AI was that hardware has become cheaper and more powerful. Anyone can train a neural network on the graphics card of their home PC now. There are powerful open source frameworks available that do a lot of the heavy lifting for you. You can do far more today than you could back when I was in AI.
Of course the Big Tech companies have far more resources to throw at it; that's why they're Big.
A far more serious issue than access to computational power is access to suitable data, and particularly the hold that Big Tech has on our data.
It seems like a structural problem. Deep learning generally performs better than anything else that is well known but it also has well known limitations and inefficiencies.
People should question all of the assumptions, from the idea of using NNs at all to the particular type of NN and all of the core parts of the belief system, because at the moment these aspects are fixed by faith more than anything else.
If you want efficiency of training, adaptability, online learning, generality, and true understanding, those assumptions might need to go. That would not mean you couldn't learn from DL systems, just that the core structures would not be fixed.
I see an asymmetry between academia and industry. Academia has the models, industry has the data. Compute is more balanced because it's usually commodity hardware.
If industry is outpacing academia in research, I think that means data is the more valuable quantity, not compute.
And the article's theme of concentration is more a problem with data. Is Facebook dominant because of its algorithms or because of its database? If other companies had Google's index and user telemetry could they not compete with a rival search algorithm?
Maybe this is a good time for university researchers to develop AI algorithms that are not so data and compute hungry. Here's a promising bit of that: https://www.csail.mit.edu/news/smarter-training-neural-netwo.... This is easier said than done, but necessity is the mother of all invention, they say.
> The thing that I just cannot stop wondering about is why, knowing the perks of industry, advisers still treated us like this.

It's cultural. Their PhD advisors treated them the same way. A PhD is effectively a hazing ritual required to break into academia.
The victims must deserve it! Anyone so subhuman as to tolerate these conditions and abuses is evidently in need of punishment for being such a shank.
> I'm currently in the process of making a career out of using very basic machine learning methods and applying them to physical science problems

Interesting! How do you find customers?
> The human brain doesn't use a billion dollars in compute power, figure out what it is doing.

Perhaps figuring out what it is doing itself costs billions of dollars?
There will still be the occasional genius discovered in the wild, but an institutionalized effort might make the discovery process quicker and more efficient.
And then there's the correlation between rich kids' test scores and their parents' income.
I want to agree with your sentiment, but the cynical side of me says that sometimes banality wins.
> Having a few hundred consumer GPUs or a few dozen "datacenter" GPUs should be within the reach of any University department

That was funny; however, it's not even close to reality. I have to work on a GTX 1080 (not Ti)...