I don't hear a lot about "data science" anymore. And judging from the shrinking number of job postings, I suspect it was a bit overhyped a few years ago. What do you think?
I think what companies really want is smart generalists with advanced math, programming, and modeling skills coupled with domain knowledge. That skill set will always carry high value in technical companies.
The reason it carries value is that the skills are difficult to acquire. I think the recent decline in interest reflects the rise of new data science candidates who are taking the path of least resistance to a career in data science. Rather than pursuing problem solving, people are pursuing "data science", which is a nebulous term in and of itself.
I am wary when people wax lyrical about all of the ways they love using machine learning on data. It makes me nervous because I worry that they have a hammer and can't wait to use it on anything vaguely nail-shaped.
>I think what companies really want is smart generalists with advanced math, programming, and modeling skills coupled with domain knowledge. That skill set will always carry high value in technical companies.
What companies want is to be "in" on the data science hype, while they have no clue what they are doing and the most advanced "data science" they need is simple graphs, boxplots, and linear regressions.
Yes. Unlike software development, data science is not completely business agnostic, and a fair amount of business understanding is required. For example, if you work with sales data, you need to be aware of seasonality, purchasing patterns, etc. to understand the trends you observe and to discern between a true outlier and what is explainable.
What makes me cringe the most is seeing flashy presentations with claims akin to 'Data Science will change your world'. For sure, it can, and has been proven to, automate decisions (think credit scores), assist in decision-making (think sales trends) and detect anomalies (think security systems). I find that so many data scientists I interview are hung up on the esoteric techniques they employ, often failing to even explain why a technique was useful or how it helped their businesses.
What has been transformational and path-breaking is the breaking of enterprise monopolies in this space (e.g. SAS/IBM SPSS): a variety of open-source frameworks have made the work easy and convenient, and have opened it up to software developers who want to build these skills. It is important, though, not to lose sight of the fact that data science sits at the sweet spot of expertise in domain, data and technology.
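To make the seasonality point concrete, here is a rough sketch in Python. The sales figures are made up for illustration (a December peak every year, plus one odd March), and the thresholds and function names are my own assumptions, not anything from a real pipeline. A naive check against the global mean flags every December as an outlier; comparing each month with the same month in other years flags only the value that is actually unexplained.

```python
from statistics import mean, median, stdev

# Hypothetical monthly unit sales (Jan..Dec) for three years.
# Every December has a holiday peak; March 2018 is a genuine anomaly.
sales = {
    2016: [100, 98, 105, 110, 108, 112, 115, 111, 107, 109, 120, 180],
    2017: [103, 101, 107, 113, 110, 116, 118, 114, 109, 112, 124, 186],
    2018: [106, 104, 190, 115, 113, 118, 121, 117, 112, 115, 127, 191],
}


def naive_outliers(sales, threshold=2.0):
    """Flag (year, month) pairs far from the global mean, ignoring seasonality."""
    values = [v for months in sales.values() for v in months]
    mu, sigma = mean(values), stdev(values)
    return [
        (year, m + 1)
        for year, months in sales.items()
        for m, v in enumerate(months)
        if abs(v - mu) / sigma > threshold
    ]


def seasonal_outliers(sales, threshold=0.25):
    """Flag (year, month) pairs that deviate from that month's median across years."""
    flagged = []
    for year, months in sales.items():
        for m, v in enumerate(months):
            # The median is used so a single anomalous year does not shift the baseline.
            baseline = median(s[m] for s in sales.values())
            if abs(v - baseline) / baseline > threshold:
                flagged.append((year, m + 1))
    return flagged


print("naive:   ", naive_outliers(sales))
print("seasonal:", seasonal_outliers(sales))
```

The median baseline is only one simple choice; with real data you would probably reach for a proper seasonal decomposition, but the domain knowledge (knowing that December is expected to be high) is what does the work either way.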
There's a lot of hype in software. Just like you were hearing about Blockchain startups only when Bitcoin started trading at 5 bazillion USD and now you don't hear nearly as much about it. During 2008-2015ish we went through an insane churn of new JavaScript frameworks and NoSQL database hype.
Yeah it seems to have calmed, but I don't think data science was just hype because it comes from (and somewhat is) probability and statistics, and the rate of data/information that's being produced by and extracted from people seems to be ever increasing. But it absolutely was prone to a hype cycle as with almost anything else in tech. IMO this is a phenomenon exacerbated by venture capital.
Most of what we call “data science” is repackaged “data mining” — a skill that goes easily back to the mid-90s. Sure, open source tooling makes it all a lot more accessible today; but IBM / Oracle / etc. have offered similar packages (at MUCH higher price points, of course) for decades.
I think once the hype calmed down, people started to realize that it was largely the same old shit in a much cheaper package — evolutionary rather than revolutionary. Ultimately I think the hype cycle was driven by Moore’s Law more than anything; the fact you could run this type of analysis in a manageable amount of time without needing a huge IBM mainframe was the real innovation.
That hype in software is the reason I wrote a media literacy guide for software engineers. Hype makes it easy to get free marketing and also drives clicks for media and social networks.
I'm a "data scientist" and couldn't agree more. There are a lot of companies that benefit from hyping emerging tech and careers to the point of saturation; these include bootcamps, consulting firms and service providers. The market has calmed, but I wouldn't say it was just hype (obviously I'm biased!).
> IMO this is a phenomenon exacerbated by venture capital.
Not just VCs. It's a whole mafia gang consisting of tech reporters and founders also. They all have their vested interests - reporters want new stories and founders want funding and growth.
Slack, VR, AR - they all went through this cycle. Sometimes, it's a bit annoying.
It wasn't 'just' hype, but it was over-hyped. There are companies that have their act together from a data standpoint and can make use of data scientists, whatever that term actually means in the context of their organization, but most can't. So the companies who spun up a data science initiative but had no business doing so are now likely saying things like 'what do you mean we don't have the necessary data?' and 'what do you mean our data is a mess?' etc. and will quietly back off over time. Likewise, the companies who can take advantage of it will quietly do so. No different than the hype surrounding every other buzzword in tech... there is no silver bullet.
Usually the companies able to benefit from data science are also the ones best positioned to benefit from digitalization. They have their processes under control. I worry that all the others will just be relegated to ... wherever.
The hype was that you could take a huge pile of data and turn it into hugely valuable insights.
The realization is that any random pile of data likely doesn't have anything in it that is worth paying for:
Here's our analysis!
We already do/knew that.
Some people have really good, valuable data sets. Most people don't.
It’s an epistemology / ontology question, as folks familiar with the humanities would spot in little time. In other words, it’s not “data” until something gives the created metric a meaning.
My guess is that data science roles will merge with business analyst roles. Python and R will slowly join Excel as tools of choice for making tables and charts to stick in PowerPoint slides and PDF reports. Meanwhile the machine learning side of things will be the domain of _something_ engineers, with candidates more likely to come from the computer science/math/engineering world rather than the sciences. (Other than those with degrees in physics, who seem able to perpetually land wherever they feel like.)
> My guess is that data science roles will merge with business analyst roles.
Data Scientist is a buzzword for Statistician. Business Analyst is a buzzword for Industrial Engineer. For example, 10 years ago at my university you would have seen some Statistics students doing a second major, mostly in Industrial Engineering, and vice versa. The two fields have been related for many years, but the average Joe has no idea.
We're past the early peak of the hype cycle, but now that the marketers have calmed down, the real-world applications are maturing.
If you think of Data Science as AI sure, but if you frame it as applied statistics + good software engineering practices + cloud scale I think things are in a good place.
This. Any good data science person needs to understand how distributed systems work. They need to be decent at applied statistics. I think an average engineering grad can easily work with the simple statistics they learn in linear algebra and intro to ML. The good software engineering part is what has been brought on in the last few years.
I once saw some code written by a "data scientist". The overall code was non-complex, but the Java/Scala code was the worst of my nightmares. Additionally, I think other engineers have also matured enough to understand that underneath the veneer of data science, the fundamentals do not change much.
"AI" (or whatever rebranding it gets) always works in cycles, with a phase of excitement and overpromising followed by a phase of apparent underdelivering and skepticism. But what actually happens is that the innovations just become part of the normal tooling, and stop being called "AI".
At some point there is no need to hire a "data scientist", as any Python programmer is already expected to know how to use numpy, pandas, sklearn and keras, just as they were previously expected to do any kind of data manipulation with SQL without requiring a dedicated database expert.
The value of a "Data Scientist" isn't that they know how to use a tool - it's that they know why to use _that_ tool (technique) and not this other one.
It wasn't hype; however, people got very confused about what they actually needed vs what they thought they wanted.
When someone says they want a "Data Scientist" what they really mean is "I want a Data Scientist who is also a Data Engineer".
I have seen so many companies spend a really decent chunk of money on a data scientist and then be shocked to find that this data scientist doesn't know how to deploy models, set up Spark clusters, or work out how many and what type of GPU they need to get the job done.
After all - that is not their purpose.
We were in a similar situation, but what we needed was a Data Engineer - we had a rough idea of where we wanted to go and what we wanted to achieve, he was doing a Masters in Data Science so he had that background as context.
We will look at adding a Data Scientist to our ranks in the future - but they will be working side by side with a Data Engineer who can action their requirements!
The "data scientist" where I recently contracted struggles with generating basic reports. I saw a full-page SQL query with the caption "yeah, data science is hard" posted to Slack. Terrifying.
I think the term "data science" is often misused. It seems to make management feel like they are on the cutting edge. They were talking about AI and an R&D department the other day. They aren't even making use of simple heuristics yet! I guess that talk helps with fundraising though.
Over the past several months I keep seeing people trying to equate data science with machine learning, and it made me wonder if the people doing this are trying to salvage (or perhaps enhance) the investment they made in data science by trying to blur the lines between the two.
Isn't the line between the two indeed blurry? Maybe deep learning is machine learning, but modern statistical methods such as elastic net, SVM, and random forests are things data scientists should know about.
At my previous company data science has become synonymous with data analysis, to the point that the number of data scientists on staff is starting to outnumber data analysts. I think it's more a sign of a maturing field than anything else.
The more narrow view of data science as big data, models, and machine learning is probably less a thing now, but data analysis overall is only getting bigger.
No it isn’t a fad. Data collection by every man and his dog is really happening. The need for people that can use the data in order to improve business outcomes is the consequence of it.
Data collection will become more prominent IMO because:
1. Data driven business preference, competitive advantage and FOMO. Already dominates sales and marketing. Starting to dominate in product and dev. Already dominates production.
2. IoT, and more data marketplaces resulting from it.
3. Extensions of the global SaaS value chains (usually connected by data).
I would like to hear more about my European fellows w.r.t. how GDPR affected their ability to muster domain knowledge.
I used to work for a small start-up and the CTO was very strict on data access, making my life as feature developer and "data scientist wanna be" almost impossible.
He, on the other hand, not only had access to all the data but also used the product as a consumer (which didn't make sense for ICs, so we ended up just playing with sales demo accounts). I ended up leaving the company because of that.
In my professional experience this is a little misguided. A pure PhD statistician isn't going to be able to hack it working on fast-paced production software environments and building end-to-end pipeline/software/ML systems. I mean, no doubt a PhD statistician could learn and be good at it, but the average statistician isn't geared up for this type of work.
On the other hand, your standard tech data scientist may find themselves out of their element if they need to design a very rigorous randomized trial for testing a new drug and make careful inferences (I mean, I'm sure plenty could, but I'm not going to trust a 25-year-old with two years' work experience to do that).
With the cloud came a lot of data, so 'big data' was the wave in which engineers had to deal with it, and 'data science' was the wave in which we tried to leverage it.
The reality is that most insights from 'big data' are optimizations. They're not going to move the needle on the business as perhaps we would have hoped.
Data Science focused on ad targeting - now that might move the needle.
And of course, maybe some Data Science working alongside AI engineers makes a breakthrough which could move the needle.
But from a high-level, CEO's view, all of these things have trendy undercurrents; the trick is to figure out how much of it really matters to the business.
The 'wins' for consumers will be slow: maybe better product search, better ads. Maybe they figure out how to send flights around ugly weather or how to slot landing times for an x% decrease in flight delays. Or slot road fixing/lights for an x% decrease in traffic delays.
I work on a data science team that's growing pretty aggressively. I also frequently hear from recruiters hiring at companies big and small for analyst and scientist positions.
I'm not sure what it would mean for data science to be "just hype." I see DS work on this website alone all the time.
It is "different now". The DS bubble popping was covered well by Vicki Boykis:
> Since academia is typically a lagging indicator in adoption to new trends in the work place, it’s been long enough that it’s truly worrying for junior data scientists, all of who are hoping to find data science positions. It can be very hard for someone with a new degree in data science to find a data science position, given how many new people they’re competing with in the market.
It wasn't 'just' hype, but the real problem was that very few companies had any data that you could actually extract much value from. Garbage in, garbage out is as true as ever.
That being said, I've observed that the data science techniques and tools developed over the past few years have been absorbed and adopted by a lot of people who aren't "data scientists". So while companies are hiring far fewer "data scientists", a lot of "data science" is now done by domain experts and analysts as part of their work.
The best quote I've heard for ML is "ML promises to deliver what Computer Science promised to deliver in the decades before as AGI will promise to deliver in the future"
[+] [-] tictacttoe|6 years ago|reply
[+] [-] Ntrails|6 years ago|reply
[+] [-] Rainymood|6 years ago|reply
[+] [-] roystonvassey|6 years ago|reply
[+] [-] usgroup|6 years ago|reply
Can you quote a source for this?
[+] [-] lame88|6 years ago|reply
[+] [-] wayoutthere|6 years ago|reply
[+] [-] nemild|6 years ago|reply
https://github.com/nemild/hack-the-media/blob/master/softwar...
[+] [-] sakrata|6 years ago|reply
[+] [-] mandeepj|6 years ago|reply
[+] [-] amelius|6 years ago|reply
[+] [-] blihp|6 years ago|reply
[+] [-] kevinventullo|6 years ago|reply
[+] [-] hef19898|6 years ago|reply
[+] [-] OldHand2018|6 years ago|reply
[+] [-] foolrush|6 years ago|reply
The map is not the territory.
https://www.amazon.com/Raw-Data-Oxymoron-Infrastructures/dp/...
[+] [-] bradleyjg|6 years ago|reply
[+] [-] darkhorn|6 years ago|reply
[+] [-] joelschw|6 years ago|reply
[+] [-] sha_r_roh|6 years ago|reply
[+] [-] ddragon|6 years ago|reply
[+] [-] PLenz|6 years ago|reply
[+] [-] xs83|6 years ago|reply
[+] [-] ficklepickle|6 years ago|reply
[+] [-] flensortow|6 years ago|reply
[+] [-] Bostonian|6 years ago|reply
[+] [-] cedricd|6 years ago|reply
[+] [-] usgroup|6 years ago|reply
Data collection will become more prominent IMO because:
1. Data driven business preference, competitive advantage and FOMO. Already dominates sales and marketing. Starting to dominate in product and dev. Already dominates production.
2. IoT, and more data marketplaces resulting from it.
3. Extensions of the global SaaS value chains (usually connected by data).
Hence data science will thrive in the future.
[+] [-] redstone08|6 years ago|reply
Enterprises actually seem to be hiring fewer data scientists, but they are trying to raise their employees' data skills.
I think that's what is driving the growth of self-analytics tools. Some examples:
1. Metatron Discovery: https://metatron.app
2. Metabase: https://metabase.com/
[+] [-] sweeeety|6 years ago|reply
[+] [-] darkhorn|6 years ago|reply
[+] [-] natalyarostova|6 years ago|reply
[+] [-] sonnyblarney|6 years ago|reply
[+] [-] daturkel|6 years ago|reply
[+] [-] starpilot|6 years ago|reply
http://veekaybee.github.io/2019/02/13/data-science-is-differ...
[+] [-] dagw|6 years ago|reply
[+] [-] thedevindevops|6 years ago|reply
[+] [-] RasputinsBro|6 years ago|reply