Ask HN: Is it a waste of time to teach yourself data science without a degree?
182 points| thewarrior | 9 years ago | reply
This is opposed to regular programming gigs where you can get work based on a portfolio.
Also there are efforts to commoditize common methods and algorithms by wrapping them up in APIs and SDKs.
So is it a waste of time to learn it on the side with the hopes of getting a data science job?
[+] [-] itamarst|9 years ago|reply
Companies are looking for what you as a candidate can do for them.
Self-study or taking a class signals some level of "I tried to learn this thing." So that's a start.
Even better is "I built X", where X is obviously based on skill you learned. In which case you can omit the class because you have proof of learning, not just trying to learn.
Even better is "I provided business value V to my employer by building X." Because now you're showing how this skill is useful to someone else. So using skill at work is another thing to try.
Ideal is you write the above, but emphasize V (or choose between multiple things you can list) in a way that suggests you can help the needs of the particular company you're applying to.
So there's having the skill (which is good), but there's also how you present it to show it will provide value (also important).
More on the contrast between having engineering skills and marketing yourself here: https://codewithoutrules.com/2017/01/19/specialist-vs-genera...
[+] [-] thro1111111|9 years ago|reply
[+] [-] rdudek|9 years ago|reply
[+] [-] usgroup|9 years ago|reply
[+] [-] imh|9 years ago|reply
If you need structure to go through a few years of coursework on your own, you should go for the degree. If you just want to learn how to put pieces together and not learn how/why they work under the hood, you should opt for something else.
As with most questions about going nontraditional routes, you have to be really good to compensate, and getting really good is constant exhausting work.
[+] [-] endymi0n|9 years ago|reply
I'd say if you don't want to work for a large, respected company first, it's a waste of time. Your degree is your entry ticket to your first job, not more. Later on, you can even work at Google if you want - just make a great product and get acquihired.
Three tips on what you should do instead:
1) BUILD something and show off your skills. Like, continuously. Always have your own challenges, do something about them, put your code online on Github. Host it so it can be seen and played with. Work towards a goal and learn what you need to learn on the side.
2) Focus on applying to companies not listing a degree in their job ad. You'll see there are quite a lot of them.
3) Don't focus on your lack of a degree in any interviews. Don't deny it, but don't make it seem like a big deal. Oftentimes, people won't even ask.
[+] [-] cr0sh|9 years ago|reply
I wouldn't necessarily say that.
I don't have a degree (well, an Associates that ain't worth much). I've been employed as a software developer for 25+ years now (since I was 18 years old).
I wish I had pursued a degree, though.
Back then, when it came to my education, I was pretty lazy - at least when it came to more structured learning. I liked to pursue stuff on my own, though, at my own pace. I've done well in that manner.
In 2011, I "discovered" the idea of a MOOC: I took Andrew Ng's "ML Class" and successfully completed it. That led me to Udacity's CS373 course in 2012, which I also completed successfully.
That isn't to say I didn't struggle with both of those: I had no experience with probabilities and stats, and I hadn't touched linear algebra since high school. But with the help of resources on the internet and elsewhere (along with help from others on the internet, and fellow classmates), I managed to complete both successfully, and I learned a lot in the process.
Last year, I started Udacity's Self-Driving Car Engineer Nanodegree. Today, I'm working on term 2. We're dealing with localization - basically learning SLAM, which was covered in the CS373 course, too. Prior to that, we learned about how Kalman filters (standard, EKF, and UKF) all worked to integrate sensor data. In the first term, as part of one of the projects I implemented NVidia's End-to-End CNN to drive a virtual car around a track.
All of these experiences, and others outside of all this, have taught me that perhaps I cut myself short by not pursuing a degree when I was younger. My current plan is once I finish this Udacity course, I'm going to get my BS online, then work toward an MS in comp sci. It isn't a matter of "I think I can do it" - I know I can do it. It's more a matter of absolutely proving it, and likely learning a lot more along the way.
I don't think a degree is a waste of time, unless you intend never to learn more stuff as you "grow older". If your only goal is to "make money" and all that, maybe it is. To me, though, had I gotten my degree back then, I believe I would be much, much further along today. I can't change that, though - so all I can do is move forward.
[+] [-] quadrature|9 years ago|reply
Also, to add to that, most of the work in ML is feature engineering, data cleaning, testing and building pipelines, all of which require a good software engineering background.
[+] [-] carlmcqueen|9 years ago|reply
I do a lot of the grunt work of getting the data sourced, cleaned and ready and am called the 'data wizard' and other such annoying names.
What's frustrating is I can run the last lines of code and read and understand the output of the last step, but as the original question asked, management would prefer someone with a phd or masters in customer analytics to be the expert of the data output.
[+] [-] jcadam|9 years ago|reply
I'm not trying to learn about ML for purposes of employment. It's somewhat relevant to my current job, and I may have some interest in using it on my personal projects. But mostly I'm just learning for the 'fun' of it :)
I don't have the time, money, or inclination to pursue a MS in data science atm (My current 'formal' education consists of a BS in Comp Sci and an MBA), but I may go back to school when the kids are grown, more for personal edification than anything else, however. A big shift in career, from software engineer to 'data scientist (or whatever they call it)' is probably not possible at my advanced age (37).
[+] [-] Helmet|9 years ago|reply
[+] [-] jorgemf|9 years ago|reply
But, honestly, I think it is very difficult to learn data science by yourself. Someone with experience teaching you will make a huge difference. Data science is different from programming: in programming you can see step by step what is happening, whereas in data science most of the time it either works or it doesn't, and you only find out after your algorithm has run through all the data for at least an hour. It is really hard to learn this way; you need hints that only someone with experience can provide. Moreover, you can make a lot of mistakes without knowing it. For example, when cleaning the dataset, people use the whole dataset to fill gaps and then split it into training and test sets. It feels right, but it is a huge mistake that invalidates the whole experiment (because you use information from the test set in the training set to fill the gaps).
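To make the gap-filling mistake concrete, here is a minimal sketch in plain Python. The toy dataset, the choice of mean imputation, and the helper names (split_train_test, fit_mean, fill_gaps) are all assumptions made for illustration; the comment above does not say which fill method or tooling is involved. The only point is the order of operations: compute the fill value from the training rows, then reuse it on the test rows.

```python
import random
from statistics import mean

def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split them into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_mean(rows):
    """Compute the fill value from the observed (non-missing) entries only."""
    return mean(v for v in rows if v is not None)

def fill_gaps(rows, fill_value):
    """Replace missing entries (None) with the given fill value."""
    return [fill_value if v is None else v for v in rows]

# Hypothetical single-feature dataset; None marks a missing value.
data = [1.0, 2.0, None, 4.0, 100.0, None, 3.0, 5.0]

# Leaky order: the fill value is computed from ALL rows, so rows that later
# land in the test set have already influenced what the training set contains.
leaky_fill = fit_mean(data)
leaky_train, leaky_test = split_train_test(fill_gaps(data, leaky_fill))

# Safer order: split first, then compute the fill value from training rows only.
train, test = split_train_test(data)
train_fill = fit_mean(train)
train_filled = fill_gaps(train, train_fill)
test_filled = fill_gaps(test, train_fill)  # reuse the training statistic
```

The same rule applies to any statistic learned from data (scaling parameters, category encodings, and so on): fit it on the training split, then apply it unchanged to the test split.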
[+] [-] NumberCruncher|9 years ago|reply
If you are like AWS and say that using logistic regression is machine learning, then yes, you can teach yourself data science. Learn SQL, read a couple of books on logistic regression, use some open data for building a couple of models. There are many companies where you can have a decent job and an easy living with SQL and logistic regression on your tool belt.
If you say that data science starts with automating stock trading or building the intelligence of self-driving cars, then no, you cannot teach yourself data science. You will need at least one degree. Or more.
[+] [-] alkonaut|9 years ago|reply
I think the widely accepted definition is "Statistics, but on a Mac"
[+] [-] jtcond13|9 years ago|reply
[+] [-] monster_group|9 years ago|reply
Isn't it a little presumptuous to tell others they can't teach themselves something? Or do you mean to imply that they can't get a job without a degree? Those are different things.
[+] [-] StavrosK|9 years ago|reply
Why not? What is it that prevents anyone from learning anything without getting a degree? I disagree with your statement, I think it might be harder, but I don't think anyone "cannot teach themselves X".
[+] [-] CCing|9 years ago|reply
(and it's one of the best self driving software out there)
Of course you need to study (a lot), but a degree is not required.
[+] [-] dagw|9 years ago|reply
Also, many jobs that aren't data science jobs per se offer many opportunities to do data science type things. Get a job at a company that works with a type of data you find interesting, and that perhaps doesn't have a dedicated in-house data scientist, and every time an interesting data-related challenge shows up just go "I have a good idea on how we can approach this" (assuming you actually do). Next thing you know people will be coming to you with their data science problems, and before you know it you have several years of data science experience on your CV.
[+] [-] ChemicalWarfare|9 years ago|reply
yes, most likely they won't hire you for a "Data Scientist" position, but there are related jobs out there you can be qualified for if you have programming skills and understand DS stuff to some degree.
I've seen setups where a PhD with a "scientist" in his title would act as an architect/co-team lead with a senior engineer running a team of developers.
Someone has to implement DS' ideas after all and unless we're talking a really small team (or a jack of all trades DS) where DS has to write all the code himself - there is a need for developers with "some DS background" in those situations.
[+] [-] jey|9 years ago|reply
I don't have a degree but work as a data scientist at a research institution. I'm self-taught and was originally hired as a software engineer on the basis of my projects and work experience.
It's true that you have to convincingly make the case for your competence, but a bachelor's degree is really at best a certificate of minimal competency in a subject. Its signalling[1] value quickly gets swamped out by actual work experience where you're continually learning and improving. So there's a great hack: just do actual good work and put it on your resume. Your portfolio of work should convey your competence so well that having a degree wouldn't really add anything. (So you can skip the degree, but you'll still have to put in the work.)
Remember that any healthy organization wants to hire for competence at job duties. If some company rejects you for not having a degree because the hiring manager has to cover their ass to upper management instead of optimizing for getting work done, you should really just be glad that you dodged a bullet.
I think what's most important is to keep growing and learning. Pg had it right: "If you're worried that your current job is rotting your brain, it probably is"[2].
1. https://en.wikipedia.org/wiki/Signalling_(economics)
2. http://www.paulgraham.com/gh.html
[+] [-] jtcond13|9 years ago|reply
I've worked around/in data science teams at a large BigCo and I think that you're far overestimating the bar here. There aren't enough people who can write data pipeline code (SQL/Shell/etc.), much less implement and intelligently explain statistical/ML models. Also, the average decision maker here does not understand the difference between 'created model in Pandas' and 'created model with Amazon's ML API'.
The modal background of data scientists in industry is closer to 'Econ BA + knows Python' than 'Artificial Intelligence PhD'. Moreover, the former will still enjoy a remunerative career if (s)he's sufficiently savvy about identifying problems and showing off how they can be solved with technology.
There may be a point in time when companies can't get a return by throwing math-savvy programmers at a problem, but that will be long after you and I have passed from the scene.
[+] [-] nilkn|9 years ago|reply
(1) You could focus on building data processing platforms using, e.g., Spark. This will get you very close to the data science folks and you could probably end up doing some interdisciplinary work if you wanted it and demonstrated enough interest and competence. At the very least, people who can build highly scalable data processing systems and who also have a reasonable understanding of how the data is being used are very valuable.
(2) There are lots of companies out there that don't engage in data science/machine learning at all. You could join such a company and represent the push towards developing a data science or ML division or team. If you're successful this could also get you major credit as a manager as well as putting you very close to real-world data scientists and ML projects.
[+] [-] EternalData|9 years ago|reply
With that said, a lot of companies hiring for data science roles fall into the category of software startups -- larger companies like Google or Facebook are looking for specialists who tend to hold degrees. But at smaller companies, you can be more of a generalist and there, the old mantra of "show me what you've built" often applies. You could build out a data science career if you found just the right company.
By no means is it easy, but I wouldn't say it's a waste of your time (unless you have some incredible opportunity cost you're using up).
If you were to go about doing it, I found this blog post that can help you with your plan of attack: https://www.springboard.com/blog/learn-data-science-without-...
[+] [-] zengid|9 years ago|reply
It never hurts to learn new things. Another HN poster suggested this channel for beefing up on linear algebra, and I absolutely love it [2].
[1] https://youtu.be/qWJpI2adCcs?t=58m
[2] https://www.youtube.com/playlist?list=PLlXfTHzgMRUKXD88IdzS1...
[+] [-] randcraw|9 years ago|reply
The skills required for each DS role vary a lot. I wouldn't expect a cloud expert to have learned about the Hadoop stack or HPC workflows in school, at least not to a useful degree. The same goes for DBA or business analyst or data wrangler.
But statistics and ML lie at the other end of the spectrum. These roles require a hierarchy of formal skills that are rarely mastered outside of college. They're expected to keep up with the research literature or formal techniques, which almost always requires the math skills of an engineer or mathematician.
Remember, HR everywhere is technically clueless. If management doesn't tell them the precise set of skills needed for the job, they'll minimize risk and ask for more expertise and experience than is needed -- usually in the form of excess degrees or prestige or buzzwords. The best cure for this is to bypass HR and go straight to a technical manager who knows what s/he wants. That's hardest at large corporations, who tend to outsource their HR needs to the lowest bidder.
At a smaller company, a lack of degree will matter less. If you can convince them you know what they need RIGHT NOW and can learn future material quickly, that's what they want to hear. (That's probably what the bosses of the startup did).
Or if you're targeting a specific project, then if you can show (e.g. via Kaggle or an online portfolio) that you clearly have the needed skills and you're not just a script kiddie, that speaks a lot louder than a mere degree (especially if it's over a decade old).
[+] [-] daliwali|9 years ago|reply
Don't listen to them. Every professional will at some point in their career be judged by those less capable.
[+] [-] eljefe6a|9 years ago|reply
If you're coming from a programming background, I'd suggest becoming a Data Engineer with the goal of becoming a Data Scientist. I've had several students do that. They were general programmers who learned Big Data/data engineering and eventually became more technical Data Scientists. You can start to learn more about the whys here: http://www.jesse-anderson.com/2017/03/what-happens-when-you-....
[+] [-] traviswingo|9 years ago|reply
Don't get so caught up in the "degree."
I've met individuals with graduate degrees in computer science (I know OP asked about data science, but the overall point here applies to any field) who didn't hold a candle to self-taught developers. If you're actually passionate about and interested in something, you will become extremely well-versed in it. On the other hand, if you're not excited about data science, a degree will probably benefit you more than going without one, since it will force you to learn the topic.
In a nutshell, it's up to you to make yourself valuable and present that value to the world - a degree is just a shortcut for recruiters to filter on, but you can skip recruiters and talk to anyone in any company.
[+] [-] intellectronica|9 years ago|reply
[+] [-] wellwell|9 years ago|reply
http://cdn.oreillystatic.com/oreilly/radarreport/06369200290...
So no, not futile.
[+] [-] dpflan|9 years ago|reply
Their programs express job placement as a perk of graduation.
https://www.udacity.com/nanodegree
Educating for the "jobs of the future" is one of Udacity's goals, data scientist being one of those jobs.
[+] [-] Raf_|9 years ago|reply
Note that only selected nanodegrees come with the job placement guarantee, and that the guarantee seems to essentially mean a refund, if you fail to find a job within 6 months. https://www.udacity.com/nanodegree/plus
As a sidenote - the deepest (meta) learning I've gotten is that paying for the course made me much more engaged and determined to invest time in understanding the material and completing assignments.
[+] [-] Pandabob|9 years ago|reply
The first three months are basically a walkthrough of Peter Norvig's AI book, and the second part will be about deep learning.