(no title)
a_zaydak | 5 years ago
When I first started hiring and working with data scientist my view was this: If you can only manipulate data and run it through pipelines to generate models then you can't do enough to be highly valuable. You either need to have a strong enough background in CS to build the pipelines / tools or a strong enough mathematics background to be able to propose cutting edge new ideas. From my experience it is hard to find someone who has one of these skill just from a University "data science" program. At a small company (at least ones that I have worked with) being only proficient in R and basic Python isn't enough. That being said, I have met and handful of Data Scientist who were very smart and self motivated enough to pick up on the lacking skills when given the chance.
My question to HN is this; are there rolls at these larger companies for a Data Scientist who who primarily just crunches data in R and Python without the ability to actually build the pipelines / tools or conduct research?
proverbialbunny|5 years ago
From my experience, there are two types of data scientists who work who do infrastructure work: 1) Those who do not make the best data scientist because their skill set is too far in engineering land, leaving them weak where it counts. If the startup is relying on the data scientist to be profitable, I'd be cautious with these types. or 2) Someone who is senior, beyond senior really, who has worked both jobs, and doesn't mind doing both jobs. This unicorn is so rare it is mythical. The joke when the terminology was created is they're so rare no one has ever seen one, hence unicorn.
Me, I can not do the work I need to do if I'm on call. That is where I draw the line. That means hiring someone to monitor the infrastructure. Furthermore, I'm an okay architect, but you really do want to hire a specialist if you can help it for that. Do I help them with the infrastructure? Absolutely, but they're on call if a server is on fire. They have the admin login credentials, not me.
I get wearing multiple hats, but keep in mind to be a data scientist you're already wearing multiple hats. Being a data scientist is like double majoring and getting a phd. At what point are they stretched too thin? The consensus in the industry is they're already stretched too thin and should be broken up into different specialized roles.
>My question to HN is this; are there rolls at these larger companies for a Data Scientist who who primarily just crunches data in R and Python without the ability to actually build the pipelines / tools or conduct research?
That is the standard role, even at startups. However, the industry consensus these days is data scientists should have more responsibility when it comes to deploying models than previous standards.[1] So data scientists are being pushed in a more engineering direction, not with hosting sql servers and infrastructure, but with working with engineers to make sure the models are monitored properly. This change comes from model deployment being further automated as time goes on, making it easier for the data scientist to have more responsibility during this stage.
[1] source: https://www.dominodatalab.com/static/gfx/uploads/domino-mana... page 9. Suboptimal organization and incentive structures.
a_zaydak|5 years ago
jorpal|5 years ago