Whenever I see these posts I immediately translate them in my head to "we're in the middle of a talent shortage at a price I am willing to pay."
I've worked with very large amounts of data and high performance computing for most of my career; I mostly had finance-related jobs in the last decade or so. I have most of the skills you want, including some you don't know you want. However, when salary comes up, that is where we start to part ways. If you are really serious about a shortage, you should be really serious about making offers that can be competitive, but I keep seeing the same $150k offers. That isn't a "shortage" kind of offer.
Are they looking for someone who must have every box ticked or are they looking for someone with enough qualifications yet needing work so much they are willing to undercut themselves? Are they justifying their salary offer because you tick 90% of the boxes and not 100%?
I've been looking for work in data engineering and databases for 9 months, and while I'm certainly not as qualified and experienced as you are, I consider myself capable. I've definitely passed the take home and whiteboard tests I've been given, etc.
When I read about a "shortage," I wonder if this is more indicative of unicorn searching than anything else.
I think it's definitely true. Functionally, I'm a director of data engineering (with a big company, so my real title is way more generic). Usually in the initial screen, we'll talk general dollars, and my number is always out of range. For my level and the fact that I'm reasonably happy where I live now, the number is 200k + relocation (more for Bay Area, but let's not go there), and I don't think that's unreasonable for a director level who is presumably going to also develop your more junior DEs.
I don't fancy it up too much, either. I build teams that make the data move and land it clean so that your PhDs can do the smaaaht stuff with it. I can stack BI and Analytics on top, but a lot of people can do that starting from clean data - and clean data is what I do. But I do get the impression that we're viewed as janitors and plumbers - who you'd be thrilled to see at 3am when your shit(ter) broke, right?
Although your statement is technically true, it is basically meaningless.
Yes, you can always always always find somebody to do a job if you are willing to pay 10 million dollars. That means that "shortages" are impossible. It means that you can never have a shortage in any situation, because you can always pay 10 million dollars for a single visit to the doctor.
But this line of logic isn't very useful when talking about "shortages".
If you had to pay a million dollars for a loaf of bread, is there a shortage of bread? IE, billions of people will starve to death by next week, because they can't afford to buy food.
Most people would say "Yes, there is a shortage of bread".
When people talk about shortages, they are obviously talking about a shortage at a certain price point. There is no other definition of the word shortage that makes sense.
A good definition that I use for the term shortage is "If the government could snap its fingers and instantly produce large amounts of X overnight, would the world be a better place"?
If the answer is "Yes, the world would be in a much much better place", then that means there is a shortage of X. If the answer is "No, the world would only be a little better", then that means that there is NOT a shortage of X.
Of course this is more or less always true - there are only shortages or excesses of things when prices don't or can't adjust freely.
If there was 1 gallon of water left on earth, Bill Gates would buy that gallon for $50 billion, and everyone else would die of dehydration.
There has always been a shortage of maids willing to do all my house work for $10.
And there is a shortage of data engineers at $x, but there wouldn't be a shortage at $1M/year (because fewer companies would want one, and more people would be willing to do the work).
Maybe you should start a Data Science & Engineering consultancy. The same people who would offer $150K to an employee often have bosses who would love to spend $500K for a person-year of (contract) work if it comes with a high probability of success.
This argument comes up all the time on HN, but I don't think it means anything. It seems to me that the ability to fill an opening by offering more salary can't disprove a talent shortage, because it is always possible to do so.
Thought experiment: If 100 companies had openings for a skill set that only one person could deliver, all 100 companies could eventually fill their openings by sequentially outbidding each other for the services of that one person.
So how would we know if a talent shortage really exists for a certain job? I can think of a couple potential hints: if starting salaries are going up much faster than the national average, or if the unemployment rate for that job is much lower than the national unemployment rate. Either would seem to indicate that, relative to the job market as a whole, there was a greater demand than supply for that particular job.
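Those two hints are easy to turn into a quick check. Below is a minimal sketch, not an established economic test: the class name, the function name, and both margin thresholds are hypothetical, and the numbers you feed in would have to come from real labor statistics.

```python
from dataclasses import dataclass


@dataclass
class JobMarketSnapshot:
    """Yearly rates as fractions, e.g. 0.05 means 5%. All fields are hypothetical inputs."""
    salary_growth: float               # starting-salary growth for the job
    national_salary_growth: float      # starting-salary growth nationally
    unemployment_rate: float           # unemployment rate for the job
    national_unemployment_rate: float  # national unemployment rate


def shortage_hints(snapshot, growth_margin=2.0, unemployment_margin=0.5):
    """Return the hints from the comment above that apply to this snapshot.

    growth_margin and unemployment_margin are arbitrary assumptions for what
    "much faster" and "much lower" mean; pick your own.
    """
    hints = []
    if snapshot.salary_growth > growth_margin * snapshot.national_salary_growth:
        hints.append("starting salaries rising much faster than the national average")
    if snapshot.unemployment_rate < unemployment_margin * snapshot.national_unemployment_rate:
        hints.append("unemployment much lower than the national rate")
    return hints
```

An empty list means neither hint fires, which still doesn't prove there is no shortage; it only means these two signals are silent.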
Yeah, this is a selling problem. It feels like you're far more likely to gain traction starting a data team than taking an IC-track DE role. It's easier for companies to justify $200k+ for your skillset in that case, even if it takes you away from pure engineering.
Alternatively, you can just join a large tech org. Netflix etc. have no problem paying good DEs north of $200k in total comp.
This has been my experience with any "senior" engineering / BI / DS role. There is a particularly high level of price sensitivity to anything above 200k.
In particular, employers whining about lack of X need to ponder raising wages to where employees can afford homes in a city where prices are now within spitting distance of $1k/ft2. When your basic pitch is, "We desperately need [data engineers | machine learning engineers | computer vision engineers | what have you] so desperate to live in CA they'll accept never being able to afford a home unless our lottery tickets pay out", it should be unsurprising they have a hard time finding the talent they claim to need. Or, they could accept remote workers! Even remote workers near sfbay, who just don't want to burn 2.5 hours/day commuting in and out of sf...
We all like the $400K that investment bankers make. But the finance industry has developed a business where they can pay their workers $400K and still make a huge profit for their investors. Except for the Googles and Facebooks, the average tech startup is not making finance-industry-level profits.
Also, finance requires proper education and training. Not so much for app development. So for everyone who complains about getting $150K offers, there are a hundred thousand people right here in the US applying for $60K technical analyst jobs.
We should rename this job position to Data Sanity Engineers.
I have been thrown these projects at work before, where I'm the frontend engineer and I need to make some cool D3 visualization, but lo and behold the data is shit, and I have to help the backend team make the data usable. It's a mind-numbing job that nobody wants, because it sounds like a one month task to get a good REST API up and working, but it usually takes three months, because you have to go back and forth making sure the data is right, and there are always 10 tricky edge cases that you have to work some magic on. Not only that but you need to have smart people cleaning the data, so that you don't make some big mistake down the line or your REST API is super slow, and you have to add another couple weeks or month to rework the data again. So that one month becomes three months, and most likely a year, because somebody will say that looks great but can we also add this, and it goes on and on. It's literally a mind-numbing job that most nobody wants. I have found that products like Tableau are the best for this: you still have to clean the data, but it helps speed up the process.
As a contradiction to this point, some people (me) really enjoy working with data, from cleaning, munging, creating, sorting, pipelining, etc, and find front-end visualization production excessively boring and mind-numbing.
Give me emacs and a command line, and I have all the truth I need, which is far more honest, in my mind, than anything that can be created with D3 or Tableau. Beauty is in the eye of the beholder, and it doesn't really do anyone service to look down on the work others find enjoyable. If doing D3 makes you happy, that is awesome, and I can only congratulate you for your passion and your ability to look forward to work I don't "get," and I wish the feelings would be mutual.
Bizarrely, I remember a recent HN discussion where a poster was arguing that any software developer who is not working in machine learning is like a plumber.
I guess this means that the entire profession consists of janitors and plumbers.
Janitors? They are certainly more than janitors! More like plumbers... getting your data safely from point a to point b without plugging things up while passing through [process] boundaries. How much does a plumber cost? $140 / hr? Sounds about right.
Hey I'm the author of this blog post and the CEO of the company that did the benchmark report. That was a very poor choice of words on my part, and I appreciate you flagging it. I reworked the paragraph to remove the janitor comment and (hopefully) make it clearer.
Ignoring the breathless nature of the article, this is a buzzword label for a commodity skill set that pays a commodity salary in tech. It is also the commodity skill set that my employers have all paid me for.
There has been for a long time hype around new technology and labels for business intelligence, data warehousing, big data, and now data engineering/science. I'm not saying there are not some roles in this space that return huge value to organizations, but that these opportunities are much rarer than the buzz indicates.
I wonder if the perceived shortage is mainly hype as the shift to new cloud technologies makes many of the older ideas a little less useful - if you are plowing data into BigQuery, you probably aren't so worried about your star schema data model for reporting.
I would strongly advise people that look at these types of articles to look at the roles in question and ask "Is this role on the critical path to customers paying us?" My experience has been that the answer is often "No." This is bad. I have also seen situations where businesses that do rely on smart data integration can show that they are selling dollar bills for ten cents that still have trouble getting customers on board with spending that ten cents. Business is weird.
I'm trying to switch careers into "Data Engineering" now, as a full stack developer who is more interested in ML, and I've found almost no traction internally at my company or externally. It looks like I may just accept a full stack position at a good company that does a lot of data science for now, but thought I would ask - Where are all these jobs?
"Data Engineering" is most of the work that needs to be done, but I think companies haven't identified it as a category.
From my P.O.V., "Full Stack Engineer" is a place you don't want to be because it means putting out fires with whatever junk javascript is in the front end. It seems like everybody who's built a serious javascript application has invented their own Virtual DOM because none of the popular Virtual DOM libraries are good for much other than wasting time and CPU cycles.
"Data Scientist" is a bad title in its own way, in the sense that "Computer Science" is bad, but worse. To a lot of people there is a Brahmin kind of attitude associated with "Scientist" -- i.e. an aversion to getting your hands dirty. Real world data is pretty dirty and you aren't going to get far in getting value out of it unless you spend 80-90% of your time dealing with the dirt.
My official title is "Data Scientist" although I'm closer to the "ML Engineer" someone else mentions in a child comment.
Frankly speaking, if your company doesn't need a data engineer, it won't hire one or move you into that role. They likely don't, either, if you're experiencing this pushback -- data engineers often develop ETL pipelines or data warehouses, both of which are very useful if your company has a data team and very useless if it does not.
That said, you may want to move closer to my role. There's actually a shortage of data-savvy people who can also write production software, and you would nicely complement a more research-inclined data scientist or analyst -- someone with far more experience with research/analysis than development.
I see tons of them. If you're interested in ML, you're probably looking more towards data science. Data engineering (in general) is more about getting the data in a state where it can be used (extracted, cleaned, moved, transformed, etc.), at least from what I've commonly seen in the industry. A decent breakdown is here: https://blog.insightdatascience.com/data-science-vs-data-eng...
You might want to look at "Machine Learning Engineer" positions if you want to do ML in practice, it's starting to be a title I see somewhat often now.
As others have pointed out Data Engineering is more about building data pipelines, making architecture decisions for your ML stack, things like that. Less about model building, prototyping and training, which is what I think of when somebody says they 'do' ML.
Every time something comes up on HN about a talent shortage in a field related to software engineering, it hurts. I have been unsuccessfully looking for a full time position since my last start up (I was not a founder) folded six months ago. I have been on over 25 in person interviews and gone through untold degrading whiteboard interviews, code tests, trick questions, and take home projects; all have ended in rejection. This industry has a need to torture candidates because we are all considered to be liars by default. Much is said about combating impostor syndrome in ourselves but we are too eager to engender it in others.
It seems people in this industry refuse to understand that some people are not perfect. I never graduated college because I hated it with the very fiber of my being, so I am not particularly great at white boarding answers to algorithm questions off the top of my head in a high pressure environment. If I need them during my job, I look up answers and learn from people who are much smarter than I am.
My personal identity has been shattered, as I thought my ~5-10 year history of success in the industry indicated I was in demand and talented. I saw posts like this and thought that if the worst happened I'd still be able to find a job. The idea that there is a talent shortage is a lie, or candidates like me wouldn't be treated as I have been. I'm not asking for a free job, or a handout. I have had a successful career so far and am capable of doing good work. But I'm not a specialist in Big Data Machine Learning Neural Networks.
I have struggled with bipolar disorder and suicidal ideation most of my life. I've dealt with the death of my beloved grandmother and my father who was instrumental in my choosing to be an engineer with only minor lapses in control. Nothing has caused me to consider taking my own life as much as the past 6 months. It seems there is no future for me in the only career I have any skill in and which is a huge part of my identity. And to constantly be told that there is such a shortage of engineers only salts the wound.
" I have been on over 25 in person interviews and gone through untold degrading whiteboard interviews, code tests, trick questions, and take home projects; all have ended in rejection."
The fact that you pulled through 25 of them is already commendable. Unfortunately as a labor provider you'll be subjected to all kinds of crap for the privilege of working.
Every single person on here needs to have a secondary business going on right now. Doesn't have to be a highly skilled industry either, selling hand made stuff on Etsy can be a lifeline in these situations.
Hey, I'm going through something similar. I had to quit an amazing job because my wife and I pursued a dream and moved to Europe (no remote).
I had always had an easy time getting a job before but this time it was different. Granted I knew it'd be tougher since for remote jobs, the world is the competition. But it was a summer of endless shitty timed hackerrank-style tests (virtual whiteboard hazing). I would tell my co-workers about them and they'd laugh in bewilderment at the questions that were asked in what should be a technical screener, and these are extremely smart and productive software guys that have started companies, written books, give conference talks. One funny question I got for a frontend React job: write a function that takes a sequence of bits that represent a negative-binary number (not a base-2 number that is negative, but a base-(-2) number) and return its negated value in base-2. For a frontend job. It was one of 4 questions to be answered in 90 minutes. gtfo.
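For anyone curious what that screener question actually involves, here is a rough sketch. It rests on assumptions the comment doesn't settle: bits are taken least-significant first, and the negated value is returned in base -2 as well (if a plain integer were wanted, the decode step with a minus sign would be enough). All function names are made up for illustration.

```python
def negabinary_to_int(bits):
    """Decode base -2 digits (least-significant first): sum of bit * (-2)**i."""
    return sum(bit * (-2) ** i for i, bit in enumerate(bits))


def int_to_negabinary(n):
    """Encode an integer as base -2 digits, least-significant first."""
    if n == 0:
        return [0]
    bits = []
    while n != 0:
        n, remainder = divmod(n, -2)
        if remainder < 0:
            # Python gives a remainder of -1 here; shift it to +1 and
            # compensate in the quotient so digits stay in {0, 1}.
            n, remainder = n + 1, remainder + 2
        bits.append(remainder)
    return bits


def negate_negabinary(bits):
    """Return the base -2 digits of the negated value (decode, negate, re-encode)."""
    return int_to_negabinary(-negabinary_to_int(bits))
```

This takes the lazy route through ordinary integers; a timed test would likely expect digit-by-digit manipulation for very long inputs, which is part of why it is an odd fit for a frontend screen.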
A few companies would reply, most strung me along while -- I realize now -- they were keeping me as a backup(-backup) in case their "A-player" turned them down. Countless interviews, hours on take-home projects, it was tough. I learned to cut bait if the company was slow to move forward, had weeklong periods of no communication, etc.
I (just very recently) found it's easier to land small contract gigs because the barrier to entry seems to be lower, demonstrate value, and keep getting work from those guys after the initial project was done. It is different but so far I actually like the freedom that comes with contracting. I haven't been at it long enough to experience the downsides.
There's definitely not a shortage of talent. It's that every company thinks they need "A-players", when the vast, vast majority are doing a damn basic CRUD app.
Just wanted to say I hear you brother and share my story in some solidarity. You will find something, just keep plugging away. Each "failed" attempt makes you better no matter how many attempts it takes. Cliche of course but it is true. I am very lucky in that I don't face the mental demons you do, even then this job search hit me pretty hard. Please be proactive and take care of yourself, body and mind (body goes a long way toward mind also).
I've heard more than one CTO/Sr. Engineer refer to people in these roles as 'data grunts' or something similarly dismissive. Then they're mystified as to why solid engineers are so quick to move up or out, year after year.
Anything and everything is marketed as "data science" and "data engineering" these days because this is the buzzword of the day.
I've been dealing with large data even before "big data" was a word but I don't call myself "data scientist" or "data engineer". I am still a software engineer working on what benefits my organization.
"Serial Entrepreneur" is the same these days, claimed by anyone who had a lemonade stand as a kid.
> I am still a software engineer working on what benefits my organization
But if you saw a nearby local maximum that's higher than your current local maximum, wouldn't you change what you call yourself, if it means being paid more but doing the same work?
This is similar to how the average "software engineer" makes about $30k/year more than the average "programmer".
I really enjoy that kind of work but it is difficult to articulate your business value in that environment. The best thing is working closely with a data scientist/front-end dev who can deliver products to the analysts and executives that need the data and make sure that you get the credit for enabling new streams of data. But most of the time you are putting out someone else's dumpster fire.
One advantage of data engineering: unlike front-end work, there are few non-technical people who will have an opinion on how you are doing things and burden you with bikeshedding.
There are 6600 jobs listed and 6500 individuals on LinkedIn with that particular title, and therefore there's a shortage? Seriously?
* How many aren't on LinkedIn?
* Since the whole article is about how the job title is poorly defined and growing in prevalence, why would you assume that people who don't already have such a job would use the term?
* The "growth" charts on the full study are just as bad - how much of that is just from renaming existing generic developer positions, since "data engineer" is clearly a relatively new term?
6500 data engineers on all of LinkedIn, but 6600 job openings in the Bay Area. So there are more job openings in one area than all data engineers on LinkedIn.
The fact that the original, unmodified article referred to data engineers as "janitors" pretty much says it all.
It's very analogous to front-office and back-office work in Investment Banking. "Data Scientist" are the front-office, with all the prestige, and "Data Engineers" are the back-office, doing a lot of the heavy lifting without nearly as much recognition.
In my opinion there shouldn't be a delineation. You shouldn't be a data scientist if you can't gather, process, and clean up your own data.
Ideally you'd have a symbiosis, and each side would recognize the importance of the other.
Even if you require your data scientists to be able to do engineering work, it's probably way more efficient to have some good generalist Software Engineers doing all the "pre-math" work and freeing your statisticians up for what they're (hopefully) good at.
Plus as a side effect, your software will probably be better.
Data engineering sounds much better than "data plumbing", but in my experience the latter is a more accurate description of the work of a data engineer: Building -and often unclogging- pipes that transport data from A to B, and putting in filters to clean it and extract the useful bits.
So why not change your LinkedIn job title to "data plumber", which is sure to get you some serious recruiter attention ;)
I worked for about 10 years doing exactly what they want, but I ended up having to write a lot of the tools which means I'm not able to check the boxes on some tool you require which gets me punted by HR.
I'm starting to think that the message is if HR is going to do checklists then developers should really make sure they work mostly with contracts that use popular checklist items.
As a data person I would really like to put some numbers on how much the typical HR hiring process costs a business. I don't know anybody that says they are happy with how hiring works in the tech industry but I've also never seen an HR person try and improve the process.
Quick sidenote, anyone know where the databases / distributed systems engineering jobs are at? E.g. if one wanted to not use these tools but also go help build these tools?
I can think of Facebook, Google, Microsoft, IBM (which locations and groups within these companies / where?). I can also think of Confluent, CitusDB, Databricks, etc.
Market Research is a $40B industry that depends almost completely on these concepts. I'm not sure how prevalent distributed systems are with MR companies, but that's an implementation detail anyway.
Weirdly the problem is most hires have it backwards.
Before going out to the market and discovering what talent exists and consequently what salary it will take to get them to join (ie negotiate) most organisations decide on a salary range, usually reflecting the current internal structure not the current external market.
The longer an organisation has existed the more out of whack with the market its internal set up is.
As such companies decide on their price point first, then go looking. Which is of course backwards.
These "shortage" stories always make me roll my eyes, because they're usually about money more than anything. And money is usually about cost of living more than anything.
If you choose to locate your company in one of the highest cost of living regions in the world, then you are complicit in the "shortage". Supply and demand - pay up. Or don't.
I am a data engineer working on a machine learning team with models actively used as part of our product(s).
From my experiences working in various contexts (applied machine learning, analytics, policy research, academics, etc...), there are several factors that contribute to this shortage: (1) "data engineering" often requires a lot of breadth and knowledge, (2) "data engineering" is often (derisively and naively) referred to as the "janitorial work" of data science, (3) the spectrum of roles and requirements within the "data engineering" domain, in terms of job descriptions, can range from database systems administration, to ETL, to data warehousing, curation of data services / APIs, business intelligence, to the design/deployment/operation of pipelines and distributed data processing and storage systems (these aren't mutually exclusive, but often job descriptions fall into one of these stovepipes).
Some of my quick thoughts and anecdata:
Companies have made large investments in creating 'data science' teams, and many of those companies have trouble realizing value from those investments.
A part of this stems from investments and teams with no tangible vision of how that team will generate value. And there are several other contributing factors…
"Dirty work." People haven't learned how to, and more often don't want to do it. There's a vast number of tutorials and boot camps out there that teach newcomers how to "learn data science" with clean datasets -- this is ideal for learning those basics, but the real world usually does not have clean or ideal datasets -- the dataset may not even exist -- and there are a number of non-ideal constraints.
There are people that wish to call themselves “data scientists” that “don’t want to write code” and would “prefer to do the analysis and storytelling”
Engineering as the application of science with real world constraints: there are a number of factors that we take into account, often acquired through painful experience, that aren’t part of these tutorials, bootcamps, or academic environments.
Many “data scientists” I’ve met have a hard time adapting to and working with these constraints (e.g. we believe that the application of data science would solve/address __ problem, but: how do we know and show that it works and is useful? what are the dependencies, and costs of developing and applying that solution? is it a one-time solution, or is it going to be a recurring application? does the solution require people? who will use it? what are the assumptions or expectations of those operators and users? is it suitable? is it maintainable? is it sustainable? how long will it take? what are the risks involved and how do we manage them? is it re-usable, and can we amortize its costs over time? is it worth doing? This is part of a methodology that comes from experience, versus what is taught in data science)
Larger teams with more people/financial/political resources can specialize and take advantage of these divisions of labor, which helps recognize the process aspects of applying data science and address some of the above
Short story: if you view data engineering as "janitorial work" you're missing the big picture
Anyone else notice that the attributes of a 'unicorn' data scientist include the traits of a 'data engineer?'
How does one get started with this? I suppose a lot of people who hang out at HN are competent devs good in programming and databases, but probably beginners in math, ML, AI etc. How does such a person get started and find a job in this field?
in my mind the problem is really simple: most executives aren't smart enough to understand how any of this shit works, or build a compelling business case around it. they just know they need a 'big data' team, so it just dies on the vine.
someone with enough smarts to build/lead a team, sell to executive management, and have an actual business application is just too rare compared to the prevalence of the engineering talent.
It was only 20 years ago that companies hired a "web master" or a generalist to do everything. But pieces of those jobs became specialized. Now we need UX, UI programmer, general engineers, dev ops, data engineers, a data scientist, etc.
And how many companies are still interviewing with fizzbuzz?
So I know SQL, Python, Django, Java (though it's been a while), JavaScript, Linux, some cloud computing and a bit of devops. Am I a data engineer? Software engineer, with a lot of database background? What makes a data engineer different from a software engineer?
- The challenge for an organization is to recognize that there is a significant difference between the 'data engineer' working on a vertical project and the 'data engineer' responsible for integrating data across the enterprise.
- The project 'data engineer', in today's world, most likely will be a software developer responsible for ETL, etc. The data design will be more or less up to the software developer.
- An enterprise 'data engineer' is more concerned with data that affects the enterprise. This typically involves some sort of data integration. For example, how to integrate relevant data from N projects (e.g. A,B,C .. Z) where each project has its own idea of how to represent similar concepts (e.g. person, user, customer), with different provenance, truth assertions, access rules, data retention periods, granularity of metadata (e.g. at the attribute level vs entity level), etc. The enterprise is interested in questions like 'What did we know and when did we know it?', etc. The enterprise 'data engineer' will probably levy requirements on the project 'data engineer' to meet the enterprise's needs.
Just checked, the # of data engineers rose to 9,246 (42%) in the last six months. So, the shortage is at least being addressed by people changing their job titles on LinkedIn.
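The 42% checks out if the baseline is the 6,500 figure quoted earlier in the thread (that baseline is an assumption; the comment doesn't state it). A quick sanity check, with a hypothetical helper name:

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


earlier_count = 6500  # LinkedIn "data engineer" count quoted earlier in the thread
current_count = 9246  # count reported in the comment above

print(round(percent_change(earlier_count, current_count)))  # 42
```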
jnordwick|9 years ago
dizzystar|9 years ago
SmellTheGlove|9 years ago
stale2002|9 years ago
Yes, you can always always always find somebody to do a job is your a willing to pay 10 million dollars. That means that "shortages" are impossible. It means that you can never have a shortage in any situation, because you can always pay 10 million dollars for a single visit to the doctor.
But this line of logic isn't very useful when talking about "shortages".
If you had to pay a million dollars for a loaf of bread, is there a shortage of bread? IE, billions of people will starve to death by next week, because they can't afford to buy food.
Most people would say "Yes, there is a shortage of bread".
When people talk about shortages, they are obviously talking about a shortage at a certain price point. There is no other definition of the word shortage that makes sense.
A good definition that I use for the term shortage is "If the government could snap its fingers and instantly produce large amounts of X overnight, would the world be a better place"?
If the answer is "Yes, the world would be in a much much better place", then that means there is a shortage of X. If the answer is "No, the world would only be a little better", then that means that there is NOT a shortage of X.
tuna-piano|9 years ago
If there was 1 gallon of water left on earth, Bill Gates would buy that gallon for $50 billion, and everyone else would die of dehydration.
There has always been a shortage of maids willing to do all my house work for $10.
And there is a shortage of data engineers at $x, but there wouldn't be a shortage at $1M/year (because fewer companies would want one, and more people would be willing to do the work).
biztos|9 years ago
snowwrestler|9 years ago
Thought experiment: If 100 companies had openings for a skill set that only one person could deliver, all 100 companies could eventually fill their openings by sequentially outbidding each other for the services of that one person.
So how would we know if a talent shortage really exists for a certain job? I can think of a couple potential hints: if starting salaries are going up much faster than the national average, or if the unemployment rate for that job is much lower than the national unemployment rate. Either would seem to indicate that, relative to the job market as a whole, there was a greater demand than supply for that particular job.
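The two hints above can be written down as a rough check. This is only a sketch: the function name `shortage_hints` and the thresholds (`growth_ratio`, `unemployment_ratio`) are assumptions made up for illustration, not established definitions, and it assumes positive national figures.

```python
def shortage_hints(job_salary_growth, national_salary_growth,
                   job_unemployment, national_unemployment,
                   growth_ratio=2.0, unemployment_ratio=0.5):
    """Return the shortage hints that apply to one job category.

    All rates are fractions (0.03 means 3%). The two ratio thresholds are
    arbitrary assumptions standing in for "much faster" and "much lower".
    """
    hints = []
    # Hint 1: starting salaries rising much faster than the national average.
    if job_salary_growth > growth_ratio * national_salary_growth:
        hints.append("salaries rising much faster than the national average")
    # Hint 2: unemployment for the job much lower than the national rate.
    if job_unemployment < unemployment_ratio * national_unemployment:
        hints.append("unemployment much lower than the national rate")
    return hints
```

With purely illustrative inputs, `shortage_hints(0.08, 0.03, 0.02, 0.05)` reports both hints, while a job that merely tracks the national numbers reports none.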
achompas|9 years ago
Alternatively, you can just join a large tech org. Netflix etc. have no problem paying good DEs north of $200k in total comp.
tobyjsullivan|9 years ago
sdoowpilihp|9 years ago
x0x0|9 years ago
harichinnan|9 years ago
Also, finance requires proper education and training. Not so much for app development. So for everyone who complains about getting $150K offers, there are 100 thousand people right here in the US applying for $60K technical analyst jobs.
bobosha|9 years ago
That's true for just about anything.
"there is no epipen crisis, only a crisis at what you are willing to pay"
"There is no poverty, only poverty at a given income level"
"there is no crime problem, only crime problem at a given crime level"
What you are saying is self-contradictory. If you (or others) are able to turn down 150K offers...you know what you are.
unknown|9 years ago
[deleted]
whenwillitstop|9 years ago
mrharrison|9 years ago
I have been thrown these projects at work before, where I'm the frontend engineer and I need to make some cool D3 visualization, but lo and behold the data is shit, and I have to help the backend team make the data usable. It's a mind-numbing job that nobody wants, because it sounds like a one-month task to get a good REST API up and working, but it usually takes three months, because you have to go back and forth making sure the data is right, and there are always 10 tricky edge cases that you have to work some magic on. Not only that, but you need to have smart people cleaning the data, so that you don't make some big mistake down the line or end up with a super slow REST API and have to add another couple of weeks or a month to rework the data again. So that one month becomes three months, and most likely a year, because somebody will say that looks great but can we also add this, and it goes on and on. It's literally a mind-numbing job that almost nobody wants. I have found that products like Tableau are the best for this; you still have to clean the data, but it helps speed up the process.
Data cleaning is a super golden problem to solve.
dizzystar|9 years ago
Give me emacs and a command line, and I have all the truth I need, which is far more honest, in my mind, than anything that can be created with D3 or Tableau. Beauty is in the eye of the beholder, and it doesn't really do anyone service to look down on the work others find enjoyable. If doing D3 makes you happy, that is awesome, and I can only congratulate you for your passion and your ability to look forward to work I don't "get," and I wish the feelings would be mutual.
kafkaesq|9 years ago
Which are difficult to find when you think of them as "janitors", and treat them accordingly.
msie|9 years ago
SmellTheGlove|9 years ago
dmatthewson|9 years ago
Hm, I wonder why he's having problems hiring janitors.
pavlov|9 years ago
I guess this means that the entire profession consists of janitors and plumbers.
jrs235|9 years ago
kafkaesq|9 years ago
In a boldface font, no less. The cockiness behind that language is really quite astounding.
jakestein|9 years ago
tom_b|9 years ago
There has long been hype around new technology and labels for business intelligence, data warehousing, big data, and now data engineering/science. I'm not saying there are not some roles in this space that return huge value to organizations, but these opportunities are much rarer than the buzz indicates.
I wonder if the perceived shortage is mainly hype as the shift to new cloud technologies makes many of the older ideas a little less useful - if you are plowing data into BigQuery, you probably aren't so worried about your star schema data model for reporting.
I would strongly advise people that look at these types of articles to look at the roles in question and ask "Is this role on the critical path to customers paying us?" My experience has been that the answer is often "No." This is bad. I have also seen situations where businesses that do rely on smart data integration can show that they are selling dollar bills for ten cents that still have trouble getting customers on board with spending that ten cents. Business is weird.
mattnewton|9 years ago
PaulHoule|9 years ago
From my P.O.V., "Full Stack Engineer" is a place you don't want to be because it means putting out fires with whatever junk javascript is in the front end. It seems like everybody who's built a serious javascript application has invented their own Virtual DOM because none of the popular Virtual DOM libraries are good for much other than wasting time and CPU cycles.
"Data Scientist" is a bad title in its own way, in the sense that "Computer Science" is bad, but worse. To a lot of people there is a Brahmin kind of attitude associated with "Scientist" -- i.e. an aversion to getting your hands dirty. Real world data is pretty dirty and you aren't going to get far in getting value out of it unless you spend 80-90% of your time dealing with the dirt.
achompas|9 years ago
Frankly speaking, if your company doesn't need a data engineer, it won't hire one or move you into that role. They likely don't, either, if you're experiencing this pushback -- data engineers often develop ETL pipelines or data warehouses, both of which are very useful if your company has a data team and very useless if it does not.
That said, you may want to move closer to my role. There's actually a shortage of data-savvy people who can also write production software, and you would nicely complement a more research-inclined data scientist or analyst -- someone with far more experience with research/analysis than development.
willis77|9 years ago
ironchef|9 years ago
alexbeloi|9 years ago
As others have pointed out Data Engineering is more about building data pipelines, making architecture decisions for your ML stack, things like that. Less about model building, prototyping and training, which is what I think of when somebody says they 'do' ML.
minimaxir|9 years ago
bcbrown|9 years ago
ef5a0b0628|9 years ago
It seems people in this industry refuse to understand that some people are not perfect. I never graduated college because I hated it with the very fiber of my being, so I am not particularly great at white boarding answers to algorithm questions off the top of my head in a high pressure environment. If I need them during my job, I look up answers and learn from people who are much smarter than I am.
My personal identity has been shattered, as I thought my ~5-10 year history of success in the industry indicated I was in demand and talented. I saw posts like this and thought that if the worst happened I'd still be able to find a job. The idea that there is a talent shortage is a lie, or candidates like me wouldn't be treated as I have been. I'm not asking for a free job, or a handout. I have had a successful career so far and am capable of doing good work. But I'm not a specialist in Big Data Machine Learning Neural Networks.
I have struggled with bipolar disorder and suicidal ideation most of my life. I've dealt with the death of my beloved grandmother and my father who was instrumental in my choosing to be an engineer with only minor lapses in control. Nothing has caused me to consider taking my own life as much as the past 6 months. It seems there is no future for me in the only career I have any skill in and which is a huge part of my identity. And to constantly be told that there is such a shortage of engineers only salts the wound.
googletazer|9 years ago
The fact that you pulled through 25 of them is already commendable. Unfortunately as a labor provider you'll be subjected to all kinds of crap for the privilege of working.
Every single person on here needs to have a secondary business going on right now. Doesn't have to be a highly skilled industry either, selling hand made stuff on Etsy can be a lifeline in these situations.
ultramagas|9 years ago
I had always had an easy time getting a job before but this time it was different. Granted I knew it'd be tougher since for remote jobs, the world is the competition. But it was a summer of endless shitty timed hackerrank-style tests (virtual whiteboard hazing). I would tell my co-workers about them and they'd laugh in bewilderment at the questions that were asked in what should be a technical screener, and these are extremely smart and productive software guys that have started companies, written books, give conference talks. One funny question I got for a frontend React job: write a function that takes a sequence of bits that represent a negative-binary number (not a base-2 number that is negative, but a base-(-2) number) and return its negated value in base-2. For a frontend job. It was one of 4 questions to be answered in 90 minutes. gtfo.
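For readers curious what that question involves, here is a minimal sketch. The comment does not give the exact problem statement, so the bit order (least significant bit first) and the function names are assumptions. It decodes the base-(-2) digits, negates the value, and renders the result either as an ordinary signed base-2 string or re-encoded in base -2.

```python
def negabinary_to_int(bits):
    """Decode base-(-2) digits (assumed least significant first) to an int."""
    return sum(bit * (-2) ** i for i, bit in enumerate(bits))


def int_to_negabinary(n):
    """Encode an int as base-(-2) digits, least significant first."""
    if n == 0:
        return [0]
    bits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:
            # Python's remainder takes the divisor's sign; shift it to 0 or 1.
            n, r = n + 1, r + 2
        bits.append(r)
    return bits


def negated_as_binary(bits):
    """Negated value as a signed base-2 string, e.g. '-1001'."""
    return format(-negabinary_to_int(bits), "b")


def negated_as_negabinary(bits):
    """Negated value re-encoded in base -2, least significant first."""
    return int_to_negabinary(-negabinary_to_int(bits))
```

Under these assumptions, `[1, 0, 0, 1, 1]` is 1 - 8 + 16 = 9, so `negated_as_binary` gives `'-1001'` and `negated_as_negabinary` gives `[1, 1, 0, 1]` (1 - 2 - 8 = -9).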
A few companies would reply, most strung me along while -- I realize now -- they were keeping me as a backup(-backup) in case their "A-player" turned them down. Countless interviews, hours on takehome projects, it was tough. I learned to cut bait if the company was slow to move forward, had weeklong periods of no communication, etc.
I (just very recently) found it's easier to land small contract gigs because the barrier to entry seems to be lower, demonstrate value, and keep getting work from those guys after the initial project was done. It is different but so far I actually like the freedom that comes with contracting. I haven't been at it long enough to experience the downsides.
There's definitely not a shortage of talent. It's that every company thinks they need "A-players", when the vast, vast majority are doing a damn basic CRUD app.
Just wanted to say I hear you brother and share my story in some solidarity. You will find something, just keep plugging away. Each "failed" attempt makes you better no matter how many attempts it takes. Cliche of course but it is true. I am very lucky in that I don't face the mental demons you do, even then this job search hit me pretty hard. Please be proactive and take care of yourself, body and mind (body goes a long way toward mind also).
rch|9 years ago
skynetv2|9 years ago
I've been dealing with large data since before "big data" was a term, but I don't call myself a "data scientist" or "data engineer". I am still a software engineer working on what benefits my organization.
"Serial Entrepreneur" is the same these days, claimed by anyone who had a lemonade stand as a kid.
Swizec|9 years ago
But if you saw a nearby local maximum that's higher than your current local maximum, wouldn't you change what you call yourself, if it means being paid more but doing the same work?
This is similar to how the average "software engineer" makes about $30k/year more than the average "programmer".
jboggan|9 years ago
I really enjoy that kind of work but it is difficult to articulate your business value in that environment. The best thing is working closely with a data scientist/front-end dev who can deliver products to the analysts and executives that need the data and make sure that you get the credit for enabling new streams of data. But most of the time you are putting out someone else's dumpster fire.
One advantage of data engineering: unlike front-end work, there are few non-technical people who will have an opinion on how you are doing things and burden you with bikeshedding.
[0] - http://www.avclub.com/tvclub/its-always-sunny-philadelphia-c...
GeneralMayhem|9 years ago
* How many aren't on LinkedIn?
* Since the whole article is about how the job title is poorly defined and growing in prevalence, why would you assume that people who don't already have such a job would use the term?
* The "growth" charts on the full study are just as bad - how much of that is just from renaming existing generic developer positions, since "data engineer" is clearly a relatively new term?
sportanova|9 years ago
binalpatel|9 years ago
It's very analogous to front-office and back-office work in Investment Banking. "Data Scientists" are the front office, with all the prestige, and "Data Engineers" are the back office, doing a lot of the heavy lifting without nearly as much recognition.
In my opinion there shouldn't be a delineation. You shouldn't be a data scientist if you can't gather, process, and clean up your own data.
biztos|9 years ago
Even if you require your data scientists to be able to do engineering work, it's probably way more efficient to have some good generalist Software Engineers doing all the "pre-math" work and freeing your statisticians up for what they're (hopefully) good at.
Plus as a side effect, your software will probably be better.
ThePhysicist|9 years ago
So why not change your LinkedIn job title to "data plumber", which is sure to get you some serious recruiter attention ;)
untilHellbanned|9 years ago
Looks like we need more English engineers too.
cutler|9 years ago
protomyth|9 years ago
I'm starting to think that the message is if HR is going to do checklists then developers should really make sure they work mostly with contracts that use popular checklist items.
mulmen|9 years ago
makmanalp|9 years ago
I can think of Facebook, Google, Microsoft, IBM (which locations and groups within these companies / where?). I can also think of Confluent, CitusDB, Databricks, etc.
rhizome|9 years ago
lifeisstillgood|9 years ago
Before going out to the market and discovering what talent exists and consequently what salary it will take to get them to join (i.e., negotiate), most organisations decide on a salary range, usually reflecting the current internal structure, not the current external market.
The longer an organisation has existed the more out of whack with the market its internal set up is.
As such companies decide on their price point first, then go looking. Which is of course backwards.
otto_ortega|9 years ago
collyw|9 years ago
realworldview|9 years ago
slantedview|9 years ago
If you choose to locate your company in one of the highest cost of living regions in the world, then you are complicit in the "shortage". Supply and demand - pay up. Or don't.
moandcompany|9 years ago
From my experiences working in various contexts (applied machine learning, analytics, policy research, academics, etc...), there are several factors that contribute to this shortage: (1) "data engineering" often requires a lot of breadth and knowledge, (2) "data engineering" is often (derisively and naively) referred to as the "janitorial work" of data science, (3) the spectrum of roles and requirements within the "data engineering" domain, in terms of job descriptions, can range from database systems administration, to ETL, to data warehousing, curation of data services / APIs, business intelligence, to the design/deployment/operation of pipelines and distributed data processing and storage systems (these aren't mutually exclusive, but often job descriptions fall into one of these stovepipes).
Some of my quick thoughts and anecdata:
Companies have made large investments in creating 'data science' teams, and many of those companies have trouble realizing value from those investments.
A part of this stems from investments and teams with no tangible vision of how that team will generate value. And there are several other contributing factors…
"Dirty work." People haven't learned how to, and more often don't want to do it. There's a vast number of tutorials and boot camps out there that teach newcomers how to "learn data science" with clean datasets -- this is ideal for learning those basics, but the real world usually does not have clean or ideal datasets -- the dataset may not even exist -- and there are a number of non-ideal constraints.
There are people that wish to call themselves “data scientists” that “don’t want to write code” and would “prefer to do the analysis and storytelling”
Engineering as the application of science with real world constraints: there are a number of factors that we take into account, often acquired through painful experience, that aren’t part of these tutorials, bootcamps, or academic environments.
Many “data scientists” I’ve met have a hard time adapting to and working with these constraints (e.g. we believe that the application of data science would solve/address __ problem, but: how do we know and show that it works and is useful? what are the dependencies, and costs of developing and applying that solution? is it a one-time solution, or is it going to be a recurring application? does the solution require people? who will use it? what are the assumptions or expectations of those operators and users? is it suitable? is it maintainable? is it sustainable? how long will it take? what are the risks involved and how do we manage them? is it re-usable, and can we amortize its costs over time? is it worth doing? This is part of a methodology that comes from experience, versus what is taught in data science)
Larger teams with more people/financial/political resources can specialize and take advantage of these divisions of labor, which helps recognize the process aspects of applying data science and address some of the above
Short story: if you view data engineering as "janitorial work" you're missing the big picture
Anyone else notice that the attributes of a 'unicorn' data scientist include the traits of a 'data engineer?'
vijayr|9 years ago
beachstartup|9 years ago
someone with enough smarts to build/lead a team, sell to executive management, and have an actual business application is just too rare compared to the prevalence of the engineering talent.
cheriot|9 years ago
And how many companies are still interviewing with fizzbuzz?
collyw|9 years ago
njd|9 years ago
- The project 'data engineer', in today's world, most likely will be a software developer responsible for ETL, etc. The data design will be more or less up to the software developer.
- An enterprise 'data engineer' is more concerned with data that affects the enterprise. This typically involves some sort of data integration. For example, how to integrate relevant data from N projects (e.g. A,B,C .. Z) where each project has its own idea of how to represent similar concepts (e.g. person, user, customer), with different provenance, truth assertions, access rules, data retention periods, granularity of metadata (e.g. at the attribute level vs entity level), etc. The enterprise is interested in questions like 'What did we know and when did we know it?', etc. The enterprise 'data engineer' will probably levy requirements on the project 'data engineer' to meet the enterprise's needs.
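
As a rough illustration of the enterprise-level integration described above, the sketch below maps two projects' different representations of the same concept onto one record that keeps provenance and a timestamp. Everything in it is hypothetical: the `PartyRecord` type, the field names, and the shapes of the project A and project B rows are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PartyRecord:
    """One project's view of a person/user/customer, with provenance."""
    source_project: str    # which project asserted this
    source_entity: str     # what that project calls the concept
    source_id: str         # the project's own identifier
    display_name: str
    recorded_at: datetime  # when this fact became known


def from_project_a(row: dict) -> PartyRecord:
    # Hypothetical project A stores "users" with a single full_name field.
    return PartyRecord("A", "user", str(row["user_id"]),
                       row["full_name"], row["created"])


def from_project_b(row: dict) -> PartyRecord:
    # Hypothetical project B stores "customers" with split name fields.
    name = f'{row["first"]} {row["last"]}'
    return PartyRecord("B", "customer", str(row["cust_no"]),
                       name, row["seen_at"])


def known_as_of(records, when):
    """'What did we know and when did we know it?' as a simple filter."""
    return [r for r in records if r.recorded_at <= when]
```

Keeping `source_project` and `recorded_at` on every record is one simple way to preserve provenance; access rules, retention periods, and attribute-level metadata would need more than this sketch shows.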
LawrenceHecht|9 years ago
wpiel|9 years ago
I'm not even sure if I'm being sarcastic.
edoceo|9 years ago
But only 1 out of 100 are qualified :(