In my ~25 years of professional software development, the single biggest factor in productivity for me has been whether I was involved at the start of a project. Knowing the initial design decisions, and being comfortable changing anything, allows me to be orders of magnitude more productive than when I'm diving into existing code designed by someone else.
I saw this perhaps most acutely with a company I sold - for a couple of years I was more productive than on nearly any other large software project I've worked on, because I knew the ins and outs of everything. The developers who bought it and took over are probably better developers than I am, and they are unquestionably excellent coders, yet it took a couple of years for them to get productive at making even medium sized changes. It became incredibly obvious to me how handicapped you are diving into something someone else made, especially if the original designer isn't there anymore.
Meshing really well with managers & PMs is probably the next biggest factor in my own experience, but it doesn't come even close to the gap between being there from day 1 vs coming in much later.
> Productivity tracking tools and incentive programs will never have as great an impact as a positive culture in the workplace.
I'm a fan of choosing to use time management apps and productivity tools to manage my own budgets. But I admit that I hate it when I have to do it for someone else.
One of the most powerful benefits from being there since the start is the complete confidence in ripping out and deleting obsolete code later. Even good developers new to the project are afraid to do this, and they should be since it's very risky without the full context.
The natural trajectory for a project is to keep adding features until it collapses under its own weight. Only the long-tenured developer can fight this and revitalize a project by removing the useless excess.
Couldn't agree more. I've noticed this many times in my 16+ years in the industry. Failing to recognize it is one of the primary reasons line managers let an employee go (especially one who was involved from the beginning) when they ask for a raise, thinking they are easily replaceable. It really costs the company.
This has been observed long ago, and is known as Brooks's Law [0].
Building software is a knowledge business, and there are three types of knowledge involved:
1. Subject knowledge: understanding of the subject the software is about (e.g. accounting when building accountancy software).
2. Platform knowledge: understanding of the platform used to build the software (e.g. Python, SQL, React etc).
3. Architecture knowledge, which is what the parent is talking about: understanding of the specific choices made in the development, being aware of all the Chesterton's Fences [1] etc.
Very much agree. It honestly shocks me that more people don't recognize this. I think as humans we just tend to forget stuff over time, as we go, and as our perspective changes.
Make sure no code base would take more than a month to rewrite. That way it actually can be rewritten. Some tests could probably be reused, though, so tests should not be too intertwined with the code; they should work independently of the code they test.
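As a toy illustration of tests that could survive a full rewrite, here is a behavior-level test that exercises only the public API. The `ShoppingCart` class is a hypothetical stand-in, not anything from the comment above; the point is that the test never reaches into internals, so any rewrite that keeps the interface keeps the test.

```python
# Hypothetical module under test; any rewrite with the same public
# interface must still pass the test below.
class ShoppingCart:
    def __init__(self):
        self._items = []          # internal detail; tests never touch it

    def add(self, name, price):
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)


def test_cart_totals_items():
    # Only public methods are used, so this test is reusable as-is
    # against a rewritten implementation.
    cart = ShoppingCart()
    cart.add("book", 12.50)
    cart.add("pen", 1.50)
    assert cart.total() == 14.0
```

The design choice is simply that the test asserts observable behavior (what `total()` returns) rather than representation (how items are stored).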
This also applies to product management, imho. At least for me it does. Starting a product from scratch is much better than taking over another person's product.
When you tell new engineers about this target, they see a great opportunity to game it: just ship smaller changes. It turns out that smaller changes are quicker to ship, lead to better code and tests, have a lower risk of cancellation and problems in production, and lead to earlier and better feedback.
Inspired by Goodhart's Law, I'll propose the following: a measure that, when it becomes a target, improves productivity. ~Sijbrandij's Law
Your proposed law has been tried for many years, by just as many good-willed people who believed their measures would result in target increases. In fact, the entire industry is being bombarded by one such methodology that includes those measures: Scrum. Must we really repeat the years of complaints, criticism and debates to show any measure can get warped and gamed to the point it only vaguely resembles a tool of productivity?
So we get young, naive engineers to focus on small changes. Cool, probably as it should be, you gotta start somewhere. And when these developers get hungry for bigger projects, when they get bored implementing the umpteenth small and by that point (for them) trivial change, how do you encourage them to tackle bigger technical problems? Those that lay the foundation for the new people to do their job more easily and on-board quicker? Or did you actually not tell us all, and you measure far more than just the number of changes?
It seems like a useful aggregate metric, but is it also used to rate individuals? For that purpose, it seems like it would be terrible. What if you have an experienced staff member from whom everyone else constantly seeks advice? That person may be having a positive impact that isn't visible as merge requests.
I have never had the opportunity to try this method, but after much theorycrafting, many useless pointing meetings, and statistical investigation that found a negative correlation between "complexity" and "time to completion", I can only conclude that this is the correct way to get velocity.
Productivity as a software developer consists chiefly in not making mistakes. That can lead to the situation where your best developers may appear to do nothing for long stretches. Research and deliberation are desirable. Blind hacking is the least valuable yet most visible activity of inexperienced programmers. All common "objective" measures of productivity such as closed tickets, lines of code, or PRs are seriously flawed.
The most common implementation I've seen of this is senior developers who write almost no code, but spend their time telling junior developers what to go back and reimplement, giving advice, etc. It's still the junior programmers actually getting the product written, just with nudges in the right direction. Not a bad system really; it reminds me of the relationship between officers and enlisted in the military.
I'm currently building models around high functioning software developers/projects and what I've noticed is, churn can swing quite a bit. Take the following for example:
Over 150 days, this developer's churn really fluctuates, and that's because they work on different things that require different amounts of code. And if you look at the following:
you can see they still commit regularly, but as the Reviewability section shows, their changes are mainly small ones, which sort of aligns with what Sid (sytse) mentioned, which is mainly focusing on small changes.
The churn for the project microsoft/vscode fluctuates quite a bit as well.
Based on what I've learned so far, you really need a good baseline (which can vary greatly from one developer to another) to be able to determine if somebody is more or less productive.
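One way to sketch that per-developer baseline idea: compare a developer's latest figure against their own history rather than a team-wide norm. This is a minimal, assumption-laden illustration; the threshold and the notion of "weekly output" are made up for the example.

```python
from statistics import mean, stdev

def deviates_from_baseline(history, latest, threshold=2.0):
    """Flag `latest` only if it sits more than `threshold` standard
    deviations from this developer's *own* historical mean.

    `history` needs at least two data points for stdev().
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu          # flat history: any change stands out
    return abs(latest - mu) / sigma > threshold
```

Usage: a developer averaging ~10 units a week with small variance would be flagged at 30, but not at 11, whereas a developer whose baseline already swings between 5 and 50 would not be flagged at 30 at all, which is the whole point of a personal baseline.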
This isn't at all universally true, it depends on the costs of failure. If you're programming a Mars rover, then sure, avoiding mistakes is paramount. If you're writing a one-off data analysis script that only has to work once and only has to be mostly accurate, spending a week coming up with the ideal specification and writing a full comprehensive test suite is usually much less productive than hacking at it for a few hours until it works. Good software developers know when to make the appropriate safety tradeoffs given the goals and constraints of the task at hand.
We can't even come up with an objective way to score software itself. So how in the world are we going to go even deeper and score the process and the people that create it?
Sarah and Bob make clocks, but sometimes they make hats, and sometimes they make screws, or hammers, or lamps. And sometimes the things they make get sold to customers, but sometimes other employees take them home, sometimes they make parts for each other to use when making bigger projects, and often they help each other and other employees out on unrelated projects. And sometimes they do repairs too. Oh yeah they also paint portraits that hang up around the office.
Try coming up with a measurement for their individual productivity that is easy enough to be useful, hard to game, and cheap enough to make it worth the price.
The first step is to figure out how to measure the value of all the stuff they make...
Most developers like solving problems - it gives them a high. Often, without realizing it, they create problems they are eager to solve to get their dose. Solving problems can be quantified, too. Unfortunately, it's hard to quantify the number of problems avoided, because they never manifest! And avoiding problems often works against the first goal I mentioned. For example, solving one problem 10 times gives you 10 closed tickets, 10 PRs, and a lot more LOC in a short amount of time. But creating one PR and one ticket that prevents not only those 10 but hundreds or thousands more in the future is quantified as "less work." I saw this at one job recently, where every suggestion to fix a repeated issue was answered with: "We have bigger fish to fry." Yet we kept wasting time frying tadpoles.
The smallest organizational unit at which productivity can usefully be measured is an agile team of about 7 people. Below that size the effort of quantifying productivity exceeds any possible value of doing so, and incentivizes the wrong behaviors.
A good manager can get a reasonable subjective sense of individual productivity but won't be able to quantitatively measure it.
I agree, and I would add that there is one good subjective way to measure the productivity of individual developers, and that is talking to their teammates.
The problem is that we do not have a standard "output unit."
> Productivity: the effectiveness of productive effort, especially in industry, as measured in terms of the rate of output per unit of input.
We can all agree "lines of code" is a shit metric, and we can't say "# of bugs closed," because each will have variable difficulty and value. Programmers employed by a business are in charge of automating repetitive tasks, not performing them (the classic measure of productivity).
I perform UX research on APIs. Here, we standardize the "output unit" and therefore can get a better idea of a developer's productivity. Every developer performs the same task, so we can simply measure time spent.
There will never be an ethical solution to measure developer productivity during the workday; this isn't Ford's assembly line.
Even worse, # of bugs closed may be measuring the inverse of what you think you are. See the classic Dilbert cartoon about "writing yourself a new minivan".
He mentions increasing salary won't lead to increased productivity ... and that's true, if the same developer remains. But what if we remove that constraint? What if increased salary means a higher quality of developer takes the position? Wouldn't this mean higher productivity?
Bit of a cold scenario, but one way to game it out is hypothetically removing the current dev and then hiring someone better at double the pay.
Or, less unfair seeming, double the pay by hiring a second dev. That might not double productivity ... depending on the situation it might 1.5x it, or just as easily 4x it.
Productivity is a measure I hate to use for individuals because it's similar to the problem of root cause analysis in systems: our individual productivity comes from many different levers and influences, yet we tend to act like what works for some archetype of person should work for everyone. Even most managers basically just go by feel, which is hardly objective or measurable, let alone accountable. If my CI system is crappy and gives me poor feedback, making it hard to tell when I'm doing something wrong in my commits, that can demotivate me. But fixing it doesn't mean I'll suddenly become a 10x developer either. Similarly, the conditions for maximizing reliability and performance exist at all times in systems, but at least we can open up our editors and go inspect running systems; we really can't do that with people.
Sometimes firing someone ironically gives them a wake up call and they’ll do great for their next job. Sometimes they don’t learn, sometimes they’ll never recover. Sometimes promoting people helps them, sometimes they become overwhelmed and performance drops again (I’m not speaking Peter Principle either).
I worked for a telecom company for ten years. The first few years, everything was good, but we were struggling: raises and bonuses were non-existent and layoffs were frequent. Eventually I realized I was making a lower salary than when I started, so I just started working less. Toward the end I was working less than four hours a week while getting positive feedback from my manager and skip-manager.
It hurt my career staying at a dead-end job but it gave me years of free time doing pretty much whatever I wanted.
I don't completely agree with this. Money can affect developer productivity, but it takes time to realize its value. Money makes things easier. That person who's being paid the smallest amount possible is probably just trying to scrape by. Paying people more money frees up their personal life to handle other things: less stress about family matters, etc. I never believed the "leave your personal life at the door" thing. How you live affects your work, and making life easier makes people more productive (on average, not in all cases). Now, this isn't an immediate thing. It's an investment, and it takes time. The person has to realize the money is available (emotionally) and start to trust it will be there.
If I pay above market rate I'll attract better devs for sure, and the caliber of folks in my hiring pipeline will get better. It's not obvious at all unless you know where to look. College is the prime example of that. If you pay better you'll have more new grads applying and they will prioritize you over other offers (unless you are an exceptionally prestigious employer). But even then, you'll never talk to the student who interned twice at FAANG and got a firm offer a year before graduation. You can get that guy only if you are willing to employ him at FAANG salary for two summers.
Employing these guys won't make my existing hires any better than they are in the immediate future.
However
Better hires lead to better teams. I find that certain developers have a multiplicative effect on other devs. They mentor, document, review, and help everyone grow. That might actually slow them down (taking half a day to explain the high-level architecture to a lowly junior coder), until you realize the junior coder is now capable of answering questions from his teammates.
Maybe when developer productivity measurement becomes standard across the industry, we will realise that tech workers are in fact workers. Cogs in a machine. Not independent individuals imposing their will on the world through sheer force like some Randian hero. Maybe then it will be plainly evident that developers are as alienated as any service worker, and in the end as disposable in the eyes of the shareholders.
Will we then organize with other workers to create better working conditions for everyone or will there be fewer and fewer developers working with ever more powerful technology chasing richer than ever VCs?
> Maybe when developer productivity measurement becomes standard across the industry...
I suggest you look to database models/schemas standardization for an indication of how close this is coming to fruition. I personally can't measure developer productivity at a fine-grained level until requirements are stabilized, and I personally cannot stabilize requirements unless the domain is so well known the data store is standardized. I had hoped SAP would lead the charge through empirically iterating towards standards, but they left out the huge small and mid-size business markets with what they use today. And what they use today is still far from industries' standards.
We're no closer to standardization than when I started in software decades ago. We don't even have standard means of storing, transforming, displaying and tracking metadata upon calendars, addresses, phone numbers, names, and lots of other ephemera I can rattle off, within a single stakeholder industry, not to speak of within the software industry in general. There have certainly been efforts to standardize like Silverston's, but they haven't caught traction.
I'd sure like to see that happen, because it would short-circuit a lot of the discussions I have with stakeholders down to only the site-specific requirements, where I really add business value. Instead, I have to derive the data model from intricate discussion of their requirements, since they themselves have not agreed upon the parts that are common across their respective industries, so I end up at the start of discussions with all sorts of little twisty pieces of a data model, all alike.
In every organization I've worked in, it was obvious who the high performers were and who the low performers were. It was obvious to everyone. The only blind spot was that people usually seriously misjudged their own performance.
The problem, however, is that management is always being pushed to make objective measurements. For example, to fire someone, you have to first put him on an improvement plan with objective measurements. Otherwise, you're wide open to a lawsuit over discrimination, etc. You have to prove to a judge someone isn't performing, or that you gave raises based on performance.
Management also gets pushed into these attempts at objective measurements by attempts to optimize the numbers like what works great for a manufacturing process.
For better or for worse, a manager's productivity is normally taken from the developers/engineers the manager manages, namely the team's overall productivity, so we're back to the same problem of how to measure a developer's productivity.
I once talked to a retired hardware engineer, a fellow who made real electronic devices, not software. He told me that, over the whole course of his career, 80% of the projects he worked on never made it to market. In other words, 4/5th of his total "productivity" turned out to be waste. Make of it what you will.
Any decently competent technical leader can tell if a developer is being productive or not. It's stupid to waste time trying to measure something that is virtually unmeasurable.
"It's unmeasurable, but everyone can tell." is that what you're saying?
Seems like a No True Scotsman fallacy to say only good technical leaders can tell if a developer is being productive and in the same breath say it's unmeasurable.
hours worked, bugs fixed, tickets closed, costs saved, clients saved, KPIs/OKRs hit, time in queue, hours-to-close-ticket, uptime, SLAs hit... surely some collection of indicators, while not a pure signal, would let you highlight outliers either above or below the curve.
I don't get orgs that use stats like commits/LoC/PRs as KPIs. Most time in software engineering ought to be spent ensuring you're building the right thing, which requires a lot of collaboration, writing design docs, thinking about the problem, etc., because 'building the wrong thing' is probably the default behavior and hard to avoid. Software engineering is only really valuable if you can easily extend and build on what you've written, so whatever product or service you're selling can change as the business changes. If you're churning out throw-away code you never reuse, you don't realize any of that value, and you will lose.
I did have the idea of directly tying value to a graph of code that enabled a certain user journey. Sorta like 'CUJ-coverage' instead of test coverage. So if a user spent $20 at checkout, every line of code that was touched to enable that user's journey would be credited with that $20. I think this would be an interesting metric I'd probably respect but there are still probably a lot of blindspots this methodology doesn't capture.
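The 'CUJ-coverage' idea above could be prototyped with Python's tracing hook: run a user journey under `sys.settrace` and credit its revenue to every line executed along the way. This is only a toy sketch of the proposal (the `checkout` journey and dollar amounts are invented), and it shares the blind spots the comment admits to, e.g. shared utility code gets credited for everything.

```python
import sys
from collections import defaultdict

def trace_journey(journey_fn, revenue, credits):
    """Run one user journey and credit its revenue to every
    (filename, line number) executed along the way."""
    touched = set()

    def tracer(frame, event, arg):
        if event == "line":
            touched.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer                      # keep tracing nested frames

    sys.settrace(tracer)
    try:
        journey_fn()
    finally:
        sys.settrace(None)                 # always stop tracing
    for line in touched:
        credits[line] += revenue

# Hypothetical checkout journey worth $20.
credits = defaultdict(float)

def checkout():
    subtotal = 18.0
    tax = 2.0
    return subtotal + tax

trace_journey(checkout, 20.0, credits)     # every executed line earns $20
```

After a second $20 checkout, each of `checkout`'s lines would carry $40 of credit; aggregating `credits` per file or per author is where the metric (and its gaming potential) would begin.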
I saw your text as a light grey so I decided to re-read it a few times. I absolutely agree with you. The people who complain the least should be paid the most attention when they do.
Why is it ironic? The stakeholders for increased developer productivity go beyond just developers. Even the slightest increase in developer productivity, let alone the ability to objectively measure it, is the holy grail of software development. Companies with access to nearly infinite resources can and would deploy them for a marginal gain in developer productivity. So much emphasis is put on hiring the most brilliant minds and then on managing their projects and time, so why not on optimizing their output?
Productivity is important, sure. But as with all other professions in which people interact, the interpersonal skills and behavior tend to be more important IMO. Productivity can be massively impacted (positively and negatively) by how well people communicate and get along with each other.
As an individual I often wonder if my contributions are meaningful. The author says, “individual performance is best left for individual contributors to measure in themselves and each other.” How can individuals possibly measure their own performance if it can’t be measured externally?
The answer is in the article itself. It gives you real historical data so you can predict how long the project will take with evidence, rather than just a feeling, hope, or guess.
> Velocity is an aggregate measure of tasks completed by a team over time, usually taking into account developers’ own estimates of the relative complexity of each task. It answers questions like, “how much work can this team do in the next two weeks?” The baseline answer is “about as much as they did in the last two weeks,”
If there's one thing the last 50 years of software development has conclusively proven, it's that estimating the number of man-months (hours) a project will take doesn't work.
> have never been sure how summing together something that is supposed to have no relationship with time magically provides an estimate of anything
Most people dramatically underestimate the amount of time something requires. As long as you give them a clear conversion rate between story points and hours, they will estimate the task in hours -- incorrectly, despite having made the same mistake a hundred times in the past -- then convert the hours to story points and tell you the result.
Then someone notices that you have, like, 200 man-hours in the sprint, but you have only selected story points worth 100 man-hours. Which in fact is perfectly okay, if you understand that the "100" is an underestimate and the realistic figure would actually be close to 200, so you should be happy about the plan! But most people will not get it, and they will insist on planning for the full 200 man-hours. If you don't have enough political power to stop them, they will make you plan for 200 man-hours.
Then at the end of the sprint, everyone is stressed out, and they only completed 50% of planned stories. Because they underestimated how much time the tasks would take... just like research shows humans always do, no matter how many times they got burned in the past, no matter how much you yell at them to make better estimates.
(By the way, the problem with making realistic estimates is not just that individuals suck at it, but also that social forces actively prevent it. Research shows that people who make more realistic estimates are considered less competent than their colleagues, precisely because everyone notices that their estimates are longer than they believe they should be. And no one later changes their opinion just because the estimate turned out to be correct. Like, really, people who estimated something to take 2 weeks and delivered it in 3 weeks were judged as more competent by managers than people who estimated it to take 3 weeks and delivered in 3 weeks. The former made a better impression at the beginning, and the latter didn't provide a better result at the end, so the former made a better overall impression. This is how human brains work.)
So the smart way out is to make a metric that is taboo to convert to hours. Give vague verbal descriptions, like 1 is "trivial", 2 is "fairly easy", 3 is "simple", 5 is "medium", 8 is "kinda difficult", 13 is "tricky", and 21 is "needs to be split to smaller stories". People will first feel weird about it, but then they get used to it, and they will start delivering consistent ratings... like, the kind of story that gets assigned 5 story points in January will probably also get assigned 5 story points in December.
Then all you need to do is calculate velocity, which is, well, the conversion rate between story points and hours. But you can't say that, or it will ruin the magic! You just say "during the last sprint, we implemented 50 story points, so for this sprint we will also plan 50 story points", and hope that people will accept that without making the conversion explicit. And it works...
...until someone says: "Hey wait, so if we have 200 man-hours and plan 50 story points, that actually means that 1 story point equals 4 hours, right? And why are we giving this specific story 3 story points? 12 hours sound too much to me, I am pretty sure we could do it in 8 hours, or even 4 hours if we work hard, right?" (The rest of the team is silent, either because they agree, or they don't want to be seen as less competent.) And then you get another sprint when people plan too much, complete 50% of it, and get another stern talk about being more careful about making estimates.
It is a psychological trick that only works if you stop estimating stories in hours. It always breaks when someone insists on connecting the dots, converting the estimate to hours, and "fixing" it because it is "too much". If we could reliably estimate stories in hours, we wouldn't need story points, but experience shows we can't!
(But if you tell this to people, they will insist that they absolutely can make proper estimates, or that professional developers should be able to make proper estimates. Well, they can't, and we don't live in the should-universe.)
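The "plan what you actually completed last time" rule described above is easy to sketch. The function below is a hypothetical illustration of velocity-based planning, deliberately working only in story points and never converting to hours; the sprint history is invented for the example.

```python
from statistics import mean

def next_sprint_budget(completed_history, window=3):
    """Plan roughly as many points as the team actually *completed*
    (not planned) in recent sprints: a rolling average in story
    points, with no conversion to hours anywhere."""
    recent = completed_history[-window:]
    return round(mean(recent))

# Hypothetical points completed per sprint, oldest first.
history = [42, 55, 47, 50]
budget = next_sprint_budget(history)   # averages the last 3 sprints
```

The key property is that the budget self-corrects: if the team consistently overcommits, the completed totals (and therefore the next budget) shrink on their own, without anyone arguing about hours.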
> I live in the real world so I estimate in hours
Do you make your estimates in front of other people who sometimes second-guess them? How often do you actually meet your estimates?
I have never felt my individual productivity go up.
It feels like as I progress my individual work stays the same, but helping others eats any efficiency gains I personally make.
As if when you are new to a module, you are slow because you don’t know anything, then once you have expertise, you are slow because you know everything and are helping others.
> As if when you are new to a module, you are slow because you don’t know anything, then once you have expertise, you are slow because you know everything and are helping others.
This suggests that the proper way to keep team productivity high is to have all team members working on the product since the beginning, and treat them well so that they don't quit and don't have to be replaced by new ones. Maybe even start with slightly more people on the project than necessary, so if a few of them quit during the project for unrelated reasons, you can still finish the project with the remaining ones.
Probably not going to happen, because this goes against maximizing short-term productivity at the beginning of the project. The short-term productivity is maximized by having the team as small as possible, and only worrying about problems after they happen.
This is the only way you can scale your time, by ramping up others to be as efficient as you are. Although you might be becoming less of a contributor individually, you are enabling the larger group. This type of productivity can definitely be tracked based on how many people you have helped and their corresponding lineage of knowledge and work output.
The only way to do it I can think of: have two teams or individuals develop the same thing simultaneously and measure the time required to get a result of the same quality. This should be done over a longer term to take code quality into account (poor code quality slows down future development).
I did this once for a medium complexity task. The quality ended up the same because both developers had good taste. One developer took 2 hours for the job, the other took 2 weeks. And people don’t believe in 10x developers...
A few things can play havoc with this type of measurement. One is that the way we determine the "quality" of the code is based on the current scope of the project.
If the scope right now is pull a bunch of values out of spreadsheets and generate reports on them, the highest quality code would be the most terse: it looks up the files, get the information, then displays it. If tomorrow the scope changes to "do that, but in realtime, across multiple machines", the highest quality code is the one that implemented a database and REST API.
Since scope changes all the time, we can never evaluate which set of code is the highest quality.
I've had a number of projects to either add features or fix a bug in large volumes of truly weird (and sometimes jerkoff) code, COLT and JES3 being particularly flagrant examples. It can take weeks to find where the bad code is, and then less than half a dozen lines to fix the problem.
In just about any system of productivity metrics, these two episodes would mark me as dismally productive:
In the bank I was working for, the incidence rate of online banking mainframe reIPLs went from every few days to zero.
At a telecommunication provider, data center reIPLs similarly reduced.
This assumes direct managers want productive developers, which has not been my experience. The goal of managers is to increase the number of people they manage and get more money. I have time and again done things fast, only to have blocks put in place to slow things down: no one wants the job done quickly so everyone can go home early; where's the money in that? The inability to measure productivity is a direct result of this, imho.
One of the most useful programmer metrics that I've found is code churn: (new lines + deleted lines) / total changed lines. Instead of telling you how much work your programmers are doing, this metric tells you what kind of work your programmers are doing. Small numbers mean bug fixing (end of project and maintenance) and large numbers mean new development and features.
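Assuming per-commit line counts are available (e.g. from `git log --numstat`), the churn ratio described above might be computed as follows. Both the parsing and the formula are a sketch of the comment's own definition, not any standard tool's output.

```python
def parse_numstat(numstat_lines):
    """Sum added/deleted counts from `git log --numstat` style lines
    ("added<TAB>deleted<TAB>path"); binary files report '-' and are
    skipped."""
    added = deleted = 0
    for line in numstat_lines:
        a, d, _path = line.split("\t", 2)
        if a == "-" or d == "-":
            continue
        added += int(a)
        deleted += int(d)
    return added, deleted

def churn_ratio(added, deleted, total_changed):
    """Churn as defined above: (new lines + deleted lines) / total
    changed lines. Values near 1.0 suggest new development; low values
    suggest small, targeted edits such as bug fixes."""
    return (added + deleted) / total_changed if total_changed else 0.0
```

For example, a period with 5 lines added, 5 deleted, and 50 lines changed overall yields a churn of 0.2, which on this reading looks like maintenance-style work.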
What about high deletion amounts? I've merged PRs with hundreds of thousands of lines deleted and none added. It took quite a bit of sleuthing to figure out someone had left entire copies of directories side by side with different names, where one was completely unused. Conversely, someone had a huge addition to the repo that actually was total garbage.
You can measure productivity just fine on any tasks that repeat. How long does it take you to run the right tests, find the implementation for a failing test case, make a merge request, create a patch release, or pull up the logs during an incident? All these tasks repeat over and over again, and a good developer can do them much quicker.
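A minimal way to capture timings for such repeating tasks is a decorator that logs each run's duration under a task name; per-task trends can then be compared. This is an illustrative sketch, not a recommendation of any particular tool, and the task names are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# task name -> list of observed durations in seconds
task_times = defaultdict(list)

def timed(task_name):
    """Record how long each run of a repeating task takes, so trends
    can be compared per task rather than per person in isolation."""
    def wrap(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                task_times[task_name].append(time.perf_counter() - start)
        return inner
    return wrap
```

Usage would be decorating the scripted steps of a workflow (`@timed("patch_release")`, say) and watching whether the distribution of durations shrinks as someone gains experience with the task.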
Sure, it's easy. Count how many lines of code they write per day. Likewise, aeronautical engineering productivity can be measured by counting kilograms of mass added per day.
Most software projects get managed with a ticketing system that logs the work to be done as individual tickets. Counting the number of cards a developer closes over a certain period allows us to see what actual work is getting closed off.
Measuring closed tickets is an excellent metric if the tasks are written well and assigned based on business priority. When more tickets get closed, more good things are happening with the project, be that bugs getting closed off or features made.
Also ticket != value. Lots of tickets for things that involve almost no work and things that actually make a difference to the customer/product are not equal.
Everything I work on is new products/projects, and tickets come in all sizes and shapes, and often change daily as some exec crams in more new ideas or some designer or product person "clarifies" the ticket, even after the work is done. Tickets are often written and estimated long before decisions are actually made. Defects are written that require a lot of investigation, only to discover it's some other team's problem and you can't do anything, or it turns out to be a temporary service outage no one communicated, a misconfiguration in some CMS, or simply a misunderstanding of what the product does.
Measuring productivity by tickets closed is a whole pile of dead snakes.
What you are saying is that my colleague who has been working on one ticket, solving a critical bug, for the last 2 or 3 weeks is an unproductive one, since he hasn't closed a ticket for a while?
Counting closed tickets is indeed a measure for something, but by itself it's far from being a good indicator.
That assumes all tickets are the same. A developer might take on a very difficult task with a lot of hidden technical complexity that ties them up for weeks. Another might pick up little bugs and small text updates. With no other insight, the metric is meaningless. It becomes easy to game by avoiding any difficult and time consuming tickets - such as refactoring - as much as possible, instead picking the quick and easy things that make your metric look good.
Sorry, but I strongly disagree. In fact I’ll come out and say it’s perhaps one of the worst ways you can measure productivity.
At best you’re measuring _activity_, not productivity. You just turned a group of smart people into headless chickens, jumping on whatever ticket they can to look busy. That cultivates an environment of fear, which in turn kills deep thought and creativity... two essential ingredients for good software.
I could even argue that ticketing systems are the bane of good software, obscuring the real priorities... but that’s a rabbit hole I won’t go into here.
Instead I’d argue we shouldn’t be trying to measure developer productivity at all.
Productivity in software development is non-linear and difficult to assign individually.
How do you measure the productivity of that “lazy guy” that had an amazing shower thought one morning, implemented it by lunchtime, which in turn leads to the company making millions more by the end of the year?
Or what about the person on the team that spends most of their time supporting the rest of the team, unblocking them and helping them be productive?
Two examples of why we shouldn’t even be trying to measure developer productivity.
My own experience after 25 years in this industry is that the moment someone says “but how do we measure developer productivity?” is the moment that company’s software products begin a long, slow death.
Ultimately what development teams and companies (not individuals) should be measured on is _results_ that positively impact customers and business.
When the product is a success, no one cares about individual productivity.
I think there is some value to that metric but I would not call it excellent. A really good developer might come up with a slight requirement change or an engineering detour that makes many tickets meaningless, he might improve tooling such that things that took hours now take minutes, he might be able to write tests or come up with new processes that increase quality 10x. If we think of guys like Jeff Dean, the ability to close tickets written by project managers is not what makes them stand out.
dahart|5 years ago
bit_logic|5 years ago
reactor|5 years ago
BerislavLopac|5 years ago
Building software is a knowledge business, and there are three types of knowledge involved:
[0] https://en.wikipedia.org/wiki/Brooks%27s_law
[1] https://en.wikipedia.org/wiki/G._K._Chesterton#Chesterton's_...
freedomben|5 years ago
z3t4|5 years ago
kwanbix|5 years ago
unknown|5 years ago
[deleted]
sytse|5 years ago
I think that complexity is hard to measure and therefore easy to game.
At GitLab we only measure tasks completed, the number of changes that shipped to production, with the requirement that every change has to add value. This measure has been used throughout R&D https://about.gitlab.com/handbook/engineering/performance-in... to assess productivity for multiple years now with good success https://about.gitlab.com/blog/2020/08/27/measuring-engineeri...
When you tell new engineers about this target, they see a great opportunity to game it: just ship smaller changes. It turns out that smaller changes are quicker to ship, lead to better code and tests, have a lower risk of cancellation and of problems in production, and lead to earlier and better feedback.
Inspired by Goodhart’s Law I'll propose the following: A measure that when it becomes a target improves productivity. ~Sijbrandij's Law
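A minimal sketch of that kind of throughput measure (the record shape here is made up for illustration -- real data would come from the Git host's API, whose schema this doesn't claim to match):

```python
from datetime import date

def mr_rate(merged_mrs, author, start, end):
    """Merge-request throughput: merged changes per author in [start, end).

    `merged_mrs` is assumed to be a list of (author, merged_date) pairs.
    """
    return sum(1 for a, d in merged_mrs if a == author and start <= d < end)

mrs = [("alice", date(2020, 8, 3)), ("alice", date(2020, 8, 20)),
       ("bob", date(2020, 8, 5)), ("alice", date(2020, 9, 1))]
print(mr_rate(mrs, "alice", date(2020, 8, 1), date(2020, 9, 1)))  # 2
```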
BlargMcLarg|5 years ago
So we get young, naive engineers to focus on small changes. Cool, probably as it should be, you gotta start somewhere. And when these developers get hungry for bigger projects, when they get bored implementing the umpteenth small and by that point (for them) trivial change, how do you encourage them to tackle bigger technical problems? Those that lay the foundation for the new people to do their job more easily and on-board quicker? Or did you actually not tell us all, and you measure far more than just the number of changes?
jeffbee|5 years ago
voxl|5 years ago
brlewis|5 years ago
jeffbee|5 years ago
craftinator|5 years ago
sdesol|5 years ago
https://imgur.com/Yb8WvJY
Over 150 days, this developer's churn really fluctuates, and that's because they work on different things that require different amounts of code. And if you look at the following:
https://imgur.com/oNmsMSV
you can see they still commit regularly, but as the Reviewability section shows, their changes are mainly small ones, which sort of aligns with what Sid (sytse) mentioned about mainly focusing on small changes.
If you look at the bigger picture:
https://imgur.com/vmiOtgU
https://imgur.com/5vf3kWj
The churn for the project microsoft/vscode fluctuates quite a bit as well.
Based on what I've learned so far, you really need a good baseline (that can vary greatly from one developer to another) to be able to determine if somebody is more/less productive.
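One hedged way to operationalize that per-developer baseline -- compare each day's churn against that developer's own recent history rather than a global norm (the window size and threshold are arbitrary choices, not recommendations):

```python
from statistics import mean, stdev

def is_unusual_churn(history, today, window=30, sigmas=2.0):
    """Flag today's churn only if it falls more than `sigmas` standard
    deviations from this developer's own rolling mean."""
    recent = history[-window:]
    if len(recent) < 2:
        return False  # not enough history for a baseline yet
    return abs(today - mean(recent)) > sigmas * stdev(recent)

history = [0.4, 0.5, 0.6] * 10          # 30 days of fairly steady churn
print(is_unusual_churn(history, 0.55))  # False: within their normal range
print(is_unusual_churn(history, 0.95))  # True: well outside it
```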
bhuber|5 years ago
cratermoon|5 years ago
fizx|5 years ago
pingpongchef|5 years ago
learc83|5 years ago
Sarah and Bob make clocks, but sometimes they make hats, and sometimes they make screws, or hammers, or lamps. And sometimes the things they make get sold to customers, but sometimes other employees take them home, sometimes they make parts for each other to use when making bigger projects, and often they help each other and other employees out on unrelated projects. And sometimes they do repairs too. Oh yeah they also paint portraits that hang up around the office.
Try coming up with a measurement for their individual productivity that is easy enough to be useful, hard to game, and cheap enough to make it worth the price.
The first step is to figure out how to measure the value of all the stuff they make...
unknown|5 years ago
[deleted]
nikolay|5 years ago
nradov|5 years ago
A good manager can get a reasonable subjective sense of individual productivity but won't be able to quantitatively measure it.
mundo|5 years ago
tomatohs|5 years ago
> Productivity: the effectiveness of productive effort, especially in industry, as measured in terms of the rate of output per unit of input.
We can all agree "lines of code" is a shit metric, and we can't say "# of bugs closed," because each will have variable difficulty and value. Programmers employed by a business are in charge of automating repetitive tasks, not performing them (the classic measure of productivity).
I perform UX research on APIs. Here, we standardize the "output unit" and therefore can get a better idea of a developer's productivity. Every developer performs the same task, so we can simply measure time spent.
There will never be an ethical solution to measure developer productivity during the workday; this isn't Ford's assembly line.
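For what it's worth, once the task is standardized the analysis can stay simple -- something like this sketch, with entirely made-up timings in seconds:

```python
from statistics import median

def summarize_task_times(timings_s):
    """Summarize time-on-task for one standardized task: every
    participant does the same thing, so wall-clock time is comparable.
    Median is used because a single stuck participant skews the mean."""
    return {
        "n": len(timings_s),
        "median_s": median(timings_s),
        "range_s": (min(timings_s), max(timings_s)),
    }

print(summarize_task_times([312, 290, 455, 380, 301]))
# the spread between best and worst hints at where the API confused people
```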
QuercusMax|5 years ago
BrandonMarc|5 years ago
Bit of a cold scenario, but one way to game it out is hypothetically removing the current dev and then hiring someone better at double the pay.
Or, less unfair seeming, double the pay by hiring a second dev. That might not double productivity ... depending on the situation it might 1.5x it, or just as easily 4x it.
devonkim|5 years ago
Sometimes firing someone ironically gives them a wake-up call and they’ll do great in their next job. Sometimes they don’t learn, and sometimes they’ll never recover. Sometimes promoting people helps them; sometimes they become overwhelmed and performance drops again (I’m not speaking of the Peter Principle either).
throwaway0950|5 years ago
It hurt my career staying at a dead-end job but it gave me years of free time doing pretty much whatever I wanted.
kemiller2002|5 years ago
digitalsushi|5 years ago
908B64B197|5 years ago
If I pay above market rate I'll attract better devs for sure, and the caliber of folks in my hiring pipeline will get better. It's not obvious at all unless you know where to look. College is the prime example of that. If you pay better you'll have more new grads applying and they will prioritize you over other offers (unless you are an exceptionally prestigious employer). But even then, you'll never talk to the student who interned twice at FAANG and got a firm offer a year before graduation. You can get that guy only if you are willing to employ him at FAANG salary for two summers.
Employing these guys won't make my existing hires any better than they are in the immediate future.
However
Better hires lead to better teams. I find that certain developers have a multiplicative effect on other devs. They mentor, document, review, and help everyone grow. That might actually slow them down (taking half a day to explain high-level architecture to a lowly junior coder) until you realize the junior coder is now capable of answering questions from his teammates.
commandlinefan|5 years ago
Well... if you knew how to spot somebody twice as good as the one you have now, why didn't you hire that guy in the first place?
suprfnk|5 years ago
So what is a "higher quality of developer"?
HourglassFR|5 years ago
Will we then organize with other workers to create better working conditions for everyone or will there be fewer and fewer developers working with ever more powerful technology chasing richer than ever VCs?
908B64B197|5 years ago
If a company is willing to sacrifice engineering talent and institutional knowledge for short term gains... Good luck staying in business.
Reference: Every outsourcing project I've seen.
yourapostasy|5 years ago
I suggest you look to database model/schema standardization for an indication of how close this is coming to fruition. I personally can't measure developer productivity at a fine-grained level until requirements are stabilized, and I personally cannot stabilize requirements unless the domain is so well known that the data store is standardized. I had hoped SAP would lead the charge through empirically iterating towards standards, but they left out the huge small and mid-size business markets with what they use today. And what they use today is still far from industries' standards.
We're no closer to standardization than when I started in software decades ago. We don't even have standard means of storing, transforming, displaying and tracking metadata upon calendars, addresses, phone numbers, names, and lots of other ephemera I can rattle off, within a single stakeholder industry, not to speak of within the software industry in general. There have certainly been efforts to standardize like Silverston's, but they haven't caught traction.
I'd sure like to see that happen, because it would short-circuit a lot of the discussions I engage in with stakeholders down to only the site-specific requirements, where I really add business value. Instead, I have to derive the data model from intricate discussion of their requirements, since they themselves have not agreed upon the parts that are common across their respective industries, so I end up at the start of discussions with all sorts of little twisty pieces of a data model, all alike.
WalterBright|5 years ago
The problem, however, is that management is always being pushed to make objective measurements. For example, to fire someone, you have to first put him on an improvement plan with objective measurements. Otherwise, you're wide open to a lawsuit over discrimination, etc. You have to prove to a judge someone isn't performing, or that you gave raises based on performance.
Management also gets pushed into these attempts at objective measurement by the drive to optimize the numbers, the way it works so well for a manufacturing process.
xornox|5 years ago
Why must the productivity of developers be measured, but not the productivity of managers?
reallydontask|5 years ago
carapace|5 years ago
BrandonMarc|5 years ago
Also consider, frustrating though I'm sure that was, he probably still got paid for his effort in the 80%.
triceratops|5 years ago
1123581321|5 years ago
burade|5 years ago
NortySpock|5 years ago
Seems like a No True Scotsman fallacy to say only good technical leaders can tell if a developer is being productive and in the same breath say it's unmeasurable.
hours worked, bugs fixed, tickets closed, costs saved, clients saved, KPIs/OKRs hit, time in queue, hours-to-close-ticket, uptime, SLAs hit... surely some collection of indicators, while not a pure signal, would let you highlight outliers either above or below the curve.
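One way to sketch that combination: normalize each indicator to z-scores and average them, so no single unit dominates (the indicator names and equal weighting here are illustrative, not a recommendation):

```python
from statistics import mean, stdev

def zscores(values):
    m, s = mean(values), stdev(values)
    return [(v - m) / s if s else 0.0 for v in values]

def composite_signal(indicators):
    """Average per-indicator z-scores into one rough signal per person.

    `indicators` maps indicator name -> one value per developer, all in
    the same developer order. Equal weights are an arbitrary choice."""
    cols = [zscores(vals) for vals in indicators.values()]
    return [mean(col[i] for col in cols) for i in range(len(cols[0]))]

scores = composite_signal({
    "tickets_closed": [30, 25, 28, 5],
    "bugs_fixed":     [12, 10, 11, 2],
})
# the fourth developer shows up as the outlier below the curve
```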
unknown|5 years ago
[deleted]
mwigdahl|5 years ago
rajacombinator|5 years ago
siliconc0w|5 years ago
I did have the idea of directly tying value to a graph of code that enabled a certain user journey. Sorta like 'CUJ-coverage' instead of test coverage. So if a user spent $20 at checkout, every line of code that was touched to enable that user's journey would be credited with that $20. I think this would be an interesting metric I'd probably respect, but there are still probably a lot of blind spots this methodology doesn't capture.
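A toy sketch of that attribution idea (the journey data is fabricated; in practice the touched lines would have to come from a tracing or coverage tool):

```python
from collections import defaultdict

def attribute_revenue(journeys):
    """'CUJ-coverage' sketch: every line of code touched while serving a
    user journey gets credited with that journey's revenue."""
    credit = defaultdict(float)
    for revenue, touched_lines in journeys:
        for line in touched_lines:  # line is a (file, lineno) pair
            credit[line] += revenue
    return dict(credit)

credit = attribute_revenue([
    (20.0, [("checkout.py", 10), ("cart.py", 42)]),  # $20 checkout
    (35.0, [("checkout.py", 10)]),                   # $35 checkout
])
# checkout.py:10 is credited $55, cart.py:42 is credited $20
```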
qz2|5 years ago
digitalsushi|5 years ago
kemiller2002|5 years ago
anxiostial|5 years ago
orky56|5 years ago
jhunter1016|5 years ago
forbushbl|5 years ago
valenterry|5 years ago
awinter-py|5 years ago
have never been sure how summing together something that is supposed to have no relationship with time magically provides an estimate of anything
also not sure why teams are using the central source of truth for progress as the 'daily todo list making' tool
I live in the real world so I estimate in hours
lgunsch|5 years ago
> Velocity is an aggregate measure of tasks completed by a team over time, usually taking into account developers’ own estimates of the relative complexity of each task. It answers questions like, “how much work can this team do in the next two weeks?” The baseline answer is “about as much as they did in the last two weeks,”
If there's one thing the last 50 years of software development has conclusively proven, is that estimating the number of man months (hours) a project will take doesn't work.
Viliam1234|5 years ago
Most people dramatically underestimate the amount of time something requires. As long as you give them a clear conversion rate between story points and hours, they will estimate the task in hours -- incorrectly, despite having made the same mistake a hundred times in the past -- then convert the hours to story points and tell you the result.
Then someone notices that you have like 200 man-hours in the sprint, and you have only selected story points for 100 man-hours. Which in fact is perfectly okay, if you understand that the "100" is an underestimate, and the realistic estimate would actually be close to 200, so you should be happy about the plan! But most people will not get it, and they will insist on planning for the full 200 man-hours. If you don't have enough political power to stop them, they will make you plan for 200 man-hours.
Then at the end of the sprint, everyone is stressed out, and they only completed 50% of planned stories. Because they underestimated how much time the tasks would take... just like research shows humans always do, no matter how many times they got burned in the past, no matter how much you yell at them to make better estimates.
(By the way, the problem with making realistic estimates is not just that individuals suck at it, but also that social forces actively prevent it. Research shows that people who make more realistic estimates are considered less competent than their colleagues, precisely because everyone notices that their estimates are longer than they believe they should be. And no one later changes their opinion just because the estimate turned out to be correct. Like, really, people who estimated something to take 2 weeks and delivered it in 3 weeks were judged as more competent by managers than people who estimated it to take 3 weeks and delivered in 3 weeks. The former made a better impression at the beginning, and the latter didn't provide a better result at the end, so the former made a better overall impression. This is how human brains work.)
So the smart way out is to make a metric that is taboo to convert to hours. Give vague verbal descriptions, like 1 is "trivial", 2 is "fairly easy", 3 is "simple", 5 is "medium", 8 is "kinda difficult", 13 is "tricky", and 21 is "needs to be split to smaller stories". People will first feel weird about it, but then they get used to it, and they will start delivering consistent ratings... like, the kind of story that gets assigned 5 story points in January will probably also get assigned 5 story points in December.
Then all you need to do is calculate velocity, which is, well, the conversion rate between story points and hours. But you can't say that, or it will ruin the magic! You just say "during the last sprint, we implemented 50 story points, so for this sprint we will also plan 50 story points", and hope that people will accept that without making the conversion explicit. And it works...
...until someone says: "Hey wait, so if we have 200 man-hours and plan 50 story points, that actually means that 1 story point equals 4 hours, right? And why are we giving this specific story 3 story points? 12 hours sound too much to me, I am pretty sure we could do it in 8 hours, or even 4 hours if we work hard, right?" (The rest of the team is silent, either because they agree, or they don't want to be seen as less competent.) And then you get another sprint when people plan too much, complete 50% of it, and get another stern talk about being more careful about making estimates.
It is a psychological trick that only works if you stop estimating stories in hours. It always breaks when someone insists on connecting the dots, converting the estimate to hours, and "fixing" it because it is "too much". If we could reliably estimate stories in hours, we wouldn't need story points, but experience shows we can't!
(But if you tell this to people, they will insist that they absolutely can make proper estimates, or that professional developers should be able to make proper estimates. Well, they can't, and we don't live in the should-universe.)
> I live in the real world so I estimate in hours
Do you make your estimates in front of other people who sometimes second-guess them? How often do you actually meet your estimates?
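For what it's worth, the velocity arithmetic described above is tiny -- the whole point is to keep it in points and never convert to hours (the lookback window and the numbers below are arbitrary):

```python
def plan_next_sprint(points_per_sprint, lookback=3):
    """Velocity as described above: average story points completed over
    recent sprints, used as the budget for the next sprint, with no
    conversion to hours anywhere."""
    recent = points_per_sprint[-lookback:]
    return sum(recent) / len(recent)

history = [42, 55, 48, 51]        # points completed per sprint, fabricated
print(plan_next_sprint(history))  # plan roughly 51 points next sprint
```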
unknown|5 years ago
[deleted]
rileymat2|5 years ago
It feels like as I progress my individual work stays the same, but helping others eats any efficiency gains I personally make.
As if when you are new to a module, you are slow because you don’t know anything, then once you have expertise, you are slow because you know everything and are helping others.
Would be interesting to measure this somehow.
Viliam1234|5 years ago
This suggests that the proper way to keep team productivity high is to have all team members working on the product since the beginning, and treat them well so that they don't quit and don't have to be replaced by new ones. Maybe even start with slightly more people on the project than necessary, so if a few of them quit during the project for unrelated reasons, you can still finish the project with the remaining ones.
Probably not going to happen, because this goes against maximizing short-term productivity at the beginning of the project. The short-term productivity is maximized by having the team as small as possible, and only worrying about problems after they happen.
orky56|5 years ago
RivieraKid|5 years ago
ptr|5 years ago
craftinator|5 years ago
If the scope right now is to pull a bunch of values out of spreadsheets and generate reports on them, the highest quality code is the most terse: it looks up the files, gets the information, then displays it. If tomorrow the scope changes to "do that, but in realtime, across multiple machines", the highest quality code is the one that implemented a database and a REST API.
Since scope changes all the time, we can never evaluate which set of code is the highest quality.
Lorean|5 years ago
BXLE_1-1-BitIs1|5 years ago
In just about any system of productivity metrics, these two episodes would mark me as dismally productive:
In the bank I was working for, the incidence rate of online banking mainframe reIPLs went from every few days to zero.
At a telecommunication provider, data center reIPLs similarly reduced.
chadcmulligan|5 years ago
GartzenDeHaes|5 years ago
sixstringtheory|5 years ago
choeger|5 years ago
knaq|5 years ago
The real underperformers go negative.
sharker8|5 years ago
LeviIsaac|5 years ago
coldcode|5 years ago
2rsf|5 years ago
danjac|5 years ago
harryf|5 years ago
exlurker|5 years ago
zarkov99|5 years ago
908B64B197|5 years ago
Don't forget the weight.
I've seen single tickets taking weeks for bug investigation.
ExcavateGrandMa|5 years ago
[deleted]