top | item 30516413

What is developer productivity and how to measure it?

97 points| arthurcoudouy | 4 years ago |axolo.co

88 comments

lacker|4 years ago

I don't agree with most of the advice in this article but rather than complain let me suggest an alternative.

As a line manager, with software engineers reporting directly to you, you should be able to use your personal judgment to understand the productivity of your software engineers. Don't measure it with acronyms, with metrics like the number of commits, or by paying attention to how many hours a week people are working. Pay attention to whether people get things done, and are they getting big important things done, or only little nice-but-not-critical things. Make sure you communicate enough so that individual software engineers understand how you think and what you prioritize.

As a manager-of-managers, it is going to be very difficult for you to measure developer productivity. It's tempting to look at metrics like the number of code reviews a developer does. But these can at most be a sanity check, not the core metric to go for.

Instead, you can measure productivity of teams. Is the team getting things done, and are they big important things, or only little nice-but-not-critical things? Sometimes, a line manager will insist that everyone on their team is performing excellently, and yet you observe the team overall is not achieving very much. Probably one of the two of you is incorrect, and you should dig in to figure that out. The opposite also happens, where a manager states that everything is a disaster, but you observe that the team has actually delivered a lot.

The other thing you can do is to teach your line managers how to judge individual productivity. There's no silver bullet, it's just a natural outcome of having conversations about who is productive and who is not and how to tell and what to do about it, so be sure to have enough of those conversations.

None of this is easy to quantify, but the hard truth is, there is no natural mapping from numbers to developer productivity and it is usually a bad idea to try to quantify productivity. You are much better off using human language and intelligent thinking to evaluate productivity, rather than reductionist metrics.

blurker|4 years ago

I too have come to think that no simple metrics will ever replace the need for a competent manager who can use intangible, subjective context to evaluate their team. I think that even if you get some metrics that work well initially, the system will change such that the metric becomes the goal and the metrics then become much less effective.

arthurcoudouy|4 years ago

I can't agree more with you. I tried to sum my thoughts in my first reference

> One of the most common myths — and potentially most threatening to developer happiness — is the notion that productivity is all about developer activity, things like lines of code or number of commits. More activity can appear for various reasons: working longer hours may signal developers having to "brute-force" work to overcome bad systems or poor planning to meet a predefined release schedule.

The SPACE framework is not about measuring quantitative data only. I feel the need to explain how certain metrics might be interesting, but rather to identify key issues or unexpected events during engineering sprints. Without data analysis, you would not be able to understand why there is a drop of productivity during certain periods, and usually, those drops were created by the management (too many meetings or lack of follow-up)

passivate|4 years ago

I don't see how you can make the leap from "its hard to measure" to "no metrics are useful". As with anything you have to use your judgement, experience, and its a case-by-case thing. Everything is a signal. Lines of code, number of bugs fixed, number of bugs found, severity of bugs, hours in office, meeting project milestones, contribution in team meetings, etc, etc. Its up to you how to interpret each signal. As you rightfully said, there is no silver bullet.

jgust|4 years ago

There's an entire class of products I'll name "internal platform tools" whose primary objective is to improve the developer experience with the intent of having the side effect of increased developer productivity by making it easier & more enjoyable to build things within a company. The teams working on these tools need to understand how their products perform the same as a team building some widget for a "paying" customer.

Without some quantifiable metric, how do these teams know if their products are getting better or worse? The discussion always goes to measuring developer happiness & developer productivity because we want with some degree of confidence to be improving or at least maintaining these metrics.

giantg2|4 years ago

We just need to be careful that this unquantifiable, subjective rating doesn't include biases.

kelnos|4 years ago

Absolutely agree.

I think there is a conflict of interest there, though. Managers have a vested interest in saying that their team is highly productive. Managers of highly productive teams get raises and more head count, and eventually promotions. Anything else reflects poorly on the manager.

So the managers-of-managers do need to keep their eyes on this too, but I also agree with you that it's harder for people in that higher-level position to evaluate this. I guess, as you hint at, the manager-of-managers can look at team output overall, and if that's below expectations, that's a starting point for discussion with the line manager.

astrobe_|4 years ago

Funny. I read a book a long time ago about developer productivity. They started with saying: "measuring productivity witk KLoCs is terribad". And later on: but we only have that, so let's use it anyway. "Stopped reading there". And here in 2022 it's exactly The Same Thing:

> Measuring developer outputs can be detrimental

And then:

> Design and coding: The number of design papers and specs, work items, pull requests, commits, and code reviews, as well as their volume or count.

All they do is add more and more metrics. But this has the exact same problem as with the infamous KLoCs measure: how do you interpret it? How do you know it is not gamed, to begin with? Actually, now you have two problems: collecting and analyzing this mass of metrics can have a significant cost.

astrobe_|4 years ago

One more thing: "put your money where your mouth is".

Bug bounties work, why wouldn't "feature bounties" also work?

You say you want those features, preferably bug-free, for this deadline. And there's $5K for the team if the objectives are met.

Then, your metrics problem boils down to how to impartially measure customer satisfaction, or how well the objectives are met (in some contexts, bugs are unavoidable etc.).

Metrics can still be important to help the team identify their problems (or rather, confirm that their intuitions about the problem). It's an optimization problem: measure first, then do something about the actual bottlenecks.

That said, some programmers are such nerds that more money is not the highest motivation. One can use some creativity here.

arthurcoudouy|4 years ago

I'm personally not a huge fan of collecting quantitative data to evaluate engineering productivity. The context of such metrics is usually more important that the data itself, meaning using some discrepancies in your results to identify business needs or issues. When I work with quantitative data, I try to find pattern rather than analyzing the data iself (why do pull requests last longer on Monday afternoon? do we have too many meetings there?..)

lgleason|4 years ago

More lines of code = more liability. Also more commits, frequency of commits etc. does not equate to more productivity. In fact in some instances you actually are introducing more liability into a codebase doing that. The flaw with most of these metrics of "productivity" is that they inherently are saying that coding is analogous to a factory worker building something when in reality it is analogous to someone designing the things the factory worker has to assemble.

While I'm not a fan of subjectivity in ratings, the challenge is that it is very difficult, and I would argue, virtually impossible to do it objectively. So what happens instead is that when metrics are used to evaluate engineers, the smart ones figure out how to game them. Does that make them, or the team more productive? Nope. Can that have un-intended consequences that actually make the code less stable, and decrease productivity. Yup!

But if you're going to go with these measurements you might as well go big. Throw out anything related to Agile, require estimates that are accurate within 15 minutes and severely punish engineers for not getting estimates right. Might as well also add in heavy documentation requirements too. After all, this rigorous measurement etc. has all worked so well in the past <dripping sarcasm for this last paragraph>.

ryandvm|4 years ago

Measuring developer productivity is like observing quantum state, the act of measuring it generally fucks it all up.

Rather than task the developers with all manner of bureaucratic Agile bullshit like tracking hours, arguing about story points, submitting to kindergarten-style daily stand-ups, velocity tracking, retros, etc; I would suggest a different tact. How about measuring developer productivity by observing if they're building what you need at the rate you need it built. If not, then you need to figure out if you can afford to replace them with someone that can.

astrobe_|4 years ago

> Measuring developer productivity is like observing quantum state, the act of measuring it generally fucks it all up.

This is a much more entertaining version of Goodhart's law [1]

[1] https://en.wikipedia.org/wiki/Goodhart%27s_law

criticaltinker|4 years ago

> because productivity and contentment are linked, it's feasible that satisfaction can operate as a leading indicator of productivity; a drop in satisfaction and engagement could foreshadow impending burnout and lower output.

Great review of the hazards involved in quantifying developer productivity - the correlation above has been true everywhere I’ve ever worked.

If the company you’re working for:

- is not investing in improving the developer experience

- is not listening to developer complaints about slow, tedious, or error prone processes

- is perpetually pushing tech debt onto a backlog that only grows

Then chances are you work for a company whose leadership does not understand and value software engineering. They likely see it as a cost center, and they likely incentivize managers by rewarding initial delivery of projects, at the expense of maintainability and developer sanity.

I know I’m preaching to the choir, I just had to put it out there for all the young engineers. Don’t waste too much of your life and happiness trying to patch those sinking ships.

52-6F-62|4 years ago

Well put!

> your life and happiness

Also remember that these things are why you're here in the first place, and not to be the best Level 4 SWE Management Trainee in the trans-western division this quarter.

It's only healthy to maintain perspective. It can hurt in the short run sometimes, but there's only misery if you don't keep it.

vgordon|4 years ago

As many readers pointed out, team velocity and process bottlenecks are a much more valuable focal point than individual developer metrics. But for this, having the ability to observe what is happening and dig deeper into the data is critical, so that you can back iterations on your improvement efforts with data.

The research is also slowly laying the foundation of what are useful metrics to track and what excellence looks like for the industry. Unfortunately, those metrics are typically difficult to measure because the underlying data often spans multiple engineering systems: Lead Time, the poster child of DORA metrics, requires data from at least your source control and your CI/CD systems.

Btw, you might be interested in checking out Faros Community Edition: https://github.com/faros-ai/faros-community-edition – an open-source engineering operations platform we’ve been building for this very purpose. Our goal is to bring visibility into engineering operations, and make it very easy to query and leverage data both within and across your systems. It’s container-based and built on top of Airbyte, Hasura, Metabase, dbt, and n8n.

rm8x|4 years ago

this is neat, 'first review time' is definitely one of those softer metrics that can make a meaningful difference

thenerdhead|4 years ago

> Each organization can set a wide range of metrics to follow every week, such as:

> Number of commits.

> Average commit size.

> Frequency of code reviews.

> Number of code reviews.

> Time to review.

> and so on...

No. This has been tried many times and companies think this is how you measure productivity, but it is not even sustainable. Developer productivity is not about moving the needle, it is about outcomes, and not outputs.

An outcome is finally merging an unsustainable PR that has sat for a month. It is not how many comments, reviews, meetings, or commits needed to get to the outcome.

The only people I know who want to implement these terrible measurements are the type of people who have ambitions as large as Mount Everest but die on the decline back down. The real goal is to be more like a f1 pit crew where you leave out the metrics and end up performing better than if you measured them.

passivate|4 years ago

>Developer productivity is not about moving the needle, it is about outcomes, and not outputs.

What criteria does your team use to measure the outcomes and/or during a post-mortem?

> The real goal is to be more like a f1 pit crew where you leave out the metrics and end up performing better than if you measured them.

But all F1 pit crews have defined measurable metrics for success. I don't see the analogy here? Can you help me understand it?

tonyedgecombe|4 years ago

>An outcome is finally merging an unsustainable PR that has sat for a month.

Ultimately the desired outcome is money arriving in the bank. Which is even less fathomable.

drfuchs|4 years ago

There's a perhaps apocryphal story that, to avoid motivating programmers to pad out their comments, IBM decided to measure productivity not by number of lines of source code written, but rather by number of bytes of object code generated. And then when a new release of the PL/I compiler came out, management was quite pleased to learn that it had improved everyone's productivity significantly!

raygelogic|4 years ago

I really don't think you can measure developers by their productivity. the impact of productivity is predicated on design meeting requirements, the accuracy of requirements is predicated on stakeholders knowing what they need.

the only quality that matters is how effective the software is in its business function. how effective does it make stakeholders? how well does it capture engagement by users? the right question to ask changes in business context, but if you can't answer it, you might as well throw darts and flip coins. if you can measure the impact of their code before and after deployment you might have a chance, but it's probably hopeless.

as far as I can tell it boils down to a subjective and qualitative assessment of developer performance. you can also take the contrapositive: where would we be without this person? how long would we have taken to get there without them? what would we not have learned without this person?

I'm nervous about the implicit bias that comes with this kind of perspective, but I think it's the best we have for now.

t3e|4 years ago

There is no end to the search for a developer productivity metric, but it refuses to be found for reasons that are fairly obvious to technical people, but that doesn't stop people from trying – for decades. So now they've retreated to these vectors called "frameworks" that try to obscure with complexity the fact that they are not in any way able to "measure what matters" – in this case, the ratio of value output to value input – nor are in any way deserving of the term "metric". I contend that such non-measures are of absolutely no value to engineering managers; they're management theater and purely a distraction and a waste of time.

Let's leave aside for a moment that this piece begins with an impressively uninformed and circular definition – "Developer productivity, in general, refers to how productive a developer is during a specific time or based on any criteria." – and focus instead on the question of why does this stuff keep popping into existence; what's behind it?

As a tech exec who's researched and given several talks on this to large audiences of non-technical execs like CEOs and CFOs, I believe the root causes are an understandable and intense desire for "visibility" and exec accountability coupled with a set of false beliefs held by non-technical managers including "anything can be measured if you try hard enough" and "nothing can be managed unless it's measured" and the classic quantitative fallacy of "things that can be measured are more important than things that can't be". Besides, it's only fair that if the VPs of sales and marketing have to stand up and talk about funnel metrics and sales rep productivity (with real metrics like net new bookings divided by fully loaded sales rep cost) that the VP of engineering - an often enormous fraction of a SaaS company's budget – should be similarly held to account for some number, any number, we just need a number, so we can look for "trends" (actually, noise). It also seems to be driven by a push from HR for fairness in promotions and terminations, which is also totally understandable, yet misguided.

I have a wisecrack response to non-technical executives when discussing this which is "how do you measure your own productivity?" that helps them understand the absurdity of what they're trying to do and how common it is that no true measure of productivity exists. People really struggle to understand that some metrics, no matter how great it would be to have them, simply do not exist, and so we have this – measurement theater.

[edit: fixed typo]

elpakal|4 years ago

How about starting with mean time to merge a code change? I get that there are other variables that contribute to productivity but things like satisfaction, collaboration etc are extremely difficult to measure well and just IMHO pretty tangential (disclaimer - I work on a dev prod team and my entirety of last year was spent building engineering metric dashboards and discussing what to measure, and it's not easy so I get it).

ravenstine|4 years ago

While we're at it, let's measure productivity by lines of code added. /s

Using merge time is a terrible metric, perhaps worse so than deadlines because it can be more effectively weaponized. What would the incentive be for the developers other than to rush the code review process? Merges are not where you want to be rushing anything, but rather the opposite. If project deadlines are necessary, allowing code review its due time affords developers the ability to informally schedule things without sacrificing craftsmanship for what in reality is a vanity metric. Some code needs the be carefully considered and given time while other code doesn't necessarily need much review or worry at all, but no one can tell that by looking at mean time. If a developer is asked why some tasks had a longer than average mean time, then now they have to waste even more time by explaining themselves. In the worst case, the incentive to rush the review process results in more time wasted on bugs that could have been caught before they even had a chance to be merged.

Am I misunderstanding your view of how mean time to merge would be used?

sixstringtheory|4 years ago

Sounds good to me. Then I imagine the question is all about the kinds of changes that are being merged, which leads to: How many features can be produced per unit time? How many bugs are produced per feature that must be fixed therefore slowing the rate of feature production?

The questions of whether features are appreciated by users, or which bugs should be fixed or not, or if a product is feature complete or needs more, are questions of business, and not developer, productivity and efficiency.

And regarding documentation, I consider that an integral part of code/software that can be judged similarly w.r.t. quality and impact, having its own features and bugs.

blurker|4 years ago

Using a metric like mean time to code change could incentivize brute forcing work. Developers who are aware of the metric may work unsustainable hours and that could lead to burnout. Also, they may try to cut corners on tests, reviews or documentation in order to ship more things faster.

I think that qualitative metrics like satisfaction and collaboration could be helpful, especially when combined with traditional metrics like mean time to merge a code change. Taking my previous example of overworking or cutting corners to achieve high numbers, a qualitative metric for something like satisfaction might indicate a problem where a work output metric wouldn't.

But I think that any combination of metrics will be an oversimplification that could lead to problems if they are the only thing that matters. I'm not sure where the balance lies. I like that metrics can offer an objective view of performance and make it easy to spot trends. But I am wary of them oversimplifying things and dehumanising the team.

sdesol|4 years ago

> How about starting with mean time to merge a code change?

I'm currently experimenting with using "Mode time" as I think it is less susceptible to data skew from outliers. See example below:

https://oss.gitsense.com/insights/github?p=days-open&q=days-...

For popular open-source projects that I used in the link above, the Mode time to merge is less than a day which is quite good in my opinion. And as you can expect, if you look at larger pull requests (>=10 file changes) the overall percentage drops by half as the link below shows:

https://oss.gitsense.com/insights/github?p=days-open&q=pull-...

I think what the link above shows is, you can't just willy-nilly use merge time to measure productivity, since there are a multiple variables at play.

Full disclosure: The link that I referenced is my tool

ok_dad|4 years ago

> Measuring developer outputs can be detrimental because there are not enough data points to understand if the unproductiveness was caused by the developer himself, or by his surroundings/company.

How about just doing your best as an organization and as a people manager to make your developers happy and fulfilled? That increases productivity and motivation to succeed more than anything, IMO. Give great pay raises regularly, give a ton of time off, get rid of people managers who are jerks, etc. If your company has goals, and your developers aren't producing code to meet the goals, your goals are probably too high, you have too few developers, or your developers aren't motivated to complete the goals because they are being treated like shit or don't agree with the goals.

Management always wants to think that they are right in every decision and the employees are the ones who are unproductive, but after decades of working for "the man" in about 10 different industries in different positions/careers, I have found the fault lies with management 80 to 90 percent of the time due to some leadership failure or combination of failures. The problem is poor leadership and lack of motivation, no doubt in my mind. I've also led large groups of people (in the military) and by far the best thing I could do for them was make their personal and work lives better by not getting in the way and by not acting like a dickhead. Adding metrics to things just caused more useless work for me. You can't force change in a system via metrics, the only place where measurement changes outcome is in quantum physics.

I hate to go on a "capitalism vs. communism" type rant, but the best places I have ever worked, with the best "productivity", have been flat orgs where the developers and other employees are included in the decision making and the management and execs are open and caring and don't try to put profits and the business above the personnel. When everyone shared the success or failure of the company on equal terms, we could all get things done that were unthinkable.

anamax|4 years ago

> Management always wants to think that they are right in every decision and the employees are the ones who are unproductive, but after decades of working for "the man" in about 10 different industries in different positions/careers, I have found the fault lies with management 80 to 90 percent of the time due to some leadership failure or combination of failures.

Getting crap reviews/evaluations because a project failed due to management screwups is the universe's way of saying "you should have left long before, but leaving now is your best available alternative."

Much of this discussion is how to protect people from the consequences of staying in a bad situation.

The correct response is to not to try to fix things, but to leave. Staying only perpetuates the problems.

Starve the beast.

AtlasBarfed|4 years ago

Features, Schedule, Cost

Well, schedule and cost at least have straightforward measurements.

The issue then is features. Or is that it?

The "pick two" model really is just the business view. Invariably you will also have:

- adherence to process (ideally process would be an overall enhancement to productivity, but it usually becomes a net-negative)

- maintenance costs (patching, libraries, language versions, database versions)

- infrastructure churn and upkeep

- random org shit: meetings, more meetings, training, certifications, HR, ticket walls, etc

- documentation. Is that important?

- ... are the requirements known? settled? at least ballparked?

As I see here, invariably measuring developer productivity is of course blaming the victim: WHY AREN'T YOU MORE PRODUCTIVE, and of course shrugging away the nigh-unlimited ways an org can hamstring or frustrate a developer.

unknown|4 years ago

[deleted]

jnash|4 years ago

If you are manager of developers and it isn't clear to you who the core developers are and what each member of the team contribute (or not) then you should be fired for incompetency. Talking to developers, and keeping an eye on who does what and knowing the skills of each developer, is the minimum I would expect from a manager. If you can't do that then stop being a manager.

unknown|4 years ago

[deleted]

gerardnico|4 years ago

For every productivity metrics don't forget to balance it with a quality one.

The reality is always in the middle.

Example: If you solve a problem quickly that leads to support ticket being open that's not good.

CraigJPerry|4 years ago

I’m getting a chuckle at the hubris in the comments so far.

Possibly the world expert at this point (Dr Nicole Forsgren) in this exact topic comes up with a framework based on the best of what she knows from years of studying this and refining her approach.

Random HN commenter: ahh just measure time to commit.

Random HN commenter: biases are cool, so just use personal judgement.

RangerScience|4 years ago

Article is confusing because it presents SPACE with proper "header" styling opening the section, but then rolls right into DORA as if it's another paragraphs instead of an entirely different section.

(you're still not wrong tho haha)

nitwit005|4 years ago

I cannot find anyone suggesting using time to commit as a metric in the existing comments.

BXLE_1-1-BitIs1|4 years ago

Good research leads to good design resulting in a small amount of code in the right places that either:

Fix a bug, Add feature set Do something new.

My epithet for one programmer was "He writes a lot of code"

Extra code makes it harder for the next guy to figure out what's going on, and has more space for bugs.

But that came from a "productive" developer and that code can tie down a dozen maintainers in dozens of customer sites.

The productivity is job creation for a bunch of folks whose main ambition is finding a job where they don't have to work with crap code.

I've done a number of projects where I got rid of multiples of code compared what I put in.

The best example was where I replaced a subroutine with a single character constant.

whoomp12342|4 years ago

we(as an inudstry) measure it by claiming we do scrum but in actually just create pomp and circumstance and just do what we would do anyways. It gives us a number, we dont care if its accurate or effective

lgleason|4 years ago

I have yet to see engineers that are even close to being accurate with estimates because they are in essence inventing something new with a bunch of unknowns. Put another way, there are two types of engineers. Those who are bad at estimating who readily admit to it and those who lie about their skills at estimation.