Measuring developer productivity
This will undoubtedly be the first of many times I talk about measurement, but let’s get this out of the way: the “productivity” we’re going to talk about here is a theoretical measure of an organization’s ability to convert investment into value. There’s no single actual metric you can look at and improve and suddenly you’ll know that your organization is twice as productive as it was before. I’m sorry.
This theoretical measure is a function of hundreds or even thousands of individual and fundamentally human factors and actions, and defining the guts of that function to understand exactly how to influence its output would be what I like to call “math-hard” vs. just “work-hard.”
For the sake of this space, let’s agree that developer productivity is also a theoretical measure, with the same constraints: it attempts to measure the value created by investing in a product development organization. In theory, every new investment in a product development organization should increase the value the organization can create, by more than the cost of that investment, over some acceptable period of time.
In reality, this isn’t a robot-powered factory we’re dealing with: we’re trying to quantify the value of the output of an organic, emergent, and fundamentally human system, one that is often affected by “outside” forces considered generally interruptive and immutable. In the course of trying to define developer productivity such that you can boil everything down to one number, you may look back and realize you’ve done very little to make anyone more productive.
But if you can’t measure it, you can’t improve it, so where does that leave us? Well, we can’t measure productivity, but we can measure factors that we believe are a step or two away from the mythical productivity metric — if the right set of people can agree that they matter. There’s also some amount of “you know it when you see it” here, aka sentiment. The immeasurability of it makes everyone uncomfortable, but the effusive thanks from happier engineers — and product managers happy to be shipping features faster — usually makes it easier to accept.
Frameworks for measuring ‘productivity’
There are a few frameworks that try to address this measurement challenge. They’re frameworks in the most fundamental sense of the word, sort of like a skeleton is a framework for a person while very much not being a person. All of these frameworks tell you where to look, but they don’t tell you what to do once you spot a problem, and only you can define what better will look like or how good might change over time.
I’ve worked with SPACE, and in the vicinity of something DORA-ish; there are others. Their names tend to be an acronym, which is your first clue that they’re not going to try to boil this down to one number, and neither should you.
DORA is the simplest, but that simplicity really leaves it up to you to figure out what to do about any problems you see, and to figure out whether they’re problems at all. DORA’s four metrics — deployment frequency, lead time for changes, change failure rate, and time to restore service — drive you to deploy <things> at speed and quality, which is hard to argue with on the surface, but gets interesting when we start to talk about what a <thing> is and isn’t. Pull requests are a thing — but what if you could eliminate the need for 20% of them with a new tool that product managers could use, also eliminating a ton of back-and-forth between PM and engineer to get the change right?1 The thing isn’t the pull request, the thing is the change itself, and it’s rare that companies can consistently even identify a thing, let alone track its progress through the system.
I like SPACE because you can fit DORA-ish things into it if you want, while also recognizing the complexity of a system that, let me remind you, is fundamentally human. It’s good at highlighting the tradeoffs of decisions that seem, on the surface, to be obviously positive or neutral for DORA-type measurements. SPACE is also good at keeping the door open for innovation, while I can see how a rigid embrace of DORA would bake in certain assumptions about your processes that might be worth questioning.
Where to start
If you’re just starting on your productivity journey, I’d think about the tradeoffs that SPACE highlights, but focus first on implementing DORA metrics and building the muscle for caring about them: they’re well defined and relevant to any company that’s shipping software. If you’re already tracking these basic metrics, SPACE provides good pointers to a broader understanding of productivity in your organization, but it’s a lot less prescriptive than DORA, so you’ll have to do some thinking on your own as well.
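If you do start with DORA, the mechanics are simple enough to sketch. Here’s a minimal illustration of computing the four DORA metrics from a list of deploy records — the record shape, field names, and sample data are all hypothetical, not from any particular tool; in practice you’d pull this from your deploy pipeline and incident tracker:

```python
from datetime import datetime

# Hypothetical deploy records: (commit_time, deploy_time, caused_incident, hours_to_restore)
deploys = [
    (datetime(2024, 1, 1, 9),  datetime(2024, 1, 2, 9),  False, None),
    (datetime(2024, 1, 3, 10), datetime(2024, 1, 3, 18), True,  2.0),
    (datetime(2024, 1, 5, 8),  datetime(2024, 1, 6, 8),  False, None),
    (datetime(2024, 1, 8, 9),  datetime(2024, 1, 8, 12), False, None),
]

window_days = 7  # reporting window for this batch of deploys

# Deployment frequency: deploys per day over the window
deploy_frequency = len(deploys) / window_days

# Lead time for changes: median hours from commit to deploy
lead_times = sorted((d - c).total_seconds() / 3600 for c, d, _, _ in deploys)
median_lead_time_hours = lead_times[len(lead_times) // 2]

# Change failure rate: fraction of deploys that caused an incident
failure_rate = sum(1 for _, _, failed, _ in deploys if failed) / len(deploys)

# Time to restore service: mean hours to recover from failed deploys
restore_times = [r for _, _, failed, r in deploys if failed]
mean_restore_hours = sum(restore_times) / len(restore_times) if restore_times else 0.0

print(f"deploys/day: {deploy_frequency:.2f}")
print(f"median lead time (h): {median_lead_time_hours:.1f}")
print(f"change failure rate: {failure_rate:.0%}")
print(f"mean restore time (h): {mean_restore_hours:.1f}")
```

The hard part isn’t the arithmetic — it’s agreeing on what counts as a deploy, a change, and a failure in your organization, which is exactly the <thing> problem above.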
All of these frameworks tell you where to look, but they don’t tell you what to do, or how to increase your likelihood of success as you pursue better. I’ll spend a lot of these posts talking about productivity-adjacent metrics, choosing them, capturing them, driving organizational alignment, and some other hard-earned lessons. In the meantime, these articles have been incredibly instructive in my own learning:
Measuring Engineering Productivity, an excerpt from Software Engineering at Google by Ciera Jaspan, lays out more of the rationale for attempting to measure productivity, talks about QUANTS and the goals/signals/metrics approach, discusses when it’s absolutely not worth trying to measure a thing, and highlights the importance of qualitative data to complement quantitative data.
Measuring Engineering Efficiency at LinkedIn (paywall) shares the experience of Max Kanat-Alexander. Don’t miss the last section, which talks about how company culture can make this a particularly challenging problem to solve.
We didn’t eliminate 20% of pull requests, but at Indeed, we did build a tool that let country managers test changes to text strings in the user interface without needing the help of an engineer. Not only did this speed the delivery of the thing, it also allowed us to test things that never would have been tested at all, due to finite engineering resources.