- The 1st Question
- The 2nd Question
- The Background
- The Metrics
- The Tools
- The Prices
- The Conclusion
The 1st Question
"Q: How can we measure a software development team's performance?"
But then, of course:
"Q: Which metrics?"
Not so easy.
|Metric|Verdict|
|---|---|
|Lines Of Code|Bad. We all know this. Too many lines = bloat, too few = complexity.|
|Commits|Ugly. Could be gamed, very simply. Unclear value.|
|Velocity|Ugly. Could be gamed. Is relative.|
|Utilization|Ugly. 100% isn't a good thing.|
|Features|Ugly. Lots of features being released isn't always what you want.|
Etc. There are lots of bad ways to track software development performance.
Lucky for us, the hard work (and hard research) has already been done by other people.
The 2nd Question
So we can, but...
"Q: Why are we measuring performance?"
So teams can improve, and so we have the tools to communicate the impact of changes and decisions on those teams.
- Did the re-org last month help us?
- How are we doing this month vs last?
- Is code debt catching up to us?
Etc. Without metrics, it becomes a more subjective discussion. We love numbers.
Also, the right metrics give engineers a way to communicate the things that are harder to value. Root causes, technical debt, training, innovation and investments can all be measured and valued, if we have the tools to do so.
The Background
A book published in 2018 focuses on this question and answers it:
Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations
It says good measures have 2 characteristics:
- They focus on a global outcome - so teams are not misaligned and instead push towards the same goal.
- They focus on outcomes, not output - busy work isn't helpful.
The book has a LOT more than this, and it is well worth reading if you haven't already.
The Metrics
|Metric|Definition|
|---|---|
|Cycle Time|Time it takes a task to go from the start of development to being live in production.|
|Change Failure Rate (CFR)|The percentage of deployments we have to roll back or apply a hotfix to.|
|Mean Time To Recovery (MTTR)|Hours elapsed from the start of a service failure to system recovery.|
|Deployment Frequency|How often we deploy to production.|
The Accelerate book goes into a lot of detail about why these are good, how they correlate to high performing teams and benchmarks for team categorization. That's outside the scope here - step 1 is to track them.
Tracking and optimizing for these metrics can have a profound impact on product delivery. They are also an excellent way to have those harder conversations with stakeholders around things like technical debt, re-platforming or other larger non-feature-led work.
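To make the four definitions concrete, here is a minimal sketch of how they could be computed from a list of deployment records. The `Deployment` record shape and its field names are my own assumptions for illustration, not something the book or any tool prescribes, and I'm using a simple mean where you may prefer a median.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    work_started: datetime                   # when development on the task began
    deployed_at: datetime                    # when the change went live
    failed_at: Optional[datetime] = None     # start of a failure caused by it, if any
    recovered_at: Optional[datetime] = None  # when the service was restored


def accelerate_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize the four metrics over one reporting period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    hour = timedelta(hours=1)
    failures = [d for d in deployments if d.failed_at is not None]
    # Only failures that have already been resolved count towards MTTR.
    recovery_hours = [
        (d.recovered_at - d.failed_at) / hour
        for d in failures
        if d.recovered_at is not None
    ]
    cycle_hours = [(d.deployed_at - d.work_started) / hour for d in deployments]
    return {
        "cycle_time_hours": mean(cycle_hours),
        "change_failure_rate_pct": 100 * len(failures) / len(deployments),
        "mttr_hours": mean(recovery_hours) if recovery_hours else None,
        "deploys_per_week": len(deployments) / (period_days / 7),
    }


# Made-up sample data: four deployments over two weeks, one of which failed.
deployments = [
    Deployment(datetime(2021, 3, 1, 9), datetime(2021, 3, 3, 9)),
    Deployment(
        datetime(2021, 3, 2, 9),
        datetime(2021, 3, 3, 9),
        failed_at=datetime(2021, 3, 3, 10),
        recovered_at=datetime(2021, 3, 3, 12),
    ),
    Deployment(datetime(2021, 3, 8, 9), datetime(2021, 3, 9, 9)),
    Deployment(datetime(2021, 3, 10, 9), datetime(2021, 3, 12, 9)),
]
metrics = accelerate_metrics(deployments, period_days=14)
print(metrics)
```

The hard part in practice is not the arithmetic but getting reliable timestamps for each event, which is where the tooling question comes in.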
The Tools
OK, great. How do we track this? It will depend. Each metric may need its own solution.
|Option|Trade-off|
|---|---|
|Spreadsheet|Can work, but it could be a lot of work.|
|Ticketing system|To the extent it supports it, it's good, but likely you won't get exactly what you need.|
|Dedicated Solution|Will give you the exact data, but you pay for it - is it worth it?|
A spreadsheet should be a last resort only; look to automate things instead. It can be an OK place to start, or a fallback if tracking any other way is even more time-consuming. But generally, you want to avoid this.
A ticketing system can get quite close with the data it has. JIRA, for example, offers a range of reports (as would other tools). The closest in JIRA is the "Control Chart".
You get cycle time, which is a lot of value, plus interesting (if a bit confusing) graphs. The other metrics require additional data (e.g. deployments, failures, etc) which this graph doesn't show and the tool itself may not have access to.
For some teams, the Control Chart + spreadsheet could be enough.
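If you do go the ticketing-system-plus-spreadsheet route, the spreadsheet half can be as small as the sketch below. It assumes a hypothetical CSV export with `key`, `started` and `released` columns holding ISO dates; your tool's real export will almost certainly use different column names and formats.

```python
import csv
import io
from datetime import datetime
from statistics import median

# Hypothetical export: one row per ticket, ISO dates, empty if not yet released.
EXPORT = """key,started,released
APP-1,2021-03-01,2021-03-04
APP-2,2021-03-02,2021-03-03
APP-3,2021-03-02,2021-03-09
APP-4,2021-03-05,2021-03-07
APP-5,2021-03-08,
"""


def cycle_times_days(export_text: str) -> dict[str, int]:
    """Cycle time in whole days per released ticket; unreleased ones are skipped."""
    result = {}
    for row in csv.DictReader(io.StringIO(export_text)):
        if not row["released"]:
            continue  # still in progress, so no cycle time yet
        started = datetime.fromisoformat(row["started"])
        released = datetime.fromisoformat(row["released"])
        result[row["key"]] = (released - started).days
    return result


times = cycle_times_days(EXPORT)
slowest = max(times, key=times.get)
print("median cycle time (days):", median(times.values()))
print("slowest ticket:", slowest, times[slowest])
```

The median is used here because one slow ticket can drag a mean a long way; that is a choice on my part, not a rule.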
Accelerate metrics are popular, so people have built dedicated solutions to automate, improve and display the metrics.
There are quite a few companies which do this. Most tie into your ticketing and source control systems to extract the needed data. A quick survey of the marketplace gave me the following services:
|Service|URL|Description (in their words)|
|---|---|---|
|LinearB|https://linearb.io/|We correlate and reconstruct Git, project and release data so you get real-time project insights and team metrics with zero manual updates or developer interruptions.|
|Haystack|https://www.usehaystack.io/|Remove bottlenecks, optimize process, and work better together with insights from your Github data.|
|Pluralsight - Flow|https://www.pluralsight.com/product/flow|Accelerate velocity and release products faster with visibility into your engineering workflow. Flow aggregates historical git data into easy-to-understand insights and reports to help make your engineer teams more successful.|
|Waydev|https://waydev.co/|Waydev analyzes your codebase, PRs and tickets to help you bring out the best in your engineers' work.|
|CodeClimate|https://codeclimate.com/|Velocity turns data from commits and pull requests into the insights you need to make lasting improvements to your team’s productivity.|
Plenty of options, and they provide very similar services (each with its own take, pros and cons).
They all provide some or all of the Accelerate metrics, but their ability to fit your team is something you will need to test for yourself.
Luckily, integration with these services seems to be VERY fast: you hook up your version control and, optionally, your ticketing system, and they provide instant value/data.
The Prices
Pricing isn't simple. Each SaaS solution has different price breaks and ways of pricing/scaling/segmenting, but all seem to price per dev.
For my needs, I'm thinking about total yearly cost, so I did some quick maths.
The following is based on a tier which includes being linked to a ticketing system to maximize the features/metrics available.
I went with a Fibonacci increment to cover most cases. You can copy the sheet if needed.
As you can see, there is quite a difference in price, with the cheapest options depending on team size.
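If you would rather redo the maths in code than in a sheet, a minimal sketch could look like the one below. The two plans and their prices are made up purely for illustration and are not any vendor's real pricing; the all-or-nothing free tier is also an assumption, since real price breaks vary.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """A per-developer SaaS plan (hypothetical numbers, not real vendor pricing)."""
    name: str
    monthly_per_dev: float  # price per developer per month
    free_devs: int = 0      # team size up to which the plan is free (assumed all-or-nothing)


def yearly_cost(plan: Plan, devs: int) -> float:
    """Total yearly cost for a team of `devs` developers."""
    if devs <= plan.free_devs:
        return 0.0
    return plan.monthly_per_dev * devs * 12


def cheapest(plans: list[Plan], devs: int) -> Plan:
    """The plan with the lowest yearly cost for this team size."""
    return min(plans, key=lambda p: yearly_cost(p, devs))


TEAM_SIZES = [1, 2, 3, 5, 8, 13, 21, 34, 55]  # Fibonacci increments

PLANS = [
    Plan("Service A", monthly_per_dev=20.0, free_devs=8),  # made-up free tier
    Plan("Service B", monthly_per_dev=15.0),               # made-up flat price
]

for devs in TEAM_SIZES:
    costs = ", ".join(f"{p.name}: {yearly_cost(p, devs):,.0f}" for p in PLANS)
    print(f"{devs:>2} devs -> {costs} (cheapest: {cheapest(PLANS, devs).name})")
```

With these made-up numbers the cheapest option flips once the team outgrows the free tier, which is the same effect the sheet shows.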
The Conclusion
- Performance of software teams can and should be measured
- Accelerate metrics (https://amzn.to/3e9EPcK) are the industry standard
- Tools can help, but won't cover every case
- Small teams can get services for free, with low adoption time
- Larger teams should probably check whether they get value, carefully consider TCO (total cost of ownership), and start with a trial or the cheapest option before going all in on a large additional spend. It can add up FAST.
What did I do?
Currently, I'm testing LinearB with several teams, which so far have seen good value from the data.