Measuring your Support Queue

Last week I wrote about how we use a dedicated “SWAT Team” to handle the inevitable unplanned work which threatens to creep in and disrupt our sprints. This week I want to talk about how we measure our SWAT Team, what KPIs we use and how we know whether we are doing a good job.

There are hundreds of posts out there which discuss the merits of KPIs for developers and how the wrong ones encourage behaviour such as cherry picking or incorrect prioritisation. I agree with them entirely, which is why the measurements you do take should reflect the customer experience rather than the team’s performance.

For example, counting the number of tickets solved would be a poor metric because only quick wins would be looked at. Equally, timing how long was spent on each ticket would encourage people to rush in order to get better stats. However, recording how long a customer waited (from raising a ticket to resolution) measures the customer experience, which is, after all, what we should be more concerned about!

In my team we use two main statistics to measure how well we’re performing. The first is one I’ve mentioned before: Average Support Ticket Age. It is crucial for us because it places a numeric value on how long customers are currently waiting for our help.

The second metric we use is Cycle Time: the time between a ticket being passed to the team and it being resolved. In other words, how long is it taking us to solve the tickets passed to us? This is taken as an average over the tickets closed in the last six weeks.
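To make the calculation concrete, here is a minimal sketch of how a six-week Cycle Time average could be worked out. The ticket layout (a dict with `assigned_at` and `resolved_at` timestamps) and the function name are my own assumptions for illustration; your ticketing system will have its own field names.

```python
from datetime import datetime, timedelta


def average_cycle_time(tickets, now, window=timedelta(weeks=6)):
    """Mean time from hand-off to resolution for tickets closed in the window.

    `tickets` is assumed to be a list of dicts with hypothetical keys
    "assigned_at" (when the ticket reached the team) and "resolved_at"
    (None while the ticket is still open).
    """
    cutoff = now - window
    durations = [
        t["resolved_at"] - t["assigned_at"]
        for t in tickets
        if t["resolved_at"] is not None and cutoff <= t["resolved_at"] <= now
    ]
    if not durations:
        return None  # nothing closed in the window, so no average to report
    return sum(durations, timedelta()) / len(durations)


now = datetime(2024, 3, 1)
tickets = [
    # Closed inside the window: 2 days and 4 days of cycle time.
    {"assigned_at": datetime(2024, 2, 10), "resolved_at": datetime(2024, 2, 12)},
    {"assigned_at": datetime(2024, 2, 20), "resolved_at": datetime(2024, 2, 24)},
    # Closed long before the window: ignored.
    {"assigned_at": datetime(2023, 11, 1), "resolved_at": datetime(2023, 11, 30)},
    # Still open: ignored by Cycle Time (it shows up in ticket age instead).
    {"assigned_at": datetime(2024, 2, 25), "resolved_at": None},
]
cycle_time = average_cycle_time(tickets, now)
```

Note that open tickets never appear in this number, which is exactly why it needs a second metric alongside it.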

The reason we like these two values is not only that they give a customer’s perspective on our work, but also that they balance each other nicely. If we measured Cycle Time alone then we’d get fantastic results by simply solving tickets as they come in, but longer running and more challenging tickets would be left behind. Equally, by focusing only on the older tickets we would likely miss urgent tickets and quick wins. It’s only by continuously improving both values that we provide a good service.

You’ll notice that we don’t worry too much about the volume of tickets. I find this doesn’t actually matter: the number of tickets being raised varies from month to month and from customer to customer, and will change as customers leave and (much more ideally) join us. If the team is becoming overloaded with tickets then this will be highlighted in the metrics we already have (as we won’t solve the tickets as quickly). A measure of tasks in the queue is less important than whether the team is keeping up with the required workload.

A final point to make is where we start and finish timing. If you’ve ever read The Goal you’ll know that one of Eli Goldratt’s key points is whether you are measuring the right thing (he points out that efficiency increases do not necessarily produce an increase in profit). The decision you have to make with your KPIs (particularly around when a ticket counts as opened) is whether to start your timer when the customer raises the ticket, or when it’s passed to your team. There is no perfect answer here. If you want to understand the entire customer journey then you need to look at Customer to Customer timings. However, if your team only plays a small part in that journey (as in our development team’s case, since we have several support teams before us in the process) then your metrics will be less valuable if they include areas outside your control. Consider what you’re measuring, but never forget that you may only play a small part in the customer’s overall journey.
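The difference between the two starting points is easy to see if you record both timestamps and compare them. This is only a sketch; the three timestamp names are assumptions of mine rather than fields from any particular ticketing tool.

```python
from datetime import datetime


def ticket_timings(raised_at, assigned_at, resolved_at):
    """Split one ticket's life into the parts each clock would measure.

    All three arguments are hypothetical timestamps: when the customer
    raised the ticket, when it reached our team, and when it was resolved.
    """
    return {
        # What the customer actually experienced, end to end.
        "customer_wait": resolved_at - raised_at,
        # The part our team controls.
        "team_cycle_time": resolved_at - assigned_at,
        # Time spent with the teams ahead of us in the process.
        "upstream_time": assigned_at - raised_at,
    }


timings = ticket_timings(
    raised_at=datetime(2024, 1, 1),
    assigned_at=datetime(2024, 1, 4),
    resolved_at=datetime(2024, 1, 6),
)
```

In this made-up example the customer waited five days, but only two of them were in our hands. Which number you report depends on whose performance you are trying to understand.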

Hopefully I’ve given you a few ideas. Do you agree with my views? How do you measure the performance of your support teams?

Monitoring Support Ticket Age

There are various ways you can monitor how effective your support process is. You can record the number of tickets opened and closed, you can keep track of queue counts, or you can ask your customers!

When I develop a KPI I want a simple numerical value, something I can examine, track, and use to decide whether we are improving or struggling.

One of my favourite statistics to measure is the average age of tickets in the queue. The benefits of this are:

  • You can calculate it at any time and don’t need to measure tickets added and closed for a prolonged period of time.
  • You can get a feel for how long tickets are sat waiting for a resolution.

In other words it gives a rough value of how long a customer can expect to wait before you respond to them.
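As a rough illustration of how little data this needs, here is a minimal sketch that works from nothing more than the raised time of each ticket still in the queue. The function name and the sample dates are made up for the example.

```python
from datetime import datetime, timedelta


def average_ticket_age(raised_times, now):
    """Mean age of the tickets currently open in the queue.

    `raised_times` is assumed to hold one timestamp per open ticket.
    A snapshot of the queue is enough; no history is required.
    """
    ages = [now - raised for raised in raised_times]
    if not ages:
        return None  # an empty queue has no average age
    return sum(ages, timedelta()) / len(ages)


now = datetime(2024, 1, 10)
open_tickets = [
    datetime(2024, 1, 8),  # 2 days old
    datetime(2024, 1, 4),  # 6 days old
    datetime(2024, 1, 9),  # 1 day old
]
average_age = average_ticket_age(open_tickets, now)
```

Here the three open tickets average out at three days old.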

However, as with all measurements, the act of measuring it can change the result. In our world this can have a very direct effect: if you only focus on how old tickets are, you encourage a First-In-First-Out approach (as Developers and Support Analysts fight over the oldest tickets in an effort to bring down the average). This can undermine your attempts at ticket triage and prioritising.

This is a known risk when working out KPIs for your team. KPIs are often easy to manipulate (in our example, a developer choosing to pick up an old P4 ticket over a new P1). It’s a risk, yes, and any skilled problem solver will quickly work out how to game this measurement if they need to. However, I would hope you have enough confidence in your team to trust that they will not ignore urgent tickets for the sake of an arbitrary numerical goal.

What’s important is identifying the behaviours you want to encourage in your team. One of my first priorities in any role is to give customers a good level of service (and make sure they aren’t kept waiting too long for answers and support). If I want to achieve that, I need a value which helps me record it. Until I find a better one, ticket age is my KPI of preference.

What KPIs do you use? Make your suggestions in the comments below…