Managing Interruptions and Cancelling a Sprint

One of the more frequent questions I get about Scrum is how to manage interruptions within a Sprint. Interruptions and ad-hoc requests are a fact of life. If your product has launched there are going to be customer queries, and if it hasn’t launched yet there’s a fair chance there will be questions from the sales and marketing teams or other stakeholders. There are a few options for dealing with these. You can refuse to answer any query or request for help, or you can be the supportive colleague we all want to be. The real trick is how to balance these queries while still delivering the work the team is being paid to do.

In the past I have offered a few suggestions on how to approach this. What always strikes me, though, is how many teams either accept or push back on a request without understanding the impact it will have.


Let’s consider each piece of work which comes into the team as a new Product Backlog Item. This could be a support request or a question about how a particular piece of functionality works. As with all work, it is the Product Owner who is ultimately responsible for maximising the value delivered by the team.

To do this we often need two pieces of information: what value does this PBI deliver, and how much effort will it entail? Now, there is no doubt that a level of pragmatism applies here. If the PBI is a quick email and won’t make any real difference to the Sprint, apply the two-minute rule and just do it. However, to avoid small-task whiplash I recommend turning off email and IM notifications and replying to them in bulk.

Let’s assume that a request for support is going to take more than two minutes. Perhaps the application is running slowly in production. For the PO to make an informed decision he or she needs to understand the impact on the end user: are we talking about a mild frustration (which could be looked at in the next Sprint) or the system being rendered unusable? We also need to understand how long it will take to resolve. Remember relative sizing: is this likely to be more or less complex than the issue we looked at last week?

The first question which must be asked is “Can we respond to this request without jeopardising our Sprint Goal?” If the answer is yes, then more often than not that’s exactly what the team should try to do. They should also use the Daily Scrum to continually reassess that assumption as the Sprint progresses.

However, let’s say the team can’t help the business and meet the Sprint Goal. What should they do? Abandon the business in their time of need, or give up on the Sprint Goal?

This is where the second question comes in: the urgency of the request. Let’s go back to our example of poor performance in production:

  • This is a really big deal and needs to be looked at immediately
  • It’s a moderate issue, we should finish off the current Sprint Goal and then change priorities next sprint
  • The current Product Goal is more important and we don’t intend to look at this any time soon

I will just stress this again: this is the Product Owner’s decision, no one else’s.

If the PO believes that this support request would deliver more value than hitting the Sprint Goal, and the team have agreed that they can’t do both, then the Sprint Goal has been rendered obsolete and the Sprint should be cancelled. The team should resolve the issue and then return to sprinting.

If the Product Owner decides to prioritise the work in a future Sprint (or not at all) then the work should be shown very clearly in the backlog. This is where transparency is key. Stakeholders grow frustrated when they request work and it vanishes into a black hole. When they can see exactly what the team is working on, and where their request sits in the priority list, stakeholders are able to engage with the PO and challenge priorities.

However the work is prioritised, team members shouldn’t feel pressured into juggling both the Sprint Goal and the requests. That’s a great way to burn out our engineers and put Sprint Goals at risk by hiding work in the gaps.

One of the biggest challenges teams face is when the volume of these ad-hoc requests means sprints become disrupted over and over again. Unless teams have a process in place for when new work comes in, they will never stand a chance of maintaining sprints and continuing to develop their product. When this process becomes embedded and easy to apply it creates stability and prevents teams being pulled in different directions.

How do you manage disruptions? Do you cancel your Sprints when the goal becomes obsolete? Do you keep your backlog transparent to avoid stakeholder frustration? Add a comment below or drop me a message on Twitter.

Measuring your Support Queue

Last week I wrote about how we use a dedicated “SWAT Team” to handle the inevitable unplanned work which threatens to creep in and disrupt our sprints. This week I want to talk about how we measure our SWAT Team, what KPIs we use and how we know whether we are doing a good job.

There are hundreds of posts out there which discuss the merits of KPIs for developers and how the wrong ones encourage behaviour such as cherry-picking or incorrect prioritisation. I agree with them entirely; that’s why it’s important that the measurements you do take reflect the customer experience rather than the team’s performance.

For example, counting the number of tickets solved would be a poor metric because only quick wins would be looked at. Equally, timing how long was spent on each ticket would encourage people to rush in order to get better stats. However, recording how long a customer waited (from raising a ticket to resolution) measures the customer experience – which is, after all, what we should be most concerned about!

In my team we use two main statistics to measure how well we’re performing. The first is one I’ve mentioned before: Average Support Ticket Age. It’s crucial for us because it places a numeric value on how long customers are currently waiting for our help.
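
As a rough illustration, here’s how that figure could be calculated. This is just a minimal sketch assuming a hypothetical `Ticket` record with `raised_at` and `resolved_at` timestamps; your ticketing tool will have its own names for these:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Ticket:
    raised_at: datetime                   # when the customer raised the ticket (hypothetical field)
    resolved_at: datetime | None = None   # None while the ticket is still open

def average_ticket_age(tickets: list[Ticket], now: datetime) -> timedelta:
    """Average age of all currently open tickets."""
    open_tickets = [t for t in tickets if t.resolved_at is None]
    if not open_tickets:
        return timedelta(0)
    total_age = sum((now - t.raised_at for t in open_tickets), timedelta(0))
    return total_age / len(open_tickets)
```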

The second metric we use is Cycle Time: the time between a ticket being passed to the team and it being resolved. In other words, how long is it taking us to solve the tickets passed to us? This is taken as an average over the tickets closed in the last six weeks.
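
Again as a sketch, with the same caveat that the field names are hypothetical, a rolling six-week average could look something like this:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Ticket:
    passed_to_team: datetime              # when the ticket reached our team (hypothetical field)
    resolved_at: datetime | None = None   # None while the ticket is still open

def rolling_cycle_time(tickets: list[Ticket], now: datetime,
                       window: timedelta = timedelta(weeks=6)) -> timedelta:
    """Average cycle time of tickets resolved within the last six weeks."""
    recently_closed = [t for t in tickets
                       if t.resolved_at is not None and now - t.resolved_at <= window]
    if not recently_closed:
        return timedelta(0)
    total = sum((t.resolved_at - t.passed_to_team for t in recently_closed), timedelta(0))
    return total / len(recently_closed)
```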

The reason we like these two values is not only that they give a customer’s perspective on our work, but that they balance each other nicely. If we measured Cycle Time alone then we’d get fantastic results by simply solving tickets as they come in, but longer-running and challenging tickets would be left behind. Equally, by focusing only on the older tickets it’s likely we’d miss urgent tickets and quick wins which could be resolved quickly. Only by continuously improving both values do we provide a good service.

You’ll notice that we don’t worry too much about the volume of tickets. I find this doesn’t actually matter; the number of tickets being raised varies from month to month and from customer to customer, and will change as customers leave and (far more happily) join us. If the team is becoming overloaded with tickets then this will be highlighted in the metrics we already have, as we won’t solve the tickets as quickly. A measure of tasks in the queue is less important than whether the team is keeping up with the required workload.

A final point to make is where we start and finish timing. If you’ve ever read The Goal you’ll know that one of Eli Goldratt’s key points is whether you are measuring the right thing (he points out that efficiency increases do not necessarily produce an increase in profit). The decision you have to make with your KPIs (particularly around when a ticket was opened) is whether to start your timer when the customer raises the ticket, or when it’s passed to your team. There is no perfect answer here. If you want to understand the entire customer journey then you need to look at customer-to-customer timings; however, if your team only plays a small part in that journey (as in our development team’s case, since we have several support teams before us in the process) then your metrics will be less valuable if they include areas outside your control. Consider what you’re measuring, but never forget that you may only play a small part in the customer’s overall journey.
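
To make that concrete, here’s a tiny example with made-up timestamps showing how far the two choices can diverge for the same ticket:

```python
from datetime import datetime

# Hypothetical timestamps for a single resolved ticket.
raised_by_customer = datetime(2021, 3, 1, 9, 0)   # customer opens the ticket
passed_to_team     = datetime(2021, 3, 3, 14, 0)  # it reaches our development team
resolved_at        = datetime(2021, 3, 4, 10, 0)  # we close it

# End-to-end ("customer-to-customer") timing covers the whole journey...
end_to_end = resolved_at - raised_by_customer     # 3 days, 1 hour
# ...while team-only timing covers just the part we control.
team_only = resolved_at - passed_to_team          # 20 hours

print(f"End to end: {end_to_end}; team only: {team_only}")
```

The same ticket looks very different depending on which timer you choose, which is exactly why the decision matters.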

Hopefully I’ve given you a few ideas. Do you agree with my views? How do you measure the performance of your support teams?