The Cost of Fixing Bugs Late

Have you ever been in a situation where you’ve written a piece of code and it’s almost there, but there are a few niggling issues which “can wait for V2”? Have you ever wished, maybe months later, that you’d gone back and resolved them at the time? You’re not alone!

More and more developers and managers are beginning to realise the true cost of fixing bugs late, not only to their sanity but to the company’s time and money.

Let’s say, for the sake of argument, that you’ve finished a piece of code except for a few edge cases. You’re under a lot of time pressure, so you decide it’s of a high enough quality and push it through to QA. The cost to the business of fixing those edge cases increases dramatically the longer they’re left.

Let’s look at a few scenarios:

You fix the issue immediately

  • Fix time – 30 minutes

You send to QA and it’s rejected there

  • Fix time – 30 minutes
  • Build time – 2 hours
  • QA deployment time – 1 hour
  • QA signoff testing – 24 hours
  • Fix time – 30 minutes

QA passes as an “edge case” but the customer disagrees

  • Fix time – 30 minutes
  • Build time – 2 hours
  • QA deployment time – 1 hour
  • QA signoff testing – 24 hours
  • Deploy to UAT – 3 hours
  • Customer UAT – 1 week

Missed in UAT but discovered as a significant issue in Live

  • Fix time – 30 minutes
  • Build time – 2 hours
  • QA deployment time – 1 hour
  • QA signoff testing – 24 hours
  • Deploy to UAT – 3 hours
  • Customer UAT – 1 week
  • Customer live deploy – 3 hours

 

As you can see, even with these rough estimates the time it takes to resolve the customer’s issue increases dramatically the longer it’s left: a fix that takes half an hour if made immediately becomes more than a day of elapsed process once it bounces back from QA, and well over a week once UAT and the customer are involved. Not only that, but the cost to the business of paying staff to run extra QA cycles or rounds of UAT spirals out of control. And we haven’t even considered the final case, where the bug goes to the queue to die and a completely different developer has to learn the feature before they can resolve the problem.

The graph is actually a very common shape*.

[Graph: the cost of fixing a bug rising steeply the later in the process it is found]

I’m a big fan of practicality; no software is going to be perfect, and it’s unrealistic to expect that you’re going to find and fix each and every issue before shipping to a customer.

However, hopefully this has made you stop and think, and has provided a strong argument for taking that extra 30 minutes to resolve the issue before it causes a real headache for the business!

*Thanks to fooplot.com for the graphing software.

When and Where to Automate Testing

A year ago I undertook an interesting piece of R&D to write Selenium tests for our main UI. I watched the Pluralsight course, learned the difference between WebDriver and the IDE, and started building my Page Object Model. My simple test took the best part of three weeks to build and executed in around half the time a decent QA would take if you gave them a double shot of espresso.
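
For anyone unfamiliar with the pattern, a page object is just a class that wraps the Selenium calls for a single screen and exposes them as user actions. A minimal sketch of the kind of thing I was building is below; the LoginPage class and its element IDs are invented for illustration, and all of the waits and frame handling that caused the real pain are left out.

    using OpenQA.Selenium;

    // A minimal page object: one class per screen, exposing user actions rather
    // than raw WebDriver calls. The LoginPage and its element IDs are made up for
    // illustration; the real pages also needed explicit waits and frame switching,
    // which is where the complexity crept in.
    public class LoginPage
    {
        private readonly IWebDriver _driver;

        public LoginPage(IWebDriver driver)
        {
            _driver = driver;
        }

        public void LogIn(string username, string password)
        {
            _driver.FindElement(By.Id("username")).SendKeys(username);
            _driver.FindElement(By.Id("password")).SendKeys(password);
            _driver.FindElement(By.Id("login-button")).Click();
            // A fuller page object would return the next page in the journey here.
        }
    }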

I patted myself on the back, demonstrated the work to our Product Owner, and then advised that we should shelve it because I wasn’t confident about handing the mind-boggling complexities of waits, framesets, and inherited view models over into general practice.

Over time I lost confidence in the project. If it had taken me days to generate even the simplest of tests, how would a junior developer fare when asked to automate complex financial scenarios? Quietly I put the idea to the back of my mind and concentrated on other, more pressing matters.

That was until last week. We’ve been doing some work on our BACS integration, and as part of the regression test I’d enlisted the help of one of our QAs to mock each response code and import it into the system. The process was tedious and repetitive, and I hated myself every time I had to tell him he’d missed a vital check in one of the scenarios.

As I was speaking to him, cogs in my head began to turn. I’d gone off the idea of large-scale test automation because of the complexity of our UI, but the BACS processing system doesn’t have a UI. I could knock something together in a couple of hours which would create customers, mock BACS files, and schedule our JobServer. Even more powerful, if I used a technology like SpecFlow our QA could write the tests he wanted, I could automate them, and we’d be able to iterate over every scenario within a couple of minutes. Even more exciting was the idea that we could send the feature files off to our Product Owner and banking partner and ask them to verify the behaviour was correct.
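
To give a flavour of what that looks like, the glue behind a SpecFlow scenario is just a C# class with annotated step methods. The sketch below is purely illustrative: the step wording is invented, and TestCustomers, BacsFileMocker, JobServerClient and PaymentRepository are stand-ins for our internal helpers rather than real libraries.

    using NUnit.Framework;
    using TechTalk.SpecFlow;

    // Bindings for a hypothetical scenario such as:
    //   Given a customer with an active Direct Debit
    //   When a BACS file is imported with response code "0"
    //   Then the payment should be marked as "collected"
    [Binding]
    public class BacsImportSteps
    {
        private int _customerId;

        [Given(@"a customer with an active Direct Debit")]
        public void GivenACustomerWithAnActiveDirectDebit()
        {
            // TestCustomers is a stand-in for our own test data helper.
            _customerId = TestCustomers.CreateWithDirectDebit();
        }

        [When(@"a BACS file is imported with response code ""(.*)""")]
        public void WhenABacsFileIsImportedWithResponseCode(string responseCode)
        {
            // Mock the BACS response file and let the JobServer import it.
            var file = BacsFileMocker.BuildResponseFile(_customerId, responseCode);
            JobServerClient.RunImport(file);
        }

        [Then(@"the payment should be marked as ""(.*)""")]
        public void ThenThePaymentShouldBeMarkedAs(string expectedStatus)
        {
            var payment = PaymentRepository.GetLatestFor(_customerId);
            Assert.AreEqual(expectedStatus, payment.Status);
        }
    }

Once a handful of bindings like these exist, every new response code is just another line in a feature file.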

Later that week, after we’d proven the project to be a success and found a handful of low-priority bugs to be corrected, I started to wonder why this automation project had delivered such value where the previous UI automation had failed. I decided it was because:

  • The BACS process was an automated mechanism already so the test automation steps were simpler
  • The UI had been designed for human use; it wasn’t a chore for a person to run through, but it was complex for Selenium to navigate the controls
  • Mocking and importing BACS files was repetitive, slow, and tedious

The project turned out to be a huge success; we’re already planning how we can expand the solution to cover other payment integrations such as SEPA.

The next time you’re considering whether or not you should automate an area of testing consider the nature of the tests. Do they use complex UIs? Do you have to repeat very similar tests over and over again? Do they currently take up a large portion of your testing cycles?

Try to automate the hidden processes which run over and over, the ones you wish you could test every time if only you had the time!

Why you MUST run your Unit Tests as part of your Build Process

I was browsing StackOverflow the other day (as many geeks are known to do from time to time) and read an answer which described writing Unit Tests as being like going to the gym. Now, I’m not sure about the final few sentences about finding another job, but I do really like the analogy. It’s hard when you start out, but the more you do the easier it gets and the more value it adds.

Now, like going to the gym, there are a couple of fundamentals which, if forgotten, can undermine everything. Unit Tests don’t require proper warmups or cool downs, but they absolutely, definitely must be run on the build server for their real value to be exploited.

Let me explain why.

Developers are often hackers by nature; we install different things and experiment. That’s what makes us good at our job. It’s also what makes every developer’s computer slightly different, so software which runs on one machine could behave very differently on another.

What’s worse, if your developer is in a rush or having a bad day, they may completely ‘forget’ to execute the tests before committing code. Now, instead of rigorously tested and proven code being added to source control, you’ve got a risk.

Tools like NCrunch help reduce this, but the only absolutely foolproof way to ensure that all your tests are executed and pass before deployment is to get your build server to run them. Missing this vital link means you’re opening your build and deployment process up to human error, something you should be very nervous of doing.

Almost all build servers have the capability of running tests so dig out the manual and make sure that when the tests fail, the build fails!

The Phoenix Project Book Review

I recently read The Phoenix Project, the definitive and often referenced DevOps Business Novel by Gene Kim, Kevin Behr, and George Spafford.

I have to admit I’ve been putting off reading this book; perhaps I had some misgivings about replacing Scrum techniques with the new craze of DevOps. Having decided that I was going to invest my time, I was tested once again when I realised that Bill, the book’s main protagonist, is an infrastructure guy… I’m a Development Manager, I want to know what DevOps can do for me, not what I can do for the other guys!

That however was the last time I paused before devouring the remaining 370 pages.

Bill’s first few days after his promotion are littered with disasters. He has a project to deliver, infrastructure problems to resolve, and political ambushes to endure. In other words, he’s in the situation we’ve all experienced when teams aren’t communicating and colleagues are so stacked up with their own projects and priorities that they simply can’t assist each other.

With the timely arrival of Eric, the new potential board member, Bill slowly equips himself with the tools he needs to resolve his infrastructure headaches and get the Phoenix Project back on track. Eric, acting as a kind of DevOps Yoda, helps Bill gain an understanding of the four different types of work, the Theory of Constraints and Systems Thinking, and see the value of Automated Deployments.

There’s a nice appendix at the end which helps consolidate the ideas and explains some of them away from the story. There are also some mind-blowing statistics about how frequently some of the big web companies such as Amazon, Google, and Netflix deploy.

So would I recommend this book to others? Absolutely! In fact I have, repeatedly to friends and colleagues! Whether you’re a developer, infrastructure guru or IT manager this book will introduce you to several new ways of thinking about workload management and IT processes.

DDDNorth 2016

I was very excited when I saw the announcement of DDDNorth earlier this year. I’ve attended the conference several times before, and with it once again in Leeds I really had no excuse not to attend!

Despite having a rather late night the previous evening I was one of the first there, in fact I think I accidentally tailgated a speaker inside and sat quietly in the corner until the masses arrived.

For those of you who don’t know, the DDD events are free, one-day conferences aimed at .NET developers. They’re held around the country and have a very high standard of speakers. DDDNorth is our annual event and usually circulates between the Universities of Leeds, Sunderland and Bradford.

The first session I attended was “Machine Learning for Muggles” by Martin Kearn and it was one of the best geeky talks I have ever attended.

Martin demonstrated how to use the Azure Machine Learning API to upload your own data for analysis. He then took a photo of the audience and used facial recognition and text analysis to ask questions like “show me the happiest person”. Before that day I’d always assumed machine learning was only for the likes of Google; before the first coffee break I was already wondering whether I could use the APIs in our own products.

The second session I attended was “10 More Things You Need to do to Succeed as a Tech Lead” by Joel Hammond-Turner. I’d attended one of his talks before (entitled “10 Things You Need to do to Succeed as a Tech Lead”) and with my recent change in role I felt it was a wise choice of session.

I was right. Joel gave us a good list. It contained tips on requirements, instrumentation, and team training. I was particularly impressed by his thoughts on managing and measuring technical debt!

The last session of the morning was “You read The Phoenix Project and you loved it! Now What?” by Matteo Emili. Having recently bought and demolished TPP while on holiday, this was the session I was perhaps most excited about. Matteo argued for pragmatism; he told us that in order to sell process changes the value must be determined. He quoted the common phrase “you can’t manage what you can’t measure” and discussed in depth how you should use the Build-Measure-Learn cycle. There were even a few tips and tricks towards the end to help us get our automated deployments going!

It was lunchtime, which meant sandwiches and GROC talks (I have always been curious to know what that stands for).

I arrived in time to see a few interesting mini-talks. The first was on using the C# interactive window in Visual Studio, the second was around ARM templates for Azure, and the final one was entitled “What’s the Point of Microsoft?” and was a tongue in cheek presentation about how the big tech players compete in today’s software and hardware world.

I’d love to continue; I’d love to tell you about the Microsoft BOT API talks I had lined up for the afternoon. Alas, it was not to be: it was around this time that the office called to tell me that several hundred members were having issues with their accounts. No rest for the wicked!

I’d like to take the time to thank the DDDNorth team for putting together yet another fantastic event. The speakers this year were superb and if I get the chance I fully intend to catch the sessions I missed elsewhere in the country at a later date. See you next year!

Hello Kitty

It seems like a weekly occurrence nowadays: whenever you turn on the news or open a technical blog you’re faced with another story where a huge organisation has been hacked and personal information has been stolen.

This time it is Hello Kitty who have lost their members’ personal information. According to wired.com, more than 3.3 million user accounts have been breached. Unfortunately this is not uncommon; over the last few years we’ve heard similar stories from Sony, Ashley Madison and Experian.

What strikes me time and time again when I read these articles is how poorly my personal information is protected. Let’s look at the Hello Kitty story: wired.com tells us that users’ passwords were hashed with SHA1 but not salted.

Any computer science graduate or junior developer should be able to tell you that this is insufficient! I consider myself a geek so I’m always interested in this sort of thing. Let’s spend a minute working out what’s going on…

A hash is basically a repeatable one-way function. If you use SHA1 to hash a piece of text (such as a password) it’s trivial to repeat, but (almost) mathematically impossible to work out the original string from the hashed value. This is great for developers: I save the hashed value in my database, and when you enter your password I simply hash the value you gave me and see if it matches my saved version. Neither you, nor I, nor any hacker can work out what your password is, even if they have somehow stolen my entire database (which I’m also protecting with DMZs, firewalls and physical security).
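
In .NET terms, the scheme described above boils down to something like the sketch below. To be clear, this is the unsalted approach under discussion, shown only to illustrate the idea, not something to copy into production.

    using System;
    using System.Security.Cryptography;
    using System.Text;

    // The unsalted SHA1 scheme described above - for illustration only.
    public static class NaivePasswordHasher
    {
        public static string Hash(string password)
        {
            using (var sha1 = SHA1.Create())
            {
                var bytes = sha1.ComputeHash(Encoding.UTF8.GetBytes(password));
                return BitConverter.ToString(bytes).Replace("-", "");
            }
        }

        // At login we never need the original password: hash whatever the user
        // typed and compare it against the stored hash.
        public static bool Verify(string enteredPassword, string storedHash)
        {
            return Hash(enteredPassword) == storedHash;
        }
    }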

At first glance the security used at Hello Kitty should have been enough. Because the hackers who stole the data can’t reverse the hashes, they can’t use the passwords to log in or try them on other sites (such as your email account), so you’d think this information has limited value.

This is where there’s a gaping flaw in the security, and unfortunately it’s down to the users of the site. Any hacker worth their salt (pun intended) knows the most commonly used passwords, which means I don’t need to know your password: all I need to do is hash “123456” and see which of the 3.3 million users used it. Next I try “password”, then “12345”. You won’t match everyone, but how many of those 3.3 million will have used one of the top 100 most common passwords? The top million?
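
That attack is depressingly easy to express in code. Something along the lines of the sketch below would do it, reusing the NaivePasswordHasher from earlier; the leaked hashes dictionary stands in for the stolen database.

    using System.Collections.Generic;
    using System.Linq;

    public static class CommonPasswordAttack
    {
        // Hash each common password once, then see which leaked accounts match it.
        // No brute force needed - just a lookup against the stolen, unsalted hashes.
        public static Dictionary<string, List<string>> Crack(
            IEnumerable<string> commonPasswords,
            IDictionary<string, string> leakedHashesByUser) // username -> SHA1 hash
        {
            var cracked = new Dictionary<string, List<string>>();

            foreach (var password in commonPasswords)
            {
                var hash = NaivePasswordHasher.Hash(password);

                var victims = leakedHashesByUser
                    .Where(entry => entry.Value == hash)
                    .Select(entry => entry.Key)
                    .ToList();

                if (victims.Count > 0)
                {
                    cracked[password] = victims;
                }
            }

            return cracked;
        }
    }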

So what’s the solution? We use a salt. This is what Hello Kitty should have done: a salt is a random piece of data, usually unique to each user, which is concatenated onto the user’s password before the hashing process. This time, even if the hacker has access to the salts, their lookup table of common passwords is rendered useless. They would need to recalculate the top passwords with each member’s salt individually, which is a hugely expensive operation, and we’re back to brute force.

Some of the most recent algorithms involve hashing a password multiple times. The effort for us to hash the hash of a password a few hundred or a few thousand times is negligible, but our hacker? They’ll have to calculate the hash of every password they want to try, for each member individually, multiplied by the number of times you iterate!
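
In .NET both the salt and the repeated hashing come almost for free via Rfc2898DeriveBytes, the framework’s PBKDF2 implementation. A minimal sketch is below; the iteration count and sizes are illustrative only, so check current guidance before picking real values.

    using System.Linq;
    using System.Security.Cryptography;

    // Salted, iterated password hashing using the framework's PBKDF2 implementation.
    // The iteration count and sizes below are illustrative, not a recommendation.
    public static class SaltedPasswordHasher
    {
        private const int SaltSize = 16;       // bytes
        private const int HashSize = 32;       // bytes
        private const int Iterations = 10000;

        public static void Hash(string password, out byte[] salt, out byte[] hash)
        {
            // A fresh random salt per user is what defeats precomputed lookup tables.
            salt = new byte[SaltSize];
            using (var rng = RandomNumberGenerator.Create())
            {
                rng.GetBytes(salt);
            }

            using (var pbkdf2 = new Rfc2898DeriveBytes(password, salt, Iterations))
            {
                hash = pbkdf2.GetBytes(HashSize);
            }
        }

        public static bool Verify(string enteredPassword, byte[] salt, byte[] expectedHash)
        {
            using (var pbkdf2 = new Rfc2898DeriveBytes(enteredPassword, salt, Iterations))
            {
                var actual = pbkdf2.GetBytes(expectedHash.Length);
                // A production implementation would use a constant-time comparison here.
                return actual.SequenceEqual(expectedHash);
            }
        }
    }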

I don’t want to go any further into the technical side. What I want you to take away is that somewhere, someone at Hello Kitty made a decision that a basic SHA1 hash was sufficient. That’s not only a huge error of judgement; it has ultimately left their clients feeling vulnerable as passwords are distributed across the web, and has destroyed investor confidence as the company’s reputation is dragged through the mud. Don’t let this happen to your company, don’t let this happen to your users!

Here’s my message.

  • If you’re a developer and you are not confident in your system’s security, don’t be afraid of looking foolish. Raise the questions.
  • If you are a manager whose team member raises a security concern, take it seriously. Your entire team’s livelihood and your clients’ confidence are at risk.
  • If you’re an Internet user, look at the common passwords list and avoid them like the plague. Use strong passwords and don’t repeat them across sites. If you don’t feel confident, don’t hand over your personal information!