Get Your Head Out of the Clouds

If you are one of the large number of Senior IT folk that are being roasted by their CxOs over the “Error 503” outage caused by Fastly this week, take note that it’s no good blaming your suppliers or bad luck. The scary main board Execs with their sharp pointy sticks are going to poke you hard to remind you that you get excessively paid to keep the lights on as your number one priority.

Obviously, you might grumble that it’s not your fault, but actually it is. It’s no good spluttering and cursing into your G&Ts while sat in your home office at the bottom of your garden. If any IT used by your firm is broke, it’s your problem. Until you’re sacked; then it is Someone Else’s Problem. You can moan that you’re now running the business on cloud services because everyone else is. But looking like a lemming won’t do your career prospects any good. So, what should you do? SCARPER. Either literally if you ain’t got the cojones, or follow the acronym below:

Supply Chain Security

Your cybersecurity specialists have been warning you for some time (even before the SolarWinds fiasco) of the dangers that could impact your company from malicious agents infiltrating your IT supply chain and compromising you through your direct service providers. What you also need to do is to get your SPs to confirm their Disaster Recovery/Business Continuity Plans if one of their sub-suppliers craps out.

Cloud Control

Sure, current received wisdom is: “The answer is Cloud – what’s the question?” But the whole cloud services market is even more immature than the rest of yer typical flaky IT ecosystem, so rather than relying on your chosen crappy cloud provider, spread your bets on either dual supply, or hot back-up. All the major cloud providers have dropped the ball in recent months and left you with sore buttocks after grovelling to your board to explain. Yes, it would be more expensive, and you would need to resist using vendor-specific services and APIs, but at least you could flex or move supplier if necessary.
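For the terminally curious, here is roughly what “hot back-up” looks like when boiled down to a few lines of Python. It is only a sketch: the two provider functions are made-up stand-ins (not any real vendor’s API), and a real set-up would also need health checks, timeouts and DNS or load-balancer changes.

```python
from typing import Callable, Sequence, Tuple


def fetch_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each provider in order; return (name, result) from the first that works."""
    errors = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # a real system would catch only expected errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical stand-ins for two independent suppliers.
def primary_cdn() -> str:
    raise ConnectionError("503 Service Unavailable")


def backup_cdn() -> str:
    return "homepage served"


served_by, body = fetch_with_failover(
    [("primary", primary_cdn), ("backup", backup_cdn)]
)
print(served_by, body)
```

Note that the caller only ever sees a plain function per supplier. That is the point about resisting vendor-specific services and APIs: the less of your code that knows who the supplier is, the easier it is to swap them.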

All Eggs in One Basket

Through outsourcing, many of you have almost certainly handed too much power to your [pickpocketing] ‘partner’ and can no longer change who supplies specific services. In most cases, you don’t even know who they are using any more. They will have promised premium provision when you signed the contract but will have quietly shifted you to their Poundland providers when you weren’t looking. When the smelly stuff hits the fan, the SI (systems integrator) is unlikely to be as responsive as you would have been because they won’t get sacked for failure – you will.

Risk Assessment

Too few companies do true risk assessments for their systems. It’s no good assuming that your vendors will always meet their SLAs, which are themselves rarely what your business has asked for. Remember that 99% availability means more than three and a half days and nights of downtime each year. Besides the standard IT gremlins, you also need to factor in the black swan events (e.g., COVID, wars, volcanoes, yet another financial meltdown, etc.).
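If you want to check the sums before your board does, the arithmetic is short. This sketch assumes a 365-day year and treats the SLA percentage as measured over the whole year; the function name is mine, not anything from a vendor contract.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be down while still meeting its SLA."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for sla in (99.0, 99.9, 99.99):
    hours = annual_downtime_hours(sla)
    print(f"{sla}% availability allows {hours:.1f} hours "
          f"({hours / 24:.2f} days) of downtime a year")
```

For 99% that comes to 87.6 hours, or 3.65 days. Each extra nine cuts the allowance by a factor of ten, which is usually where the price goes up too.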

Plan B

Having a Plan B when your normal ops go SNAFU is common sense. Of course, I would also recommend having Plans C & D in your back pocket, even if Plan D is to do a Macavity. However, Plan B should not just be the dusty document you knocked up in case you were ever asked for one. Your alt-plan should be fully worked through with agreed process & system changes, be fully tested, and have contracts already signed so it can be invoked quickly if it becomes necessary.

End-to-end Testing

Of course, you all diligently test your key systems and processes regularly, don’t you? But sometimes those pesky complicated full end-to-end regression tests are too much of a pain, and you’ve run out of time/money/tether, so they get ‘deferred’ until a quiet time (i.e., never). However much pain this testing might be, it is nothing compared to the wrath of the business when the drains-up wake on your latest IT disaster highlights the missing test certificates, and you find your own personal end staring down at you.

Resilience & Redundancy

If you want any Rest and Relaxation (R&R) in your professional life, the answer to all the challenges above is to ensure you have true R&R (Resilience & Redundancy) in your IT service provision. Of course, this can be painful, costly & time-consuming to implement and manage but recall where the buck stops for IT and make the right case for relevant risk management.
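One number that can help the business case: what redundancy does to availability. The sketch below assumes the providers fail independently of each other, which is a big assumption (shared sub-suppliers are exactly how that breaks), and the function name is purely illustrative.

```python
def combined_availability(*availabilities: float) -> float:
    """Availability of a set of redundant providers, as a fraction (0.99 = 99%).

    Assumes failures are independent: the service is down only when
    every provider is down at the same time.
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down


single = 0.99
dual = combined_availability(0.99, 0.99)
print(f"one provider: {single:.2%}, two independent providers: {dual:.2%}")
```

Two 99% providers come out at about 99.99% on paper. In practice, correlated failures will eat into that, so treat it as the best case rather than a promise.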

If you’re struggling with how to justify the business case to do it properly, either you might want to rethink whether you’re in the right job, or you’ve already been outsourced, in which case you no longer care.

John “I, Cloudius” Moe
