September 22, 2022

How to survive and thrive despite resilience events

At this year’s DRJ Fall conference, Chief Resilience & Control Officer Mike Butler and resilience expert Mark Heywood discussed the mindset that organizations need to adopt in order to be operationally resilient in the new normal. Early in the session, Mark mentioned a number of threats and disasters from the past couple of years that many would say could not have been predicted - from the COVID-19 pandemic (and everything that followed) to the Suez Canal blockage. In fact, many of these events could have been predicted, and some of them even were. The fact remains that no matter how diligent you are, it’s impossible to predict every event or disaster that could impact your business, or how several unprecedented events occurring at once could create overlapping impacts for you and your customers.

In light of this, it’s no longer enough to just be great at disaster recovery - your organization needs a mindset that lets you consistently provide the services your customers need even in the face of resilience events. Prevention is better than cure, but in an increasingly unpredictable world, you need to build a level of resilience that lets you minimize the negative impact even when you are a victim of unexpected circumstances. It’s all about building resistance to failure into your operations.

For many organizations, how they test their recovery processes bears no resemblance to how they would behave during a real resilience event. As Mark noted in the session,

“The question I feared most from my bosses was, am I confident in the recovery? And what I wanted to say was ‘yes, if you give me 16 weeks to prepare.’”

But testing with weeks of preparation doesn’t prepare you for a real-world incident. When you’re a victim of a cyber attack or one of your data centers goes down, you don’t have weeks to plan and execute a recovery - especially if your customer-facing services are impacted. Instead, teams need to learn to “practice how they play” because when they’re faced with a real incident, they will play how they’ve practiced.

The steps to achieve a “practice how you play” attitude to resilience

During the session, Mark and Mike shared their recommendations for creating a more mature resilience process and moving to a practice-how-you-play mentality. 

1. Record every step of your resilience plan

The first step towards achieving this mindset is a simple one: writing things down. If you don’t have a single, definitive process recorded for end-to-end recovery, you can’t begin to refine the process into something more reflective of your needs.

2. Make your resilience plans executable and automated

The next step is to make that plan executable and augment it with automation. This turns what would be a static recovery document into a living process that is built into the DNA of the organization. Have teams use the recovery runbooks in business-as-usual operations so that everyone is familiar with their roles and the runbooks stay accurate and up to date.
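As a rough illustration of what "executable" can mean, the sketch below models a runbook as ordered steps, some automated and some confirmed by a person. It is not Cutover's API or the speakers' method; the `Step` class, the `run_runbook` function, the step names, and the `promote_database_replica` placeholder are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One runbook task. All names here are hypothetical examples."""
    name: str
    owner: str
    # An automated step carries a callable; None marks a manual step.
    action: Optional[Callable[[], bool]] = None
    status: str = "pending"


def run_runbook(steps: List[Step], confirm_manual: Callable[[Step], bool]) -> bool:
    """Run steps in order and stop at the first failure.

    Returns True only if every step completed successfully.
    """
    for step in steps:
        step.status = "running"
        if step.action is not None:
            ok = step.action()          # automated task
        else:
            ok = confirm_manual(step)   # a person confirms completion
        step.status = "done" if ok else "failed"
        if not ok:
            return False
    return True


def promote_database_replica() -> bool:
    # Placeholder for a real automation call (assumption for illustration).
    return True


runbook = [
    Step("Declare incident", owner="incident-manager"),
    Step("Promote database replica", owner="dba-team",
         action=promote_database_replica),
    Step("Verify customer login", owner="service-desk"),
]

# In this sketch every manual step is simply confirmed.
completed = run_runbook(runbook, confirm_manual=lambda step: True)
statuses = [step.status for step in runbook]
```

Because the plan is data rather than a document, the same runbook can be run in routine operations and in tests, which is one way to keep it current.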

3. Gain visibility into resilience

Once your recovery processes are written down and executable, you can gain greater visibility into their progress. Without it, having to constantly provide status updates while trying to solve the problem piles on extra pressure and takes people away from the real work of recovery. As Mike recalled in the DRJ session, 

“Half of my job was answering executives during a recovery event asking me, ‘are we nearly there yet?’ Real orchestration gives me the answers to the questions I most feared from my executives when I wanted to avoid giving career and pension limiting answers.”

Having a toolset that automatically orchestrates recovery across teams and technology, with real-time visibility into progress, makes it easier to understand how successful the recovery will be and can buy thinking time, which is valuable in a crisis. 
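To make the visibility point concrete, here is a minimal sketch of how recorded step statuses could answer "are we nearly there yet?" without interrupting the recovery team. The `summarize` function, the status labels, and the step names are assumptions for illustration, not part of any real product.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple, Union


def summarize(steps: List[Tuple[str, str]]) -> Dict[str, Union[int, Optional[str]]]:
    """Summarize progress from (step name, status) pairs.

    Statuses are assumed to be "pending", "running", "done" or "failed".
    """
    counts = Counter(status for _, status in steps)
    total = len(steps)
    done = counts["done"]
    percent = round(100 * done / total) if total else 0
    # The first step still running is what the team is working on now.
    current = next((name for name, status in steps if status == "running"), None)
    return {
        "done": done,
        "total": total,
        "percent": percent,
        "failed": counts["failed"],
        "current": current,
    }


# Hypothetical snapshot of a recovery in progress.
snapshot = [
    ("Declare incident", "done"),
    ("Promote database replica", "done"),
    ("Verify customer login", "running"),
    ("Notify customers", "pending"),
]

summary = summarize(snapshot)
print(f"{summary['done']}/{summary['total']} steps done "
      f"({summary['percent']}%), now: {summary['current']}")
```

A summary like this can be published to executives automatically, so status questions do not pull people away from the recovery itself.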

Cutover’s Collaborative Automation platform gives you the ability to immediately execute a whole series of recovery plans with ease to deliver quicker recovery and better resistance to failure. Bringing together people and resources and getting the right permissions to react can take time, and disasters don’t care if it’s the middle of the night or a weekend. When you have a fully executable process and the data to make better decisions, things can move much more quickly.

Key takeaways from the session:

  • The implausible has happened a number of times recently and planning for those things is important.
  • Resilience can be a differentiator: it gives you the ability to keep serving customers during a resilience event that may cause others to drop services.
  • Business Continuity Management is dead, long live Resilience! Learn how to thrive and survive under resilience events.

Watch the full video from the DRJ session below to learn more about how to not only survive but thrive during resilience events.

Find out how Cutover can help you move to unannounced testing.

Ky Nichol
IT Disaster Recovery