May 7, 2024

Webinar recap: On-premises and AWS cloud disaster recovery challenges

Last week, industry experts from Bank of America, Cutover, and AWS discussed on-premises and AWS cloud recovery challenges and strategies in this webinar hosted by AWS.

Speakers included:

  • Alla Piltser: Advisor, Brya Advisory, LLC | Ex MD, Head of Digital Solutions and Automation, Tech Infra, Bank of America
  • David McCann: Technology advisor and board member, Cutover
  • Ky Nichol: Chief executive officer, Cutover
  • Mark Donovan: Director of WW tech and principal SA, Amazon Web Services
  • Yaniv Ehrlich: GM, AWS elastic disaster recovery, Amazon Web Services

This post is a summary of some of the points discussed in the webinar. Watch the full session on demand here.

One key point from the session is that organizations face significant challenges in maintaining resilience across their hundreds or thousands of applications. Some of the challenges raised are outlined below:

Application portfolio challenges for cloud disaster recovery

Every institution has an application portfolio, and efforts to migrate and modernize that portfolio continue to increase. This can be challenging for a number of reasons:

Cloud disaster recovery cost optimization and budget constraints

Despite increasing modernization, according to Dave McCann, many enterprises are only 15-50% migrated to the cloud. Getting to 100% is challenging enough, but most of these organizations now also face the added complication of investing in AI, LLMs, and microservices, which creates competition for budgets. Cost optimization and budget constraints remain strong pressures despite the push to innovate and modernize, so IT operations teams must increase their productivity to keep up, and they need to adopt new tools to do so without adding headcount.

Application and data dependencies

This increased reliance on technology has led to hundreds of new interdependencies due to the increased use of apps, distributed services, and SaaS solutions. A lack of clear understanding of application dependencies and their criticality makes it difficult to prioritize recovery efforts during an outage. When an outage occurs, multiple teams will need instant and accurate awareness of resource dependencies, which becomes more difficult the more complex your technology ecosystem is.
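To make the prioritization problem concrete, here is a minimal sketch of how a dependency map and criticality tiers could be turned into an ordered recovery plan. It is an illustration only, not something presented in the webinar: the application names, the `DEPENDENCIES` and `CRITICALITY` tables, and the `recovery_waves` helper are all hypothetical, and the ordering relies on Python's standard-library `graphlib.TopologicalSorter`.

```python
from graphlib import TopologicalSorter

# Hypothetical example data: each application maps to the services it depends on.
DEPENDENCIES = {
    "payments-api": {"customer-db", "auth-service"},
    "auth-service": {"customer-db"},
    "notifications": {"auth-service"},
    "reporting": {"payments-api"},
    "customer-db": set(),
}

# Hypothetical criticality tiers: a lower number means more critical.
CRITICALITY = {
    "customer-db": 1,
    "auth-service": 1,
    "payments-api": 1,
    "notifications": 2,
    "reporting": 3,
}


def recovery_waves(dependencies, criticality):
    """Group applications into waves.

    Every application in a wave has all of its dependencies restored by
    earlier waves, so the members of one wave can be recovered in parallel.
    Within a wave, more critical applications are listed first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(
            sorter.get_ready(),
            key=lambda app: (criticality.get(app, 99), app),
        )
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(recovery_waves(DEPENDENCIES, CRITICALITY), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

In a real environment the dependency map would come from a discovery tool or CMDB rather than a hand-written dictionary, and keeping that map accurate is exactly the difficulty described above.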

The complexity of data spread across multiple databases, file servers, and object storage, together with multiple operating systems and the need to maintain a high-availability environment, also makes it more expensive to remain resilient. Because of this higher cost, some organizations are concerned that they lack the resources for disaster recovery and failover environments and won’t be able to restore their systems after an outage.

Legacy systems present cloud disaster recovery challenges

Legacy systems in major enterprises add risk when it comes to resilience and create roadblocks for modernization and increased automation.

Legacy ITSM and ITIL tools and processes

In mature enterprises, there are still legacy applications or legacy infrastructure stacks where resilience was not always part of the design and architecture. Legacy ITSM tools that support ITIL processes exist, but they often include manual steps and handoffs, which increase the chance of human error and inconsistencies.

Shortage of skilled workforce and flat headcount

A lack of expertise and poor communication across different teams can hinder an organization’s ability to prepare for and respond to incidents effectively.

Outdated technology and technical debt

A mixture of outdated technology and technical debt makes existing applications and technology complex and expensive to integrate with modern disaster recovery systems and to test properly end to end.

Cloud and SaaS technologies

The ongoing shift to SaaS and cloud technologies means that application portfolio modernization is increasing, with 30-50% of many organizations’ application portfolios becoming SaaS. As a result, external services hold a lot of data and are heavily integrated with internal tools, which complicates the execution and tracking of disaster recovery efforts.

Hybrid cloud complexity

Many enterprises operate with on-premises databases, particularly in banking, government, and healthcare, and managing a hybrid on-premises and cloud infrastructure adds complexity to disaster recovery planning and execution.

Cloud-centric disaster recovery solutions

Cloud-centric DR solutions can make it harder to meet regulatory compliance requirements, and there are potential cost issues depending on the architecture, storage, and resources required for recovery.

Multi-cloud and cloud disaster recovery shared responsibility

70% of large enterprises run on multiple public clouds. While individual applications typically run on one cloud, data may be shared across clouds, so good data management and networking practices are central to recovery.

AWS has a shared responsibility model with customers on resilience. AWS is responsible for the security of the cloud, including hardware, AWS global infrastructure and software, while the customer is responsible for security in the cloud, including client-side data, the operating system, platform, applications, identity and access management, and more. Organizations need to fully understand the scope of their responsibility when it comes to cloud resilience and recovery to ensure their disaster recovery plans are comprehensive.

Organizations have an obligation to minimize customer impact

Organizations have an obligation to reduce the impact that outages or disasters have on their customers. This includes ensuring that customers and employees can access systems during a disaster or recovery, and protecting consumer data. Adopting new technology can increase risk in this area because it brings greater reliance on e-communication channels and digital tools, as well as the involvement of third parties. Cyber attacks are also increasing in sophistication and volume, necessitating robust disaster recovery plans to ensure data and applications can be restored quickly.

Regulatory compliance requirements

Regulatory compliance requirements for cyber resilience, such as DORA and SEC rules, are becoming commonplace, and many organizations struggle to ensure they have accurate, accessible evidence of their ability to meet these requirements.

Emerging and proliferating sovereignty standards

75% of countries have implemented some data localization rules, which has major implications for organizations’ IT infrastructure footprint and data architecture.

Cloud disaster recovery strategies and solutions

Clearly, navigating cloud recovery and resiliency in this increasingly complex and demanding environment comes with a number of serious challenges. Fortunately, technology also offers solutions to these challenges when used effectively. Two key areas that can help organizations ensure their disaster recovery preparedness are:

Cloud disaster recovery automation

Implementing automated disaster recovery procedures can minimize human error and expedite the recovery process.
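As a rough illustration of what an automated procedure can look like, the sketch below runs a list of recovery steps in order, stops at the first failure, and keeps a timestamped log that could serve as evidence of the exercise. It is a simplified assumption-laden example, not Cutover's or AWS's implementation: the `Step` and `RunbookResult` types, the `run_runbook` function, and the step names are all hypothetical, and the step actions are placeholders for real calls to infrastructure tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One recovery step. `action` is a placeholder that returns True on success."""
    name: str
    action: Callable[[], bool]


@dataclass
class RunbookResult:
    succeeded: bool = True
    # Each log entry is (UTC start time, step name, "ok" or "failed").
    log: List[Tuple[str, str, str]] = field(default_factory=list)


def run_runbook(steps: List[Step]) -> RunbookResult:
    """Run steps in order, stopping at the first one that fails or raises."""
    result = RunbookResult()
    for step in steps:
        started = datetime.now(timezone.utc).isoformat()
        try:
            ok = bool(step.action())
        except Exception:
            ok = False  # treat any exception as a failed step
        result.log.append((started, step.name, "ok" if ok else "failed"))
        if not ok:
            result.succeeded = False
            break  # later steps depend on this one, so do not continue
    return result


if __name__ == "__main__":
    # Hypothetical steps; real actions would call recovery tooling.
    demo = [
        Step("Fail over database", lambda: True),
        Step("Start application tier", lambda: True),
        Step("Run smoke tests", lambda: True),
    ]
    outcome = run_runbook(demo)
    for started, name, status in outcome.log:
        print(f"{started}  {name}: {status}")
    print("Recovery succeeded" if outcome.succeeded else "Recovery halted")
```

Encoding the sequence once means the same steps run the same way in every test and every real event, and the log is produced as a side effect rather than assembled by hand afterwards.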

Regularly testing and updating cloud disaster recovery plans

Disaster recovery plans need continuous testing and revision to account for evolving threats, infrastructure changes, and new applications. Because of this, it’s best practice not only to test failovers but also to run from failed-over sites on a regular basis.

Watch the full webinar here to find out what solutions the experts offered to these challenges and how Cutover can help in these key areas and more!

Kimberly Sack