Step-by-Step Guide to Creating a Disaster Recovery Plan

December 17th, 2020

At a time when less-than-great news has become the norm, it’s hard to act surprised when a crisis looms. Although we continue to hope for the best, we’ve all come to expect the worst, which is why having a disaster recovery plan ready to roll is crucial.

A comprehensive recovery plan will limit data loss and minimize the effect of a natural disaster on business continuity and compliance. A good plan also helps speed up recovery from cyberattacks, such as those recently reported by Japanese game developer Capcom, Italian beverage maker Campari, and toy giant Mattel.

If your organization’s disaster recovery plan is out of date, insufficient, or, worse, nonexistent, let these events motivate you to review, revise, or create a recovery strategy now, before you need it. 

Here are eight steps to creating a disaster recovery plan that will help prevent data loss, facilitate business continuity, and ensure your regulated data and SLAs remain in compliance.

Step 1: Create a Disaster Response Team and Document Responsibilities

Your disaster response team will spearhead recovery efforts and disseminate information to employees, customers, and stakeholders during a crisis. 

Assign each team member specific tasks during the response and document them so everyone knows who is in charge of what. You will also need backup staff for key team members in case a designated lead isn’t available during a crisis.
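One way to keep those assignments unambiguous is to record them as data, with a lead and a backup for every task. The following is only a minimal sketch: the task descriptions, the names, and the `responsible_for` helper are hypothetical placeholders, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Assignment:
    task: str
    lead: str
    backup: str


# Hypothetical roster: replace the tasks and names with your own.
ROSTER = [
    Assignment("Coordinate recovery effort", "alice", "bob"),
    Assignment("Notify customers and stakeholders", "carol", "dave"),
]


def responsible_for(task: str, unavailable: Set[str]) -> Optional[str]:
    """Return the lead for a task, the backup if the lead is unavailable,
    or None if nobody on the roster can cover it."""
    for assignment in ROSTER:
        if assignment.task == task:
            if assignment.lead not in unavailable:
                return assignment.lead
            if assignment.backup not in unavailable:
                return assignment.backup
            return None
    return None
```

A `None` result is worth treating as a gap in the plan: it means a task has no one left to cover it.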

Step 2: Set Clear RTOs and RPOs

Recovery time objective (RTO) is the length of time an application can be down before the business is negatively impacted. RTO varies widely among applications because some can be down for only a few seconds before the business, customers, or users are impacted, whereas others can be down for hours, days, or even weeks. 

RTOs are set according to application importance:

  • RTO near zero: Mission-critical applications that must fail over
  • RTO of four hours: Less critical, so there is time for on-site recovery from bare metal
  • RTO of eight or more hours: Nonessential applications that can stay down for extended periods
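The tiers above can be written down as a lookup table and checked against the downtime observed in a drill or a real outage. This is an illustrative sketch only: the tier labels, the 30-second stand-in for "near zero," and the `meets_rto` helper are assumptions, and the last tier uses eight hours as its tightest value.

```python
from datetime import timedelta

# Targets follow the three tiers listed above. The "near zero" value of
# 30 seconds is an assumed placeholder; pick what your business tolerates.
RTO_TARGETS = {
    "mission-critical": timedelta(seconds=30),
    "less-critical": timedelta(hours=4),
    "nonessential": timedelta(hours=8),
}


def meets_rto(tier: str, observed_downtime: timedelta) -> bool:
    """Return True if the observed downtime stayed within the tier's RTO."""
    return observed_downtime <= RTO_TARGETS[tier]


# Example: a less-critical application that took five hours to restore
# missed its four-hour objective.
missed = not meets_rto("less-critical", timedelta(hours=5))
```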

Recovery point objective (RPO) is the most data that can be lost before the business is significantly harmed, expressed as the maximum acceptable time between the most recent working backup and an outage.

RPO is based on how much you are willing to spend to back up a particular application, because costs rise quickly as the RPO shrinks:

  • RPO of near zero: Use continuous replication (mission-critical data)
  • RPO of four hours: Use scheduled snapshot replication
  • RPO of 8-24 hours: Use existing backup solution (data that can potentially be recreated from other repositories)
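To make the arithmetic concrete, the sketch below computes the data-loss window for an outage and picks the least demanding approach from the list above that still satisfies a given RPO. The function names are hypothetical, and the cutoffs between the listed tiers (anything under four hours is treated as needing continuous replication) are an assumption made for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, outage_start: datetime) -> timedelta:
    """Time span of data that would be lost if you restored from the last backup."""
    if outage_start < last_good_backup:
        raise ValueError("outage cannot start before the last good backup")
    return outage_start - last_good_backup


def suggested_approach(rpo: timedelta) -> str:
    """Pick the least demanding listed approach that can meet the RPO.
    The tier boundaries are assumed for this example."""
    if rpo >= timedelta(hours=8):
        return "existing backup solution"
    if rpo >= timedelta(hours=4):
        return "scheduled snapshot replication"
    return "continuous replication"


# Worked example: last backup at 02:00, outage at 09:30 on the same day.
window = data_loss_window(datetime(2020, 12, 17, 2, 0), datetime(2020, 12, 17, 9, 30))
exceeds_four_hour_rpo = window > timedelta(hours=4)
```

In the worked example the window is seven and a half hours, so a nightly backup alone would not satisfy a four-hour RPO.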

Step 3: Make a Blueprint of the Network Infrastructure

Creating detailed documentation of your entire network infrastructure will make it much easier to rebuild the system after a disaster, especially if the network was corrupted by a cyberattack. 

Different components of the system have different levels of importance to business continuity, so be sure to indicate the priority of each service as mission-critical, essential, or nonessential so they can be restored in the appropriate order. Don’t forget to include system dependencies in your blueprint, because they may impact how you prioritize recovery.
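If the blueprint records each service's priority and dependencies as data, a restore order can be derived from it. The sketch below is one possible approach, not a prescribed method: it uses a dependency-respecting (topological) ordering and, among the services that are ready to restore, picks the highest priority first. The service names and the `restore_order` function are hypothetical.

```python
import heapq
from typing import Dict, List

# Lower number means restore sooner; labels match the three levels above.
PRIORITY = {"mission-critical": 0, "essential": 1, "nonessential": 2}


def restore_order(services: Dict[str, str], depends_on: Dict[str, List[str]]) -> List[str]:
    """services maps name -> priority label; depends_on maps name -> prerequisites."""
    remaining = {name: set(depends_on.get(name, [])) for name in services}
    dependents: Dict[str, List[str]] = {name: [] for name in services}
    for name, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(name)

    # Start with services that have no prerequisites, highest priority first.
    ready = [(PRIORITY[services[n]], n) for n, deps in remaining.items() if not deps]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for child in dependents[name]:
            remaining[child].discard(name)
            if not remaining[child]:
                heapq.heappush(ready, (PRIORITY[services[child]], child))

    if len(order) != len(services):
        raise ValueError("circular dependency in blueprint")
    return order


# Hypothetical blueprint: everything depends on DNS.
services = {"dns": "essential", "database": "mission-critical",
            "storefront": "mission-critical", "wiki": "nonessential"}
depends_on = {"database": ["dns"], "storefront": ["database", "dns"], "wiki": ["dns"]}
plan = restore_order(services, depends_on)
```

Here DNS is restored first even though it is only labeled essential, because both mission-critical services depend on it. This simple version does not raise the priority of a prerequisite automatically, so a low-priority service that blocks a mission-critical one may deserve a higher label in the blueprint.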

Step 4: Select a Disaster Recovery Solution

Storage capacity, recovery timeline, and configuration complexity will affect the cost of a disaster recovery solution. In many cases, you are choosing between a solution that offers quick recovery times but may lose days of data and a solution that maintains system availability but kills you with high complexity and costs.

Look for a disaster recovery solution like Arcserve UDP Cloud Direct that affordably protects your systems and applications from data loss. Arcserve also minimizes complexity by letting you manage backup and disaster recovery, and restore service-level agreements, from a single web-based UI.

Step 5: Create a Checklist of Criteria for Initiating the Disaster Response Plan

Not every incident warrants a full-fledged deployment of your disaster response plan. Creating a checklist of criteria to identify what constitutes a disaster helps your recovery team know when it’s time to jump into action without wasting resources or money by overreacting to a minor threat.

For example, a temporary power outage and a direct hit from a Category 4 hurricane require very different responses.
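A checklist like this can be expressed as a small set of explicit rules so the decision does not depend on judgment calls made under pressure. The criteria, field names, and four-hour threshold below are purely illustrative assumptions; your own checklist should reflect your RTOs and risk tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    expected_outage_hours: float
    mission_critical_affected: bool
    data_corrupted: bool
    site_unsafe: bool


def should_activate_plan(incident: Incident, tolerated_outage_hours: float = 4.0) -> bool:
    """Return True if the incident meets the (assumed) criteria for a full response."""
    # Unsafe premises or corrupted data always trigger the plan in this sketch.
    if incident.site_unsafe or incident.data_corrupted:
        return True
    # Otherwise, respond only if critical systems will be down too long.
    return (incident.mission_critical_affected
            and incident.expected_outage_hours > tolerated_outage_hours)


# The two scenarios from the text, with assumed values.
power_blip = Incident(0.5, False, False, False)
hurricane = Incident(72.0, True, False, True)
decisions = (should_activate_plan(power_blip), should_activate_plan(hurricane))
```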

Step 6: Document the Disaster Recovery Process

To ensure data and operations are restored quickly after a disaster, create step-by-step instructions in plain language so your team can start the disaster recovery effort as soon as it’s safe to do so. 

Store a copy of the disaster recovery plan away from the network—preferably in the cloud—to protect it from corruption during a ransomware attack or physical loss from a natural disaster.

Step 7: Test Your Disaster Recovery Plan

Conduct regular tests of your disaster recovery plan to ensure it will work when you need it to. Run a partial recovery test twice a year and a full recovery simulation annually.
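The cadence above is easy to track automatically. As a rough sketch, the following flags which kinds of test are overdue given the date each was last run; the interval values approximate "twice a year" and "annually," and the `overdue_tests` helper is a hypothetical name.

```python
from datetime import date, timedelta
from typing import Dict, List

# Approximate intervals: partial tests twice a year, full simulation yearly.
TEST_INTERVALS = {
    "partial": timedelta(days=182),
    "full": timedelta(days=365),
}


def overdue_tests(last_run: Dict[str, date], today: date) -> List[str]:
    """Return the test kinds that have never run or are past their interval."""
    return [
        kind
        for kind, interval in TEST_INTERVALS.items()
        if kind not in last_run or today - last_run[kind] > interval
    ]


# Example: the partial test last ran in March, the full simulation in June.
due = overdue_tests(
    {"partial": date(2020, 3, 1), "full": date(2020, 6, 1)},
    today=date(2020, 12, 17),
)
```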

Additionally, it doesn’t hurt to periodically spring surprise drills on the company so you can get an accurate assessment of how well the processes will work in the event of a real emergency. 

Step 8: Review and Update Your Disaster Recovery Plan Regularly

Post-COVID-19, there will be a lot of movement within companies. Changes may include employees leaving or joining the company, policies being modified to meet new regulations or standards, or business units being consolidated. 

Your disaster recovery plan needs to be reviewed and updated regularly to reflect these changes and how they impact the recovery process. For more details on protecting and restoring your organization’s data and applications before, during, and after a crisis, download How to Build a Disaster Recovery Plan.