In 2023, PwC conducted the Global Crisis and Resilience Survey, in which it polled 1,812 executives. Of those business leaders, 96% said that they had experienced some sort of disruption in the past two years, and 76% said that those disruptions had a medium-to-high impact on operations.
That same year, Accenture published a report titled “The Cyber-Resilient CEO” in which 96% of the CEOs said that they understood that cybersecurity was key to organizational growth and stability. The study also found that three-quarters of the participants were concerned about their business’s ability to minimize or prevent damage from a cyberattack. There's no denying, then, that resilience is a big concern for many companies.
The author James Carse once said, “To be prepared against surprise is to be trained. To be prepared for surprise is to be educated.” That is the point of these response and recovery plans. They are not just about compliance; they are about minimizing downtime, reducing financial penalties, and increasing trust.
Last week we looked at the first pillar of a security contingency plan, Incident Response (IR), which in a nutshell refers to the procedures that a business follows to detect, manage, and eradicate an incident. This week we’ll focus on Business Continuity (BC) and Disaster Recovery (DR).
A BC plan is a playbook, or a set of procedures, that outlines how a business can sustain its key processes in the event of a disaster (these can include, but are not limited to, human errors, software bugs, and natural disasters). In other words, it aims to help the entity operate as normally as possible during an incident. DR, on the other hand, is geared towards the protection and recovery of IT systems.
(Note: Because the two plans overlap, some organizations take an alternative approach and treat them as a single plan instead of as silos.)
When developing a BC and DR plan, it’s important that you and your organization become familiar with two metrics: Recovery Time Objective (RTO), the maximum acceptable time to restore business processes after an incident, and Recovery Point Objective (RPO), the maximum amount of data, usually measured as a window of time, that an entity is willing to lose during a disaster.
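To make the two metrics concrete, here is a minimal sketch in Python that checks a single incident against an RTO and an RPO. The function names, the targets (a 4-hour RPO and an 8-hour RTO), and the timeline are all hypothetical values chosen for illustration; your own targets should come out of your planning.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """True if the data written since the last backup fits within the RPO."""
    return incident - last_backup <= rpo


def meets_rto(incident: datetime, restored: datetime, rto: timedelta) -> bool:
    """True if the business process was restored within the RTO."""
    return restored - incident <= rto


# Hypothetical targets and timeline, for illustration only.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)
last_backup = datetime(2024, 1, 10, 6, 0)
incident = datetime(2024, 1, 10, 9, 30)
restored = datetime(2024, 1, 10, 19, 0)

rpo_ok = meets_rpo(last_backup, incident, rpo)  # 3.5 hours of data at risk
rto_ok = meets_rto(incident, restored, rto)     # 9.5 hours of downtime
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this made-up timeline the backup schedule satisfies the RPO, but the recovery takes longer than the RTO allows, which would suggest the recovery procedure (not the backup frequency) needs attention.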
So what steps should organizations take to develop a BC plan?
- Carry out a risk assessment: Irrespective of the organization’s size and structure, understanding the different risks and knowing how you’ll deal with them is key. This should be a team sport, and it should factor in incidents like natural disasters, cyberattacks, human error, unplanned downtime, power outages, and data corruption.
- Identify the core processes and systems within the organization: This can be achieved through a Business Impact Analysis (BIA), which is different from a risk assessment. A risk assessment helps organizations identify threats and vulnerabilities as well as their likelihood and impact. A BIA, on the other hand, helps identify critical business processes and determines how a disruption could affect the business (operationally, economically, reputationally, and/or contractually).
- Identify ways to mitigate risk as well as prepare for and recover from the loss of core elements: Once you’ve understood the risks and impacts, you can brainstorm strategies to reduce them. For example, if your region is prone to floods, you could create a policy and procedure that allows your employees to work from home. To deal with the loss of critical items, you could consider installing antivirus software, a security system, and/or fire alarms to handle natural and/or man-made incidents (these can include cyberattacks, vandalism, and other hazards).
- Create a recovery team: This team will help assess the losses and initiate the actions pertinent to recovery. It's vital that you document and communicate the different roles and responsibilities.
- Evaluate and update your plan: This plan is a living document and must reflect evolving risks. You can test it by running tabletop exercises. Here you will, typically, invite the business continuity team and the relevant stakeholders and run a mock security event. The objective is to initiate a discussion around roles and responsibilities, coordination, and the ability to make decisions. You could also conduct tests using different software and tools within the operational environment to obtain quantitative metrics. NIST SP 800-84 provides guidance on conducting such tests and exercises.
An example of a BCP template can be found here.
(Note: For more information, you could consult ISO 22301. It establishes the requirements of a business continuity management system.)
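The risk assessment step above asks you to weigh each threat's likelihood and impact. One common qualitative way to do this, assumed here rather than prescribed by any standard mentioned in this article, is to rate both on a 1 to 5 scale and multiply them. The sketch below uses a hypothetical `Risk` class and made-up ratings to show how such a ranking could be produced.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain), assumed scale
    impact: int      # 1 (negligible) to 5 (severe), assumed scale

    @property
    def score(self) -> int:
        # Simple likelihood x impact score; other weightings are possible.
        return self.likelihood * self.impact


def rank_risks(risks: list[Risk]) -> list[Risk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Made-up ratings, for illustration only.
risks = [
    Risk("Power outage", likelihood=3, impact=3),
    Risk("Ransomware attack", likelihood=2, impact=5),
    Risk("Flood at head office", likelihood=1, impact=4),
    Risk("Accidental data deletion", likelihood=4, impact=2),
]

ranked = rank_risks(risks)
for risk in ranked:
    print(f"{risk.score:>2}  {risk.name}")
```

The ranking is only as good as the ratings behind it, which is one more reason to treat the assessment as a team exercise.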
So what steps should organizations take to develop a DR plan?
- Conduct a risk assessment and a BIA: The purpose of this is the same as in the case of a BC plan. You can perform the risk assessment using a qualitative or a quantitative approach. There is no right or wrong method. Your impact assessment should factor in scenarios like, but not limited to, loss of income, recovery costs (this could cover equipment and even public relations), business downtime, fines associated with lack of compliance, and/or reputational damage.
(Note: In order to properly manage the incident, it is useful to create an inventory of the different assets. The inventory should be reviewed periodically, and it is recommended that you label each asset, for example as critical, important, or unimportant. The goal is to help the entity prioritize and apply an adequate level of protection.)
- Define suitable measures and procedures: Here you should outline the different procedures your organization should trigger in the event of a disaster. These can include backups and recovery, the use of redundant infrastructure, and alternative worksites. When it comes to backups, you might want to implement the 3-2-1 rule. This means you keep three copies of your data, stored on two different types of media (for example, in your data center and in the cloud), with one copy in an offsite location (this helps ensure that in the event of a natural disaster, for instance, your information isn’t destroyed).
- Identify roles and responsibilities as well as consider the elements of communication and reporting: Here it might be useful to create roles like an Incident Reporter (someone who notifies the relevant stakeholders, such as executive management, relevant third parties, and customers, and possibly the authorities when an undesirable event takes place), a Supervisor (someone who is accountable for ensuring that the different team members are performing their assigned tasks), and an Asset Manager (someone responsible for securing the assets during a disaster).
- Evaluate and update your plan: Just as in a BC plan, this “phase” is important and must be carried out periodically. Again, the goal is to test the plan's effectiveness. It also means that new assets must be added to the plan.
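As a rough illustration of the 3-2-1 backup rule mentioned in the steps above (three copies, two types of media, one copy offsite), the sketch below checks a backup inventory against each part of the rule. The `BackupCopy` class, the `check_3_2_1` function, and the sample inventory are hypothetical, and the sketch assumes the production copy counts as one of the three.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    location: str
    media: str      # e.g. "disk", "tape", "cloud"
    offsite: bool


def check_3_2_1(copies: list[BackupCopy]) -> dict[str, bool]:
    """Report which parts of the 3-2-1 rule the inventory satisfies."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# Made-up inventory, for illustration only.
copies = [
    BackupCopy("production server", "disk", offsite=False),
    BackupCopy("on-premises NAS", "disk", offsite=False),
    BackupCopy("cloud object storage", "cloud", offsite=True),
]

result = check_3_2_1(copies)
compliant = all(result.values())
print(result, "compliant:", compliant)
```

A check like this could run as part of the periodic plan review, so that a newly added asset without an offsite copy is flagged before a disaster rather than after.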
An example of a DR plan is the one shared by Southern Oregon University. You can find it here.
(Note: While creating a BC and DR plan, it's useful to define a scope and your objectives. It’s also important to consider risks associated with third parties and the supply chain.)
(Note: As mentioned earlier, regular testing is key. There isn’t a fixed frequency. Some organizations test these plans two to four times a year. The exact number might depend on aspects such as the industry and the organization.)
Next week we’ll discuss the importance of patching.
This article is part of a project called Security Chronicles, written jointly with Walter Buyu.