CrowdStrike Outage Risk
Yesterday, 19th July, at around 11.00 am IST, a significant global incident unfolded. My computer screen turned blue, and I noticed a ripple effect across the office. People were getting up, asking questions, and the air was filled with a sense of unease. Soon, WhatsApp messages were circulating, not just within our organization, but from different corners of the globe. It became evident that this was a problem of global proportions.
News started coming in about the impact of the CrowdStrike update on Microsoft Windows systems at airports, hospitals, TV broadcasters, and many other institutions that depend on them. The cause was a faulty security update, not a cyber-attack. Large parts of the world came to a standstill, because organisations are now so dependent on connected systems that these have become like their own bloodstream. From a risk management professional's point of view, this is a concentration risk. Concentration risk is usually studied in the context of investments, but it can arise anywhere there is high dependence on a single system whose outage can bring the organisation down.
This incident was not a cyber-attack, but it is an eye-opener for a world whose dependence on technology keeps increasing. A few years from now, the next generation may not know how to run things manually if that becomes necessary.
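One rough way to put a number on concentration risk is to borrow the Herfindahl-Hirschman index from investment and competition analysis and apply it to vendor dependence. The sketch below is only an illustration under stated assumptions: the vendor names and process counts are hypothetical, and using this index for technology dependence is my suggestion, not an established standard. A value close to 1 means almost everything depends on one vendor; a value close to 1/n means dependence is spread evenly across n vendors.

```python
def herfindahl_index(shares):
    """Return the Herfindahl-Hirschman index (0 to 1) for a list of dependency weights."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    # Normalise each weight to a share of the total, then sum the squared shares.
    return sum((s / total) ** 2 for s in shares)


# Hypothetical data: number of critical business processes that depend on each vendor.
processes_per_vendor = {"vendor_a": 8, "vendor_b": 1, "vendor_c": 1}

hhi = herfindahl_index(list(processes_per_vendor.values()))
print(round(hhi, 2))  # prints 0.66
```

With three vendors, an even spread would give about 0.33, so 0.66 signals heavy dependence on a single vendor.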
Now is the time to ask some critical questions while the impact of this incident is still fresh. Once the immediate crisis is over, it's easy for people to forget the lessons learned. We must act now to prevent future incidents and ensure we are prepared for any eventuality.
1. Many press reports point to key risks, such as data risk, where data is held outside the organisation's own jurisdiction. Many organisations across the world have moved from physical servers to cloud-based servers. When they moved to the cloud, did they analyse such risks, quantify their impact, and provide for the cost of recovery in their balance sheets? Such analysis is part of risk-based decision-making. Every organisation has a Chief Risk Officer and a Board; did they raise this risk, and is it documented? This is fundamental to risk management.
2. This is one risk that crystallised yesterday. We do not know how many such “live wires” are yet to be discovered. Risk management is about first identifying the risk, then analysing it, and then planning for mitigation. I think it is time to open our eyes and start building robust risk management practices to prevent such crises in the future.
3. It is time to move away from managing the crisis to managing the risk. Of course, implementing risk management has a cost. However, many academic studies have found that it enhances firm value. Crisis management is expensive and damages reputation.
4. There is no point in analysing the crystallised risk, finding reasons, and doing nothing. It is now time to engage in proactive risk management.
5. In the last 10-15 years, we have seen global incidents such as the 2008 economic crisis, COVID-19 in 2020, and now CrowdStrike. Do we need more global incidents to learn how to manage risk?
6. The Chief Risk Officer (CRO) role is critical, as the CRO is the only person within the organisation whose job is to look at risk; everyone else works for business growth. Regulators should strengthen the hands of CROs.
7. Let us learn how to implement risk management and reduce crisis management.
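The identify, analyse, and plan steps above, and the question of quantifying impact, can be sketched as a very small risk register. This is a minimal illustration, not a complete method: the risk names, probabilities, and impact figures are invented for the example, and multiplying probability by impact to get an expected annual loss is only one common simplification.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    annual_probability: float  # assumed chance of occurring in a year (0 to 1)
    impact: float              # assumed cost if it occurs, in a single currency

    @property
    def expected_annual_loss(self) -> float:
        # Simple quantification: probability multiplied by impact.
        return self.annual_probability * self.impact


def rank_risks(risks):
    """Return risks sorted from highest to lowest expected annual loss."""
    return sorted(risks, key=lambda r: r.expected_annual_loss, reverse=True)


# Hypothetical register entries; every figure here is illustrative only.
register = [
    Risk("Cloud region outage", 0.10, 2_000_000),
    Risk("Security vendor update failure", 0.05, 10_000_000),
    Risk("Data held outside jurisdiction", 0.02, 5_000_000),
]

for risk in rank_risks(register):
    print(f"{risk.name}: {risk.expected_annual_loss:,.0f}")
```

A ranking like this gives the CRO and the Board a documented starting point for deciding which risks to mitigate first and how much to provide for recovery.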
Life Insurance professional
The financial impact or cost of such risks is enormous: cancelled flights, loss of business as customers shift elsewhere, potential deals that cannot be processed in time, compliance issues, and volatility in the Sensex. How can we estimate that? Indian Railways had an in-house system, so it was not impacted, but tomorrow that system could be hacked or hit a bug. I think there will always be some residual risk that we cannot cover, as we will never be able to imagine all possible scenarios. AI models can help here.