Software Updates Shouldn't Be a Roll of the Dice: It's Time to Make Provenance Non-Negotiable
The CrowdStrike Outage: A Call for Provenance and Attestations in Software Update Management
In July 2024, a routine Falcon content update from CrowdStrike crashed Windows machines, sending them into boot loops and blue screens of death. CrowdStrike acknowledged the problem and attributed it to a defect in the update. The company promptly reverted the update and published remediation guidance. However, the incident caused significant disruptions for organizations relying on CrowdStrike's security solutions. Delta Air Lines, for example, experienced prolonged operational disruptions that impacted passengers.
Although CrowdStrike's quick response lessened the immediate impact of the outage, the incident raises questions about how such a critical flaw could have slipped through their development and testing processes and how customers choose to apply updates from their suppliers. It highlights the need for a more comprehensive approach to software update management that goes beyond traditional testing and phased rollouts.
Best Practices: A Multi-Layered Approach
The growing list of update-related outages makes one point abundantly clear: organizations should apply best practices to every update, such as phased rollouts and testing in isolated environments.
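To make the idea of a phased rollout concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how CrowdStrike or any other vendor stages updates. The ring names, the percentages, and the functions `bucket_for`, `ring_for`, and `should_receive` are all assumptions made for this example. Each host is hashed into a stable bucket, so the same small group of machines always receives an update first.

```python
import hashlib

# Hypothetical ring layout: cumulative percentage of the fleet reached by each phase.
ROLLOUT_RINGS = [("canary", 1), ("early", 10), ("broad", 50), ("everyone", 100)]


def bucket_for(host_id: str) -> int:
    """Map a host to a stable bucket in 0..99 using a hash of its ID."""
    digest = hashlib.sha256(host_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def ring_for(host_id: str) -> str:
    """Return the first ring whose cumulative percentage covers the host's bucket."""
    bucket = bucket_for(host_id)
    for name, cumulative_percent in ROLLOUT_RINGS:
        if bucket < cumulative_percent:
            return name
    raise AssertionError("unreachable: the last ring covers every bucket")


def should_receive(host_id: str, current_ring: str) -> bool:
    """True if the host belongs to the current ring or to an earlier one."""
    order = [name for name, _ in ROLLOUT_RINGS]
    return order.index(ring_for(host_id)) <= order.index(current_ring)


if __name__ == "__main__":
    fleet = [f"host-{i}" for i in range(1000)]
    for ring, _ in ROLLOUT_RINGS:
        reached = sum(should_receive(host, ring) for host in fleet)
        print(f"{ring}: {reached} of {len(fleet)} hosts")
```

Because the assignment depends only on the host ID, widening the rollout from one ring to the next never removes a host that already has the update. A defect that surfaces in the canary ring stays confined to that small slice of the fleet.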
While traditional update best practices remain essential, the CrowdStrike outage highlights the need for a deeper level of insight and control. Software provenance
In the case of CrowdStrike, provenance data could have illuminated the specific changes introduced in the faulty update, potentially revealing the error before deployment. Moreover, provenance would have enabled a more surgical rollback, pinpointing the exact problematic version and facilitating a quicker recovery for all affected parties.
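As a rough sketch of what checking an update against its provenance could look like, consider the Python example below. Everything in it is an assumption made for illustration: the `ProvenanceRecord` fields, the `check_update` function, and the sample values are hypothetical and do not reflect CrowdStrike's pipeline or any particular attestation format. A real system would also verify a cryptographic signature on the record itself, which is omitted here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    # Hypothetical, simplified fields; real attestation formats carry more detail.
    artifact_name: str
    version: str
    sha256: str
    source_commit: str
    tests_passed: bool


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def check_update(artifact: bytes, record: ProvenanceRecord) -> list[str]:
    """Return a list of reasons to block deployment; an empty list means none were found."""
    problems = []
    if sha256_of(artifact) != record.sha256:
        problems.append("digest mismatch: artifact is not the one the record describes")
    if not record.tests_passed:
        problems.append("record does not attest that tests passed")
    return problems


if __name__ == "__main__":
    payload = b"example update payload"
    record = ProvenanceRecord(
        artifact_name="example-content-update",  # hypothetical name
        version="1.2.3",
        sha256=sha256_of(payload),
        source_commit="0123abcd",  # hypothetical commit ID
        tests_passed=True,
    )
    print(check_update(payload, record))            # no problems found
    print(check_update(payload + b"tampered", record))  # digest mismatch reported
```

The same record that gates deployment also names the exact version and source commit, which is the information a targeted rollback needs.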
This incident is not unique. The SolarWinds supply chain attack of 2020, where malicious code was injected into software updates, underscores the devastating consequences of compromised software. Provenance would have helped detect the unauthorized modifications and alerted organizations to the breach. Similarly, the Log4j vulnerability in 2021, affecting countless applications due to its widespread use, could have been more effectively contained with provenance information, enabling rapid identification of vulnerable systems.
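The "rapid identification of vulnerable systems" mentioned above comes down to a lookup once every deployed system has a recorded list of components. The Python sketch below assumes such an inventory exists. The `INVENTORY` data, the service names, and the `hosts_running` function are hypothetical, and the plain "older than" version comparison is a simplification that ignores the exact affected ranges a real advisory would specify.

```python
from typing import NamedTuple


class Component(NamedTuple):
    name: str
    version: str


# Hypothetical inventory: which components each deployed system contains.
INVENTORY = {
    "billing-api": [Component("log4j-core", "2.14.1"), Component("jackson-databind", "2.13.0")],
    "search-service": [Component("log4j-core", "2.17.1")],
    "batch-runner": [Component("commons-lang3", "3.12.0")],
}


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '2.14.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def hosts_running(inventory, component_name: str, fixed_in: str) -> list[str]:
    """Return systems that contain the component at a version older than fixed_in."""
    threshold = parse_version(fixed_in)
    affected = []
    for system, components in inventory.items():
        for component in components:
            if component.name == component_name and parse_version(component.version) < threshold:
                affected.append(system)
                break
    return sorted(affected)


if __name__ == "__main__":
    # The threshold is passed in by the caller; "2.15.0" is used here for illustration.
    print(hosts_running(INVENTORY, "log4j-core", "2.15.0"))  # ['billing-api']
```

Without the inventory, the same question means inspecting every system by hand, which is why the lookup is only as good as the provenance data behind it.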
By incorporating provenance into their software update strategies, organizations can:
Incorporating provenance not only strengthens the update process but also empowers organizations to make informed decisions about the software they deploy, ultimately fostering a more secure and resilient digital ecosystem.
Critical Elements of Software Provenance:
Why is Software Provenance Important?
How is Software Provenance Achieved?
The Challenge of Vendor Resistance
While the benefits of provenance are clear, some software vendors have been reluctant to embrace it fully. This reluctance stems from concerns about exposing proprietary information, potential legal liabilities, and the added complexity of implementing provenance systems. However, as incidents like the CrowdStrike outage and the SolarWinds attack demonstrate, the lack of provenance can have severe consequences for vendors and their customers.
CrowdStrike recognizes the importance of DevOps maturity
As bad as it was, the impact of this event was mostly inconvenience. The next one could be far worse. A targeted attack by a sophisticated, state-sponsored actor, often referred to as an “advanced persistent threat” (APT), could wreak havoc far in excess of what we just experienced. APT groups could analyze the CrowdStrike update process to identify similar vulnerabilities in other update mechanisms, such as operating system updates, antivirus updates, firmware updates, and intrusion detection systems. Once an attack succeeds, APT groups could simultaneously spread disinformation about the outage, exaggerating its impact to create additional panic. Or they could aim their attacks at systems whose failure could cause more damage or be life-threatening, such as hospital systems, power generation, or water supplies.
Conclusion
As the world moves towards a more integrated software supply chain, we need to adopt provenance and attestation technologies to enhance the trustworthiness of our updates. By embracing these technologies, organizations can reduce the risk of future outages and build greater confidence in the software they rely on.
The CrowdStrike outage will hopefully catalyze change in software update management. This incident is another call to action for companies to demand a more proactive and comprehensive approach to software update management from their suppliers, ensuring the trust and reliability of critical software components. It's important to our financial security, to our customer service, and to our national security.
We just dodged a bullet. We may not be so lucky next time.