Our Assumed Trust in Cyber Security Vendors

Imagine you are a security manager being asked to do a security assessment on a new software for your organisation. It will be deployed across all Windows workstations and servers and operate as a boot-start driver in kernel mode, granting it extensive access to the system. The driver has been signed by Microsoft’s Windows Hardware Quality Labs (WHQL), so it is considered robust and trustworthy. However, additional components that the driver will use are not included in the certification process. These components are updates that will be regularly downloaded from the internet. As a security manager, would you have any concerns?

I would be, but what if it were a leading global cybersecurity vendor? Do we have too much assumed and transitive trust in cybersecurity vendors?

The recent CrowdStrike Blue Screen of Death (BSOD) incident has raised significant concerns about the security and reliability of kernel-mode software, even when certified by trusted authorities. On July 19, 2024, a faulty update from CrowdStrike, a widely used cybersecurity provider, caused thousands of Windows machines worldwide to experience BSOD errors, affecting banks, airlines, TV broadcasters, and numerous other enterprises.

This incident highlights a critical issue that security managers must consider when assessing new software, particularly those operating in kernel mode. CrowdStrike’s Falcon sensor, while signed by Microsoft’s Windows Hardware Quality Labs (WHQL) as robust and trustworthy, includes components that are downloaded from the internet and not part of the WHQL certification process.

The CrowdStrike software operates as a boot-start driver in kernel mode, granting it extensive system access. It relies on externally downloaded updates to maintain quick turnaround times for malware definition updates. While the exact nature of these update files is unclear, they could potentially contain executable code for the driver or merely malware definition files. If these updates include executable code, it means unsigned code of unknown origin is running with full kernel-mode privileges, posing a significant security risk.

The recent BSOD incident suggests that the CrowdStrike driver may lack adequate resilience, with insufficient error checking and parameter validation. This became evident when a faulty update caused widespread system crashes, indicating that the software’s error handling mechanisms could not prevent catastrophic failures.

For security managers, this incident serves as a stark reminder of the potential risks associated with kernel-mode software, even when it comes from reputable sources. It underscores the need for thorough assessments of such software, paying particular attention to:

1. Update mechanisms and their security implications

2. The scope of WHQL certification and what it does and does not cover

3. Error handling and system stability safeguards

4. The potential impact of software failures on critical systems

While CrowdStrike has since addressed the issue and provided fixes, the incident has caused significant disruptions across various sectors. It has also prompted discussions about balancing rapid threat response capabilities and system stability in cybersecurity solutions.

In conclusion, this event emphasises the importance of rigorous security assessments for kernel-mode software, regardless of its certifications or reputation. Security managers must carefully weigh the benefits of such software against the potential risks they introduce to system stability and security.