I did my FAIR analysis fundamentals course a few years ago and here are my thoughts on it.
FAIR stands for Factor Analysis of Information Risk, and it is the only international standard quantitative model for information security and operational risk (https://www.fairinstitute.org/).
My interest in learning more about FAIR came from two observations.
The first was that we had many definitions of what constitutes risk. We refer to “script-kiddies” as risks. Not having a security control is referred to as a risk. SQL injection is a risk. We even said things like “How much risk is there with this risk?”
The other observation was about our approach to quantifying risk. We derived the level of risk from likelihood and impact, and it was sometimes hard to get agreement on those values.
Having completed the course, one of the things I like about FAIR is its definitions: what a risk is, and what a risk statement must include. It should include an asset, a threat, and an effect, with an optional method. An example of a risk is the probability of malicious internal users impacting the availability of our customer booking system via denial of service.
It uses future loss as the unit of measurement rather than a rating of critical, high, medium or low. The value of future loss is expressed as a range with a most likely value, along with the confidence level of that most likely value. As such, it focuses on accuracy rather than precision. I quite like that, as it makes risk easier to understand and compare. Reporting that a risk has a 1-in-2-year probability of happening, with a loss between $20K and $50K but most likely $30K, is a lengthy statement. However, it is more tangible and makes more sense than reporting that the risk is a High Risk.
Now it sounds like I’m all for FAIR, but I have some reservations. The main one is that there isn’t always data available to produce such an empirical result. Risk, according to FAIR, is calculated by multiplying loss frequency (the number of times a loss event will occur in a year) by loss magnitude (the dollar range of loss from productivity, replacement, response, compliance and reputation). It is hard to come up with a loss frequency value when there is no past data to base it on; I would be guessing the value rather than estimating it. FAIR suggests estimating a subgroup if there isn’t enough reliable data available, but I see the same problem there. The subgroup for loss frequency is the number of times threat actors attempt to affect the asset, multiplied by the percentage of attempts that are successful. Unless you have that data, that is no easier to determine.
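To make the frequency-times-magnitude idea concrete, here is a minimal Monte Carlo sketch of that calculation. This is not the FAIR Institute's official tooling; it models each calibrated range (minimum, most likely, maximum) with a simple triangular distribution as a stand-in for the PERT distributions FAIR practitioners typically use, and the numbers plugged in are the example from earlier in this post.

```python
import random

def simulate_annual_loss(freq_min, freq_likely, freq_max,
                         mag_min, mag_likely, mag_max,
                         trials=100_000, seed=42):
    """Monte Carlo sketch of a FAIR-style calculation:
    annual loss = loss event frequency x loss magnitude.

    Each input is a calibrated range (minimum, most likely, maximum),
    modelled here with a triangular distribution for simplicity.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(trials):
        freq = rng.triangular(freq_min, freq_max, freq_likely)  # events/year
        # Turn a fractional frequency into a whole number of events this year
        events = int(freq) + (1 if rng.random() < freq - int(freq) else 0)
        loss = sum(rng.triangular(mag_min, mag_max, mag_likely)
                   for _ in range(events))
        losses.append(loss)
    losses.sort()
    return {"mean": sum(losses) / trials,
            "p50": losses[trials // 2],
            "p90": losses[int(trials * 0.9)]}

# The example from the post: roughly a 1-in-2-year event (0.2-0.8 events/year,
# most likely 0.5) with a loss of $20K-$50K, most likely $30K.
result = simulate_annual_loss(0.2, 0.5, 0.8, 20_000, 30_000, 50_000)
print(result)
```

The output is a loss distribution rather than a single number, which is exactly what makes the result comparable across risks; the weak link remains the quality of the input ranges, as discussed above.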
Overall it still feels like a much better way of quantifying risk. I’ll end with a quote from the instructor. “Risk statements should be of probability, not of predictions or what’s possible.” It resonated with me as it is something I too often forget.
Imagine you are a security manager being asked to do a security assessment on a new software for your organisation. It will be deployed across all Windows workstations and servers and operate as a boot-start driver in kernel mode, granting it extensive access to the system. The driver has been signed by Microsoft’s Windows Hardware Quality Labs (WHQL), so it is considered robust and trustworthy. However, additional components that the driver will use are not included in the certification process. These components are updates that will be regularly downloaded from the internet. As a security manager, would you have any concerns?
I would, but what if it were a leading global cybersecurity vendor? Do we place too much assumed and transitive trust in cybersecurity vendors?
The recent CrowdStrike Blue Screen of Death (BSOD) incident has raised significant concerns about the security and reliability of kernel-mode software, even when certified by trusted authorities. On July 19, 2024, a faulty update from CrowdStrike, a widely used cybersecurity provider, caused thousands of Windows machines worldwide to experience BSOD errors, affecting banks, airlines, TV broadcasters, and numerous other enterprises.
This incident highlights a critical issue that security managers must consider when assessing new software, particularly those operating in kernel mode. CrowdStrike’s Falcon sensor, while signed by Microsoft’s Windows Hardware Quality Labs (WHQL) as robust and trustworthy, includes components that are downloaded from the internet and not part of the WHQL certification process.
The CrowdStrike software operates as a boot-start driver in kernel mode, granting it extensive system access. It relies on externally downloaded updates to maintain quick turnaround times for malware definition updates. While the exact nature of these update files is unclear, they could potentially contain executable code for the driver or merely malware definition files. If these updates include executable code, it means unsigned code of unknown origin is running with full kernel-mode privileges, posing a significant security risk.
The recent BSOD incident suggests that the CrowdStrike driver may lack adequate resilience, with insufficient error checking and parameter validation. This became evident when a faulty update caused widespread system crashes, indicating that the software’s error handling mechanisms could not prevent catastrophic failures.
For security managers, this incident serves as a stark reminder of the potential risks associated with kernel-mode software, even when it comes from reputable sources. It underscores the need for thorough assessments of such software, paying particular attention to:
1. Update mechanisms and their security implications
2. The scope of WHQL certification and what it does and does not cover
3. Error handling and system stability safeguards
4. The potential impact of software failures on critical systems
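Point 3 is worth making concrete. The sketch below is hypothetical (the file format, magic bytes and field layout are invented for illustration, not CrowdStrike's actual update format), but it shows the kind of defensive parsing any component consuming externally supplied updates should do: every header field is validated before the payload is touched, so a corrupt or truncated file is rejected cleanly instead of crashing the host.

```python
import struct

MAGIC = b"UPDT"                    # hypothetical magic bytes for an update file
HEADER = struct.Struct("<4sII")    # magic, format version, payload length

def load_update(blob: bytes) -> bytes:
    """Defensively parse a hypothetical content-update blob.

    Every header field is checked before the payload is used; malformed
    input raises ValueError for the caller to handle, rather than being
    dereferenced blindly.
    """
    if len(blob) < HEADER.size:
        raise ValueError("update too short to contain a header")
    magic, version, length = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic bytes")
    if version != 1:
        raise ValueError(f"unsupported format version {version}")
    if length != len(blob) - HEADER.size:
        raise ValueError("declared payload length does not match file size")
    return blob[HEADER.size:]

good = MAGIC + struct.pack("<II", 1, 3) + b"abc"
print(load_update(good))                 # b'abc'
try:
    load_update(b"\x00" * 16)            # corrupt file: wrong magic bytes
except ValueError as err:
    print("rejected:", err)              # rejected: bad magic bytes
```

In user-mode Python a ValueError is trivially recoverable; the point is that in kernel mode the equivalent failure path must be equally graceful, because there is no outer layer left to catch the crash.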
While CrowdStrike has since addressed the issue and provided fixes, the incident has caused significant disruptions across various sectors. It has also prompted discussions about balancing rapid threat response capabilities and system stability in cybersecurity solutions.
In conclusion, this event emphasises the importance of rigorous security assessments for kernel-mode software, regardless of its certifications or reputation. Security managers must carefully weigh the benefits of such software against the potential risks they introduce to system stability and security.
Cybersecurity and the Circular Economy (CE) are not terms usually taken together. Cybersecurity is often associated with hacking, loss of privacy or phishing, while CE is about climate change and environmental protection. However, cybersecurity can learn quite a few things from CE, and this post will focus on what CE can teach us about cybersecurity sustainability.
In the times we live in, our economy depends on taking materials from the Earth’s natural resources, creating products that we use, misuse, and eventually discard as waste. This linear process creates tons of waste every day, presenting sustainability, environmental, and climate change challenges. CE, on the other hand, strives to stop this waste & pollution, retrieve & circulate materials, and, more importantly, recharge & regenerate nature. Renewable energy and materials are key components of CE. It is a resilient system that decouples economic activity from the consumption of finite resources.
CE is not a new concept, but it was popularised by the British sailor Ellen MacArthur, whose charity advises governments and organisations on CE. Her foundation’s “butterfly diagram” illustrates the continuous flow of materials within the economy. As the diagram shows, CE has two main cycles: the technical cycle and the biological cycle. In the technical cycle, materials are repaired, reused, repurposed, and recycled to keep products circulating in the economy. In the biological cycle, biodegradable organic materials are returned to the Earth through decomposition, allowing nature to regenerate and the cycle to continue.
As noted above, the lack of CE can be devastating for the planet. The humongous amount of waste humans produce and leave lying around is unsustainable and devastating for humans and the other inhabitants of Earth. Similarly, with the ever-increasing cost of cyber-attack breaches, businesses are vulnerable to extinction. According to the Cost of a Data Breach Report 2021, commissioned by IBM Security and the Ponemon Institute, the cost of breaches increased by 10% in 2021, the largest single-year increase. Lost business represents 38% of breach costs, driven by customer turnover, revenue loss, downtime, and the increased cost of acquiring new business due to a diminished reputation.
Sustainability is about using and/or reusing something for an extended period without reducing its capability, from short- to long-term perspectives. Cybersecurity is sustainable if the implemented security resources do not degrade or become ineffective over time at mitigating security threats. Achieving sustainability is not easy and, most certainly, not cheap. Organisations must take a principle-based approach to cybersecurity. Just as sustainability is considered from the ground up in CE’s manufacturing processes, security must be part of the design and production phases of a product. A system should be reliable enough to provide its stated function; for example, a firewall should keep blocking attacks even after a hardware failure, or after a hacker exploits a zero-day elsewhere in your environment.
By nature, digital systems produce an enormous amount of data, including security-specific signals. Unfortunately, finding a needle in that haystack is challenging and often overwhelmingly laborious. In CE, we have found ways to segregate different types of waste right at the source, making it easier to collect, recycle and repurpose them faster. Similarly, systems should be designed to separate security-relevant data from other information at the source, rather than leaving it to the security systems. This segregation at source helps reduce false positives and negatives, providing reliable and accurate information that can be used for protection. The improved data accuracy also helps prioritise response and recovery activities during a security incident.
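As a small illustration of tagging security-relevant events at the source, the sketch below uses Python's standard logging module: the application marks security signals as it emits them, and a filter routes only those records to a dedicated security stream. The logger and handler names are illustrative, not from any particular product.

```python
import logging

class SecurityFilter(logging.Filter):
    """Pass only records the application explicitly tagged as security-relevant."""
    def filter(self, record):
        return getattr(record, "security", False)

app_log = logging.getLogger("app")
app_log.setLevel(logging.INFO)

general = logging.StreamHandler()       # everything, for operations teams
security = logging.StreamHandler()      # security signals only
security.addFilter(SecurityFilter())
security.setFormatter(logging.Formatter("SECURITY | %(message)s"))
app_log.addHandler(general)
app_log.addHandler(security)

# Tagged at the source, so downstream tools never have to sift for it
app_log.info("user profile rendered")                               # general stream only
app_log.info("failed login for admin", extra={"security": True})    # also hits the security stream
```

The design choice mirrors waste segregation: the producer of the event, which has the most context, decides what is security-relevant, instead of a downstream SIEM trying to reconstruct that intent from mixed telemetry.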
CE’s design principles clearly define its two distinct cycles (technical and biological), as mentioned above, to deal with non-biodegradable and biodegradable materials. The technical cycle maintains a product’s value, where possible, by repairing, reusing, or recycling non-biodegradable materials, while the biological cycle returns materials to nature through processes such as composting. In cybersecurity, despite “Secure by Design” having been a well-known principle for a long time, systems, including security products and platforms, often ignore it in the name of convenience and ease of use. Any decent security architecture should ensure that the design process inherently includes threat modelling to assess risks, and that the implemented systems are modular, retaining their value for as long as possible. This will help guarantee that cybersecurity products, platforms and services produce the desired outcome and align with the organisation’s business requirements. There should always be an option to repurpose or recycle components to preserve the return on security investment.
The technical cycle in CE is resilient and adapts dynamically to change. As discussed above, CE is predominantly detached from economic conditions, and a product continues to hold value until it can no longer be repaired, reused or repurposed. If a product or component can’t be used, its materials can be recycled into new products, recovering and preserving their value. Cyber resiliency is not new, but it is being contextualised in recent times by redefining its outcome. As we know, cyber threat paradigms are continually changing, and only resilient systems can withstand such dynamics. Resilient cybersecurity helps an organisation recover efficiently from known or unknown security breaches. Like CE’s technical cycle, achieving effective resiliency takes a long time: first, baseline cybersecurity controls are implemented and maintained. Redundancy and resiliency go hand in hand, so redundancy should be included by design.
I am sure we can learn many more things from CE to set up a sustainable and resilient cybersecurity program that is self-healing and self-organising, ensuring that systems can stop security breaches. So I would like to know: what else can we learn from CE?
I am writing this post in a week when we saw the most significant IT outage ever. A content update in the CrowdStrike sensor caused a blue screen of death (BSOD) on Microsoft Operating systems. The outage resulted in a large-scale disruption of everything from airline travel and financial institutions to hospitals and online businesses.
At the beginning of the week, I delved into the transformation in software developers’ mindsets over the last few decades. However, as the root cause of this incident came to light, the article shifted from analysing the perpetual clash between practice domains to advocating best practices that enhance software quality and security.
Developers and security teams were often seen as opposed over the first decade of the millennium. This is not because developers did not want to do the right thing, but because of a lack of a collaborative mindset between security practitioners and developers. Even though we have seen a massive shift with the adoption of DevSecOps, there are still gaps in the mature integration of the software development lifecycle, cybersecurity and IT operations.
The CrowdStrike incident offers several valuable lessons for software developers, particularly in strengthening software development cybersecurity programs. Here are some key takeaways:
Secure Software Development Lifecycle (SDLC)
Security by Design: Security needs to be integrated into every phase of the SDLC, from design to deployment. Developers must embrace secure coding practices, conduct regular code reviews, and use automated quality and security testing tools.
Threat Modelling: Consistently engaging in threat modelling exercises is crucial for uncovering potential vulnerabilities and attack paths, ultimately enabling developers to design more secure systems.
DevSecOps: Incorporate security into the DevOps process to ensure continuous security checks and balances throughout the software development lifecycle.
Collaboration and Communication
Cross-Functional Teams: Encouraging collaboration among development, security, and operations teams (DevSecOps) is crucial for enhancing security practices and achieving swift incident response times.
Clear Communication Channels: Establishing clear reporting and communication channels can help ensure a coordinated and efficient response.
Security Training and Awareness: Regular training sessions on the latest security trends, threats, and best practices are vital for staying ahead in today’s ever-changing digital landscape. Developers recognise the need for ongoing education and understand the importance of staying updated on evolving security landscapes.
Impact on Development Speed
Balancing Security and Agility: Developers value security measures that are seamlessly integrated into the development cycle. This allows for efficient development without compromising on speed or agility. Implement security processes that strike a balance between robust protection and minimal disruption to the development workflow.
Early Involvement: It is crucial to incorporate security considerations from the outset of the development process to minimise extensive rework and delays in the future.
Importance of Incident Detection and Response
Preparedness for Security Incidents: Developers should recognise the need for a robust incident response plan to quickly and effectively address security breaches. They should also ensure that their applications and systems can log security events and generate alerts for suspicious activities.
Swift Incident Response: It is important to have a well-defined incident response plan in place. It is crucial for developers to be well-versed in the necessary steps to take when they detect a security breach, including containment, eradication, and recovery procedures.
Supply Chain Security and Patch Management
Third-Party Risks and Software Integrity: Developers must diligently vet and update third-party components. To effectively prevent the introduction of malicious code, robust measures must be implemented to verify software integrity and updates. This includes mandating cryptographic signing for all software releases and updates.
Timely and bug-free Updates: It is essential to ensure that all software components, including third-party libraries, are promptly updated with the latest security patches. Developers must establish a robust process to track, test, and apply these updates without delay.
Automated Patch Deployment: Automating the patch management process can reduce the risk of human error and ensure that updates are applied consistently across all systems.
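Real deployments should rely on cryptographic signatures (e.g. code-signing certificates) for update verification, but even the simpler end of integrity checking can be sketched: pin the expected digest of each update artifact, obtained out of band from a trusted manifest, and fail closed on any mismatch or unknown file. The file name and digest below are invented for illustration.

```python
import hashlib
import hmac

# Hypothetical manifest of update artifacts and their expected SHA-256 digests,
# obtained out of band (e.g. from a signed release manifest).
PINNED_DIGESTS = {
    "content-update-291.bin":
        "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def verify_update(name: str, blob: bytes) -> bool:
    """Refuse to apply an update whose digest does not match the pinned value."""
    expected = PINNED_DIGESTS.get(name)
    if expected is None:
        return False                      # unknown artifact: fail closed
    actual = hashlib.sha256(blob).hexdigest()
    # Constant-time comparison avoids leaking how much of the digest matched
    return hmac.compare_digest(actual, expected)

print(verify_update("content-update-291.bin", b"test"))      # True
print(verify_update("content-update-291.bin", b"tampered"))  # False
print(verify_update("never-seen-before.bin", b"test"))       # False
```

The key property is failing closed: an artifact that is unknown, tampered with, or simply corrupt in transit never reaches the deployment step, which is exactly the gate an automated patch pipeline needs before it pushes anything fleet-wide.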
Continuous Improvement
Regular Security Audits: Regular security audits and assessments effectively identify and address vulnerabilities before they can be exploited.
Feedback Loops: Integrating feedback loops to analyse past incidents and strengthen security practices can significantly elevate the overall security posture over time.
In conclusion, the recent IT outage resulting from the CrowdStrike incident unequivocally emphasises the critical need for robust cybersecurity in software development. Implementing secure coding practices, fostering collaboration between development, security, and operations teams, and giving paramount importance to proactive incident response and patch management can undeniably elevate system security. Regular security audits and continuous improvement are imperative to stay ahead in the ever-evolving digital landscape. Looking ahead, the insights drawn from this incident should galvanise a unified effort to seamlessly integrate security into the software development lifecycle, thereby ensuring the resilience and reliability of digital systems against emerging threats.