Strengthening Software Development Security: Critical Takeaways from the CrowdStrike Incident
I am writing this post in a week when we saw the most significant IT outage ever. A content update in the CrowdStrike sensor caused a blue screen of death (BSOD) on Microsoft Operating systems. The outage resulted in a large-scale disruption of everything from airline travel and financial institutions to hospitals and online businesses.
At the beginning of the week, I delved into the transformation in software developers’ mindsets over the last few decades. However, as the root cause of this incident came to light, the article transitioned from analysing the perpetual clash between practice domains to advocating for best practices to enhance software quality and security.
Developers and security teams were often seen as opposed to security practices over the millennium’s first decade. This is not because they did not want to do the right thing but because of a lack of a collaborative mindset among security practitioners and developers. Even though we have seen a massive shift with the adoption of DevSecOps, there are still some gaps and mature integration of software development lifecycle, Cybersecurity and IT operations.
The CrowdStrike incident offers several valuable lessons for software developers, particularly in strengthening software development cybersecurity programs. Here are some key takeaways:
Secure Software Development Lifecycle (SDLC)
Security by Design: Security needs to be integrated into every phase of the SDLC, from design to deployment. Developers must embrace secure coding practices, conduct regular code reviews, and use automated quality and security testing tools.
Threat Modelling: Consistently engaging in threat modelling exercises is crucial for uncovering potential vulnerabilities and attack paths, ultimately enabling developers to design more secure systems.
DevSecOps: Incorporating security into the DevOps process to ensure continuous security checks and balances throughout the software development lifecycle.
Collaboration and Communication
Cross-Functional Teams: Encouraging collaboration among development, security, and operations teams (DevSecOps) is crucial for enhancing security practices and achieving swift incident response times.
Clear Communication Channels: Establishing clear channels for reporting and communication channels can help ensure a coordinated and efficient response.
Security Training and Awareness: Regular training sessions on the latest security trends, threats, and best practices are vital for staying ahead in today’s ever-changing digital landscape. Developers recognise the need for ongoing education and understand the importance of staying updated on evolving security landscapes.
Impact on Development Speed
Balancing Security and Agility: Developers value security measures that are seamlessly integrated into the development cycle. This allows for efficient development without compromising on speed or agility. Implement security processes that strike a balance between robust protection and minimal disruption to the development workflow.
Early Involvement: It is crucial to incorporate security considerations from the outset of the development process to minimise extensive rework and delays in the future.
Importance of Incident Detection and Response
Preparedness for Security Incidents: Developers should recognise the need for a robust incident response plan to quickly and effectively address security breaches. They should also ensure that their applications and systems can log security events and generate alerts for suspicious activities.
Swift Incident Response: It is important to have a well-defined incident response plan in place. It is crucial for developers to be well-versed in the necessary steps to take when they detect a security breach, including containment, eradication, and recovery procedures.
5. Supply Chain Security and Patch Management
Third-Party Risks and Software Integrity: Developers must diligently vet and update third-party components. To effectively prevent the introduction of malicious code, robust measures must be implemented to verify software integrity and updates. This includes mandating cryptographic signing for all software releases and updates.
Timely and bug-free Updates: It is essential to ensure that all software components, including third-party libraries, are promptly updated with the latest security patches. Developers must establish a robust process to track, test, and apply these updates without delay.
Automated Patch Deployment: Automating the patch management process can reduce the risk of human error and ensure that updates are applied consistently across all systems.
Continuous Improvement
Regular Security Audits: Regular security audits and assessments effectively identify and address vulnerabilities before they can be exploited.
Feedback Loops: Integrating feedback loops to analyse past incidents and strengthen security practices can significantly elevate the overall security posture over time.
In conclusion, the recent IT outage resulting from the CrowdStrike incident unequivocally emphasizes the critical need for robust cybersecurity in software development. Implementing secure coding practices, fostering collaboration between development, security, and operations teams, and giving paramount importance to proactive incident response and patch management can undeniably elevate system security. Regular security audits and continuous improvement are imperative to stay ahead in the ever-evolving digital landscape. Looking ahead, the insights drawn from this incident should galvanise a unified effort to seamlessly integrate security into the software development lifecycle, thereby ensuring the resilience and reliability of digital systems against emerging threats.
On 10th Dec 2021, a zero-day vulnerability was announced in Apache’s Log4j library, which has made Log4shell one of the most severe vulnerabilities since Heartbleed. Exploiting this vulnerability is trivial, and therefore we have seen new exploits daily since the announcement. Some of us will be spending this holiday period mitigating this vulnerability.
Since the announcement last weekend, a lot has been written about Log4Shell. Researchers are finding new exploits in the wild and are adjusting the response. I am not trivialising the extent and impact of this vulnerability with the title of this post. Still, I would like to suggest taking a step back, bringing some calm and strategising the mitigation plan. We are in the early stages of the response, and if the past week is any indication, we are here for the long haul.
In this post, I will be focussing on the two aspects of this zero-day. Technical aspects, for sure, is paramount and requires immediate attention. However, the long-term governance is equally important and will ensure that we are not blindsided with that one insignificant application, which was ignored or seen as low-risk.
So, what is Log4Shell vulnerability?
Apache’s Log4j API1, an open-source Java-based logging audit framework, is commonly used by many apps and services. As a result, an attacker can use a well-crafted exploit to break into the target system, steal credentials and logins, infect networks, and steal data. Due to the extent of the use of this library, the impact is far-reaching. In addition, log4j is used worldwide across software applications and online services, and the vulnerability requires very little expertise to exploit. These far-reaching consequences make Log4shell potentially the most severe computer vulnerability in years.
The “Log4Shell” (CVE-2021–44228) is the name given to the vulnerability in the Log4J library. Apache Log4j2 2.14.1 and below are susceptible to a remote code execution vulnerability where a remote attacker can leverage this vulnerability to take full control of a vulnerable machine. The Log4Shell vulnerability is exploited by injecting a JNDI2 LDAP3 string into the logs, triggering Log4j to contact the specified LDAP server for more information.
In a malicious scenario, the attacker can use the LDAP server to serve the malicious code back to the victim’s machine, which will then be automatically executed in the memory. Data injected by an untrusted entity for merely logging into a file can take over the logging server. What this means for you is an instruction to log activity, but if exploited can soon become a data-leak scenario or run the malicious code for once scenario.
Simply, an event log intended and required for completeness could turn into a malware implantation event. This is nasty and requires taking all necessary steps to ensure that you don’t fall victim to this malicious scenario.
Am I affected?
Overwhelmingly “yes”, unless proven otherwise. Almost every software or service will have some sort of logging capability. Software’s behaviours are logged for development, operational and security purposes. Apache’s Log4j is a very common component used for this purpose.
For individuals, Log4jshell will most certainly impact you. Most devices and services you use online daily will be impacted. Keep an eye on the updates and instructions from the vendors of these devices and services for the next few days and weeks. As soon as the vendor releases a patch, update your devices and services to mitigate the risk associated with this vulnerability.
For businesses, it is going to be very tricky, and the true impact may not be clear immediately. In addition, even though Apache has already recommended upgrading to Version 2.17, there may be various implementations of the Log4J library. So again, keep an eye on the vendors releasing patches and installing as soon as possible.
How to find if your server is impacted or not?
The answer to this question is not straightforward. It is challenging to find if a given server is affected or not by the vulnerability in your network. You might assume that only the public-facing servers running code written in Java handle incoming requests handled by Java software and the Java runtime libraries. Then, for sure, you can consider yourself safe if the frontend is built products such as Apache’s-HTTPd web server, Microsoft IIS, or Nginx as all these servers are coded in C or C++.
As more information is coming on the breadth and depth of this vulnerability, it looks the Log4Shell is not limited to servers coded in Java. Since it is not the TCP-based socket handling code vulnerability, it can stay hidden in the network where user-supplied data is processed, and logs are kept even if the frontend is a non-java platform, you may get caught between what you know and all those third-party java libraries that might make part of the overall application code vulnerable to this vulnerability.
Ideally, every application on your network must be evaluated that is written in Java for the Log4j library. You can take the following two approaches:
Search for Vulnerable Code: Initiate a search for vulnerable code by scanning all servers and applications for vulnerable versions of Log4j libraries. Since Log4j code could be buried deep inside a Java class, a basic search for Log4j will not be good enough. To be certain, you may have to use additional tools and techniques. There are two (2) open-source scanning tools available that can list out code versions or vulnerable code: • Grype (https://github.com/anchore/grype) — Searches libraries installed on a system and displays vulnerabilities present • Syft (https://github.com/anchore/syft) — Searches for installed code and libraries and displays their versions
2. Active Scanning of Deployed Code: Nessus with updated plugins can be used for active vulnerability scanning to identify if the vulnerability exists or not. Some security vendors have also set up public websites to conduct minimal testing against your environment. Following are some of the open-source and commercial tools that can be used for the active scanning:
How can I mitigate Log4Shell and prevent an attack?
In principle, the prevention and prevention techniques are no different from a response to any zero-day, for that matter. The vulnerability is both complex and trivial to exploit, and therefore, it doesn’t necessarily mean that the vulnerability can be successfully exploited. Some of several pre- and postconditions are met for a successful attack. Some of these pre-conditions, such as the JVM being used, the server/app configuration, version of the library etc., will decide successful exploitation. On 17th Dec, Apache Foundation announced the original fix was incomplete and released the second fix in version 2.17.0.
At the time of writing, this post following is the current list of vulnerabilities and recommended fixes: • CVE-2021–44228 (CVSS score: 10.0) — A remote code execution vulnerability affecting Log4j versions from 2.0-beta9 to 2.14.1 (Fixed in version 2.15.0) • CVE-2021–45046 (CVSS score: 9.0) — An information leak and remote code execution vulnerability affecting Log4j versions from 2.0-beta9 to 2.15.0, excluding 2.12.2 (Fixed in version 2.16.0) • CVE-2021–45105 (CVSS score: 7.5) — A denial-of-service vulnerability affecting Log4j versions from 2.0-beta9 to 2.16.0 (Fixed in version 2.17.0) • CVE-2021–4104 (CVSS score: 8.1) — An untrusted deserialization flaw affecting Log4j version 1.2 (No fix available; Upgrade to version 2.17.0)
The Swiss Government’s CERT provides quite a good visualisation of the attack sequence recommending mitigations for each of the vulnerable points in the sequence.
Where from here?
Keep Calm and Carry On……. We are here for the long haul, and there is no easy fix. You may find that you have fixed one app or server today; something else will pop-up next morning. If you have not done it yet, the best will be to set up a generic incident response playbook for zero-day vulnerabilities. This will help you respond to any such event in the future in a systematic way. The key to success here is keeping an eye on the tools and techniques and their effectiveness to respond to any new zero-day.
As far as Log4Shell is concerned, we are still in the early days and can’t be sure that once patched, and there will never be something else. This is evident from the fact that in the last week or so since the announcement of the original CVE, three more have been attributed to Log4J libraries. As a result, the Apache Foundation has recommended the following mitigations to prevent the exploitation of vulnerable code.
First, upgrade vulnerable versions of Log4j to version 2.17.0 or apply vendor-supplied patches. Although, for some reason, if it is not possible to upgrade, some workarounds can be used. However, there is always some risk of additional vulnerabilities (CVE-2021–45046) that will make the workarounds ineffective. Therefore, it will be best to upgrade to version 2.17. In addition to the above, some of the common mitigations must be considered and applied. • Isolate systems must be restricted into their security zones, i.e. DMZs or VLANs. • All outbound network connections from servers are blocked unless required for their functional role. Even then, restrict outbound network connections to only trusted hosts and network ports. • Depending on your endpoint protection strategy, update any signature or plugin to prevent Log4j exploitation. • Continuous monitoring of networks and servers for any indicators of compromise (IOC). • It has been seen that even after the patch is implemented, vulnerability may persist. Therefore, testing and retesting after the patch is implemented must be ensured as part of the mitigation plan. The time we are living in, such events have become a norm. Vulnerable code, unfortunately, is inevitable, and there will always be someone who would be keen to identify such code to exploit for vested interest. It is only time when we will be hacked, and that one incident may disrupt your business. Therefore, it is paramount to develop and implement business continuity plans that can minimise the impact of such an event. These plans must be updated and tested regularly to ensure changing threat scenarios. Security incident response plans must be practised regularly as a “way of life” and adjusted when a new vulnerability or a threat scenario is identified.
In situations like Log4j zero-day, one can get overwhelmed with the sheer volume of work that we need to do to protect ourselves. As Huntress Labs Senior Security Researcher John Hammond said, “All threat actors need to trigger an attack is one line of text,” but the responders need to spend hours, days and weeks on protecting themselves. In such overwhelming scenarios, I will recommend taking a long breath and keeping calm while doing what we need to do. Reach out to people, and don’t be shy to ask for help if you are stressed. I wish all the best to all of you who will be required to stay back during the holiday to keep your businesses protected.
Cybersecurity and Circular Economy (CE) are not the terms taken together. Cybersecurity is often related to hacking, loss of privacy or phishing, and CE is about climate change and environmental protection. However, cybersecurity can learn quite a few things from CE, and this post will focus on our learnings from CE for cybersecurity sustainability.
In the times we live in, our economy is dependent on taking materials from the natural resources existing on Earth, creating products that we use, misuse, and eventually throw as waste. This linear process creates tons of waste every day presenting sustainability, environmental, and climate change challenges. On the other hand, CE strives to stop this waste & pollution, retrieve & circulate materials, and, more importantly, recharge & regenerate nature. Renewable Energy and materials are key components of CE. It is such a resilient system that detaches economic activity from the consumption of products.
CE is not a new concept but is popularised by a British sailor, Ellen Macarthur. Her charity advises governments and organisations on CE. The following picture is the “butterfly diagram”, which illustrates the continuous flow of materials within the economy independent of the economic activity. As shown in the picture, CE has two main cycles- The technical and the Biological Cycle. In the technical cycle, the materials are repaired, reused, repurposed, and are recycled to ensure that the products are circulating in the economy. However, in the biological cycle, the biodegradable organic materials are returned to the Earth by triggering decomposition, allowing nature to regenerate, continuing the cycle.
As noted above, the lack of CE can be devastating for the planet. Humans are producing a humongous amount of waste loitered around us is unsustainable and devastating for the humans and other inhabitants of Earth. Similarly, with the ever-increasing cost of cyber-attack breaches, businesses are vulnerable to extinction. IBM Security and the Ponemon Institute commissioned Cost of a Data Breach Report 2021. According to this report, the cost of breaches has increased by 10% in the year 2021, which is the largest is the largest single-year on year increase. The business loss represents 38% of the breach costs due to customer turnover, revenue loss, downtime, and increased cost of acquiring new business (diminished reputation).
Sustainability is about using and/or reusing something for an extended period without reducing its capability from short- to long-term perspectives. Cybersecurity is sustainable if the implemented security resources do not degrade or become ineffective over some time to mitigate security threats. Achieving sustainability is not easy and, most certainly, is not cheap. The organisations must take a principle-based approach to cybersecurity. As the manufacturing process within CE where sustainability is considered from the ground up, Security must be part of the design and production phase of the products. The system shall be reliable enough to provide its stated function. For example, a firewall should block any potential attack even after a hardware failure or a hacker taking advantage of a zero-day compromising your environment.
By nature, digital systems produce an enormous amount of data, including security-specific signals. Unfortunately, finding a needle from a haystack is challenging and often overwhelmingly laborious. In CE, we have found ways to segregate different types of waste right at the source, making it easier to collect, recycle and repurpose faster. Similarly, the systems shall be designed to separate relevant security data from other information at the source rather than leave it to the security systems. This segregation at source will help reduce false positives and negatives, providing reliable and accurate information which can be used for protection. The improved data accuracy will also help prioritise response and recovery activities due to a security incident.
CE’s design principles clearly define its two distinct cycles (technical and biological) as mentioned above in the post to deal with biodegradable and non-biodegradable materials. These cycles ensure that the product’s value is maintained, if possible, by repairing, reusing, or recycling the non-biodegradable materials. Similarly, the materials are returned to nature through the processes such as composting. Cybersecurity, despite the conceptual prevalence of “Secure by Design” principles for a long time, the systems, including security products and platforms, often ignore these principles in the name of convenience and ease of use. Any decent security architecture shall ensure that the design process inherently considers threat modelling to assess risks. The implemented systems are modular, retaining their value for as long as value. This will guarantee that the cybersecurity products, platforms and services are producing the desired outcome and are aligned to the organisation’s business requirements. There shall always be an option to repurpose or recycle components to return on security investment.
The technical cycle in CE is resilient to change dynamically. As discussed above, CE is predominantly detached from the economic conditions and shall continue to hold value until the product can no further be repaired, reused or repurposed. If the product or a component can’t be used, its materials can be recycled to produce new products by recovering and preserving their value. Cyber resiliency is not something new but is being contextualised in recent times by redefining its outcome. As we know, cyber threat paradigms are continually changing, and only resilient systems are known to withstand such a dynamic. Resilient cybersecurity can assist in recovering efficiently from known or unknown security breaches. Like CEs technical cycle, achieving an effective resiliency takes a long time. First, baseline cybersecurity controls are implemented and maintained. Similarly, redundancy and resiliency go hand in hand and therefore, redundancy should just be included by design.
I am sure we can learn many more things from CE to set up a sustainable and resilient cybersecurity program that is self-healing and self-organising to ensure that systems can stop security breaches. So I would like to know what else we can learn from CE. .
It was just over thirty years when Tim Berners-Lee’s research at CERN, Switzerland, resulted in World Wide Web, which we also Know as the Internet today. Who would have thought, including Tim, that the Internet will become such a thing as today? This network of networks impacts every aspect of life on Earth and beyond. People are never connected ever before. The Internet has given way for new business models and helped traditional businesses find new and innovative ways to market their products.
Unfortunately, like everything else, we have evil forces on the Internet who are trying to take advantage of the vulnerabilities of the technologies for their vested interests. As first-generation users of the Internet, everything for us was new. Whether it was online entertainment or online shopping, we were the first to use it. We grew up with the Internet. We all had been the victims of the Internet or cybercrimes at some point in our lives. This created a whole new industry now called “cybersecurity”, which is seen as the protectors of cybercrimes. However, it has always been a big challenge to fix who is responsible for the security, business or cybersecurity teams.
What is the need to fix responsibility?
Globalisation and more recently, during the pandemic, has increased the number of people working remotely. It has become an ever-increasing headache for companies. As a result, the number of security incidents has increased manifolds, including the cost per incident. The cost of cyber incidents is increasing year on year basis.
According to IBM’s Cost of a Data Breach 2021 report, the average cost of a security breach costs businesses upward of $4.2 million.
Governments mandate cybersecurity compliance requirements, non-compliance of which attract massive penalties in some jurisdictions. For example, non-compliance with Europe’s General Data Protection Rule (GDPR) may see companies be fined up to €20 million or 4 per cent of their annual global turnover.
Companies that traditionally viewed security as a cost centre are now viewing it differently due to the losses they incur because of the breaches and penalties. We have seen a change in the attitude of these organisations due to the above reasons. Today, companies see security as everyone’s responsibility instead of an IT problem.
Cyber-hygiene: Challenges and repercussions of a bad one.
Cyber hygiene, like personal hygiene, is the set of practices that organisations deploy to ensure the security of the data and networks. Maintaining basic cyber-hygiene is the difference between being breached or quickly recovering from the one without a massive impact on the business.
Cyber hygiene increases the opportunity cost of the attack for the cybercriminals by reducing vulnerabilities in the environment. By practising cyber hygiene, organisations improve their security posture. They can become more efficient to defend themselves against persistent devastating cyberattacks. Good cyber-hygiene is already being incentivised by reducing the likelihood of getting hacked or penalised by fines, legal costs, and reduced customer confidence.
The biggest challenge in implementing a good cyber hygiene practice requires knowing what we need to protect. Having a good asset inventory is a first to start. In a hybrid working environment having clear visibility of your assets is important. You can’t protect something you don’t know. Therefore, it is imperative to know where your information assets are located on your network and who is using them. It is also very important to know where the data is located and who can access it.
Another significant challenge is to maintain discipline and continuity over a long period. Scanning your network occasionally will not help stop unrelenting cyberattacks. Therefore, automated monitoring must be implemented to continuously detect and remediate threats, which requires investment in technical resources that many businesses don’t have.
Due to the above challenges, we often see poor cyber hygiene resulting in security vulnerabilities and potential attack vectors. Following are some of the vulnerabilities due to poor hygiene:
Unclassified Data: Inadequate data classification result in misplaced data and, therefore, stored in places that may not be adequately protected.
Data Loss: Poor and inadequate data classification may result in data loss due to a lack of adequate protection controls. Data may not be recovered because of a data breach, hardware failure, or improper data handling if it is not regularly backed up and tested for corruption.
Software vulnerabilities: All software contains software vulnerabilities. Developers release patches regularly to fix these vulnerabilities. A lack of or poor patch management process will leave software vulnerable, which hackers can potentially exploit to gain access to the network and data.
Poor endpoint protection: According to AV-TEST Institute, they register over 450,000 new malicious applications (malware) and potentially unwanted applications in the wild every day. Due to the inadequate endpoint protection cyber hygiene practices, including malware protection tools, hackers can use a wide range of hacking tools and techniques to get inside your network to breach the company’s environment stealing data.
Inadequate vendor risk management: With ever-increasing supply chain attacks, comprehensive vendor risk management must be implemented considering the potential security risks posed by third-party vendors and service providers, especially those with access to and processing sensitive data. Failure to implement such a process will further expose service disruptions and security breaches.
Poor compliance: Poor cyber hygiene often results in the non-compliance of various legal and regulatory requirements.
Building Accountability within your cybersecurity organisation
With ever-increasing breaches and their impacts, we shall start considering as an industry and society to motivate organisations to make cybersecurity a way of life. Cyber hygiene must be demanded from the organisations that hold, process, and use your data.
Now that we understand the challenges of having good cyber hygiene, we must also understand what we have been doing to solve these issues. So far, we have tried many ways. Some companies have internally developed controls, and others externally mandated rules and regulations. However, we have failed to address the responsibility and accountability issue. We have failed to balance the business requirements and the rigour required for cybersecurity. For example, governments have made laws and regulations with punitive repercussions without considering how a small organisation will be able to implement controls to comply with these laws and regulations.
There are no simple solutions for this complex problem. Having laws and regulations definitely raises the bar for organisations to maintain a good cybersecurity posture, but this will not keep the hackers out forever. Organisations need to be more proactive in introducing more accountability within their security organisation. Cybersecurity professionals need to take responsibility and accountability in preventing and thwarting a cyberattack. At the same time, business leaders need to understand the problem and bring the right people for the job to start with. Develop and implement the right cybersecurity framework which aligns with your business risks. Making cybersecurity one of the strategic pillars of the business strategy will engrain an organisation’s DNA.
There are many ways we can start this journey. To start with, organisations will need glue, a cybersecurity framework. Embracing frameworks like the National Institute of Standards and Technology (NIST) Cyber Security Framework (CSF)
NIST-CSF is a great way to start baselining your cybersecurity functions. It provides a structured roadmap and guidelines to achieve good cyber hygiene. In addition, CSF provides guidance on things like patching, identity & access management, least-privilege principles etc., which can help protect your organisation. If and when you get the basics along with automation, your organisation will have more time to focus on critical functions. In addition, setting up the basic-hygiene processes will improve user experience, predictable network behaviour and therefore fewer service tickets.
Research has shown that the best security outcomes are directly proportional to employee engagement. Organisations may identify “Security Champions” within the business who can evangelise security practices in their respective teams. The security champions can act as a force multiplier while setting up accountabilities. They can act as your change agents by identifying issues quickly and driving the implementation of the solutions.
Conclusion
There is no good time to start. However, the sooner you start addressing and optimising your approach to cyber-hygiene and cybersecurity, the faster you will achieve assurance against cyberattacks. This will bring peace of mind knowing the controls are working and are doing what they are supposed to. You will not be scrambling during a breach to find solutions to the problem but ready to respond to any eventuality.
Besides poor cyber hygiene, if your organisation has managed to avoid any serious breach, it is just a matter of time before your luck will run out.