Patch management is a robust procedure in IT security. Proper vulnerability management using regular scans and updating software is a core element of every effective IT security program. The approach is well-documented and, while many systems still remain unpatched due to procedural or technical issues, if done correctly, patching eliminates many cyber risks in IT.
OT/ICS Patch Management
In Operational Technology (OT) or Industrial Control Systems (ICS), however, this standard IT approach often breaks down for several reasons. There are operational risks in scanning sensitive OT devices, and patching regularly is difficult. Moreover, these systems were not designed for security, so many other risks remain even when they are patched to the latest level.
As a result, we often hear the opinion that vulnerability management and patching are not relevant in OT. This argument leads to the belief that detecting anomalous behavior patterns is the only effective defense of control systems.
Our view is that an end-to-end patch management program is an effective layer of security in ICS, especially if operators take a “360-degree risk management” approach, rather than simply monitoring for vulnerabilities and patches. This adds significant protection from untargeted and targeted threats (e.g., ubiquitous malware vs. organizationally tailored ransomware).
The value of patch management in OT/ICS environments
Although vulnerability management and patching have their challenges, addressing critical security vulnerabilities, especially in OS-based devices within ICS networks, is an essential element of robust cyber security.
In the past several years, ransomware reaching industrial processes has cost companies such as Merck, Maersk, and Mondelez hundreds of millions of dollars each. In each case, the ransomware spread due to unpatched systems in both the IT and OT environments.
Even today, when conducting assessments of ICS networks, we regularly find dozens of devices with unpatched critical vulnerabilities that could allow ransomware or other debilitating malware to execute.
It is true that many OT systems are “insecure by design” and would be at risk from a targeted attacker even after vulnerability patching. However, many “commodity” attacks do not take advantage of these insecure system settings, but instead take advantage of known vulnerabilities to cause disruption.
99% of OT/ICS attacks come via Windows and other traditional OS devices residing on the OT network. Unpatched vulnerabilities on these systems give targeted attackers easy access.
However, patching is often about more than fixing a vulnerability or pure cyber security. In fact, a patch can fall into categories such as the following:
- Security-specific (e.g., fixes a vulnerability)
- Enable security or security-features (e.g., adds MFA or encrypted authentication)
- Functional fix (e.g., stability, or feature update)
- Other (e.g., a change that matters only in a rare edge case)
In the above figure, we explore an example concept of an approved update or one from an OEM. This patching process is just one piece of the puzzle, but it lays out the fact that patching must consider the assets it affects and their exposure, the nature of the patch itself, and the value it provides BEFORE it goes to deployment.
If the patch provides high value, it should be applied. If the patch application process is complicated, you defer and/or apply compensating controls. After all, patching everywhere isn’t always feasible (and a patch might not even exist to fix an issue); the goal is to reduce cyber risk to tolerable levels.
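The decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `high_value`, `complicated_to_apply`, and `asset_exposed` flags, and the rule that exposure decides whether a deferral needs compensating controls are all hypothetical, and each flag would in practice come from the team's own review of the patch and the asset.

```python
from dataclasses import dataclass
from enum import Enum


class PatchCategory(Enum):
    SECURITY_FIX = "security-specific"
    SECURITY_FEATURE = "enables security features"
    FUNCTIONAL_FIX = "functional fix"
    OTHER = "other"


class Decision(Enum):
    APPLY = "apply"
    DEFER_WITH_CONTROLS = "defer and apply compensating controls"
    DEFER = "defer"


@dataclass
class PatchCandidate:
    name: str
    category: PatchCategory     # recorded for reporting; not used in the decision here
    high_value: bool            # hypothetical: outcome of the team's value assessment
    complicated_to_apply: bool  # hypothetical: e.g. needs an outage or vendor testing
    asset_exposed: bool         # hypothetical: asset is reachable by likely threats


def triage(patch: PatchCandidate) -> Decision:
    """Decide what to do with a patch BEFORE it goes to deployment."""
    if not patch.high_value:
        return Decision.DEFER
    if not patch.complicated_to_apply:
        return Decision.APPLY
    # High value but hard to deploy: defer, and (assumption) add
    # compensating controls when the asset is exposed in the meantime.
    if patch.asset_exposed:
        return Decision.DEFER_WITH_CONTROLS
    return Decision.DEFER


candidate = PatchCandidate(
    name="OS security update",
    category=PatchCategory.SECURITY_FIX,
    high_value=True,
    complicated_to_apply=True,
    asset_exposed=True,
)
print(triage(candidate).value)  # defer and apply compensating controls
```

The point of the sketch is the ordering of the questions: value first, deployment effort second, exposure last.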
Patch management is not a silver bullet solution
Even more so than in IT, patch management is not a panacea. As we all understand, most ICS systems are not designed to be secure, and “forever days” will continue to be present due to inherited legacies. These systems lack proper secure configuration settings, often enable many risky ports and services, do not enforce complex passwords, etc. Patching alone will not solve these insecurities, and a patch or “fix” may not even be security-related in nature, so evaluating it requires insight.
For this reason, we strongly recommend a 360-degree risk management approach. What is 360-degree risk management? It is the process of MANAGING, not just assessing, the comprehensive risk of an asset based on an aggregate view of the system, its configuration, and its position in the overall process design.
A comprehensive view includes understanding four equally important dimensions:
- The nature of the patch and whether it has value to the asset or organization
- The potential vulnerability or risk to that asset
- The criticality of that asset to the overall process
- And the security or value provided by the patch or change once applied
Explicitly separate these elements so you are clear about whether a patch, update, or vulnerability fix has value, whether an asset is at risk, and whether the process is at risk from an attack on that asset. Determine if the security (either through a vulnerability fix or the enablement of security features) or the reliability/stability of an asset is improved. If you can quantify risk in OT cyber security, the effort expended to actively reduce your organization’s risk through patching alone is self-explanatory.
The risk or vulnerability of an asset is much more than its known CVEs. It also includes additional elements of the overall system’s security level that are often outside the scope of a mere patch. For example:
- Vulnerabilities: Tracking comprehensive IT and OT device vulnerabilities and missing patches, with severity ratings, should be standard practice.
- Insecure Configuration settings: In many OT environments, standard secure configuration settings are missing, allowing weak passwords, open and unnecessary ports and services, etc., offering potential access even if a system is completely patched.
- Users and accounts: Even if an OT system is patched, if dormant users or unmanaged accounts are present, attackers could use them to gain access and privilege to take actions that can harm the system.
- Anti-malware status: In many OT environments, protective measures such as anti-virus are out of date because updates require proper testing and can impact uptime. In other cases, application whitelisting is not employed or, if it is, it is not in lockdown mode. These compensating controls offer a protective measure until a device is patched appropriately.
- Network connectivity and protection: Similarly, network protections offer an additional layer of protection. However, we often find that where they exist, the rules and configurations are not as secure as they could be and offer much less protection than advertised. Regardless, if an asset is on the Internet or on a system with high levels of user activity, then the risk of compromise and exploitation is much greater.
- Backup status: This is the last line of defense in case of attack. In many cases, operators do not test backups or gather regular updates on their success and recency.
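As a rough illustration of why a fully patched asset can still carry risk, the six elements above can be held in one record per asset and checked together. The field names, the seven-day backup threshold, and the rule for flagging anti-malware are assumptions made for this sketch, not a prescribed data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AssetRiskProfile:
    asset: str
    critical_vulns: int                    # unpatched vulnerabilities rated critical
    insecure_config_findings: int          # weak passwords, unnecessary ports/services
    dormant_accounts: int                  # dormant or unmanaged accounts
    antimalware_current: bool              # anti-virus signatures up to date
    whitelisting_lockdown: bool            # application whitelisting in lockdown mode
    internet_exposed: bool                 # reachable from the Internet
    days_since_good_backup: Optional[int]  # None means no verified backup


def open_findings(p: AssetRiskProfile, max_backup_age_days: int = 7) -> List[str]:
    """Return the risk dimensions that still need attention for one asset."""
    findings = []
    if p.critical_vulns > 0:
        findings.append("vulnerabilities")
    if p.insecure_config_findings > 0:
        findings.append("configuration")
    if p.dormant_accounts > 0:
        findings.append("users and accounts")
    # Assumption: flag the asset if either protective measure is missing.
    if not p.antimalware_current or not p.whitelisting_lockdown:
        findings.append("anti-malware")
    if p.internet_exposed:
        findings.append("network")
    if p.days_since_good_backup is None or p.days_since_good_backup > max_backup_age_days:
        findings.append("backup")
    return findings


# A hypothetical HMI that is patched to the latest level.
hmi = AssetRiskProfile(
    asset="HMI-01",
    critical_vulns=0,
    insecure_config_findings=3,
    dormant_accounts=2,
    antimalware_current=True,
    whitelisting_lockdown=True,
    internet_exposed=False,
    days_since_good_backup=None,
)
print(open_findings(hmi))  # ['configuration', 'users and accounts', 'backup']
```

In this example the device has no missing patches, yet three of the six dimensions still show open findings.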
This information needs to be overlaid on the asset criticality to the process. There are several ways of generating this, from a complete cyber-PHA analysis to using data from prior business disruption analysis, to an automated approach based on the data included in Verve from the endpoint and network.
Examples of information that is typically used:
- Device functions (e.g., domain server, HMI, core switch, etc.)
- Device connectivity or services provided to other critical devices or systems (e.g., dependents)
- User defined criticality data: Organizations that conduct PHA, disaster planning, etc. use these scores for even greater context to the prioritization of different security controls for high criticality vs. lower criticality assets.
Together, these data points provide a rich view of the risk to the process and allow for a much better prioritization of risk remediation. As described, patch management is not a one-size-fits-all solution.
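One simple way to picture this overlay is to multiply an asset's open risk findings by a criticality score built from its function, its dependents, and any user-defined score. The weights, the additive formula, and the sample assets below are hypothetical choices for illustration; a real program would derive them from its own PHA or business disruption analysis.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights for the device functions named above; anything else gets 1.
FUNCTION_WEIGHT = {"domain server": 5, "core switch": 5, "HMI": 4}


@dataclass
class Asset:
    name: str
    function: str
    dependents: int            # critical devices or systems that rely on this asset
    open_findings: int         # count of open risk findings for this asset
    user_criticality: int = 0  # optional score from PHA or disaster planning


def criticality(a: Asset) -> int:
    return FUNCTION_WEIGHT.get(a.function, 1) + a.dependents + a.user_criticality


def priority(a: Asset) -> int:
    # Risk to the process = how exposed the asset is x how much the process needs it.
    return a.open_findings * criticality(a)


def rank(assets: List[Asset]) -> List[Asset]:
    return sorted(assets, key=priority, reverse=True)


assets = [
    Asset("PRN-02", "printer", dependents=0, open_findings=9),
    Asset("HMI-03", "HMI", dependents=1, open_findings=5),
    Asset("DC-01", "domain server", dependents=12, open_findings=2),
]
for a in rank(assets):
    print(a.name, priority(a))
# DC-01 34
# HMI-03 25
# PRN-02 9
```

The printer has the most findings but ranks last, while the domain server with only two findings ranks first because so much depends on it.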
There are multiple angles from which to look at the risk of an endpoint. By looking across these angles, organizations can prioritize the remediations with the greatest security impact and the least operational disruption or risk, and can also decide how and when those remediations should be applied.
Remediation, including patching, needs greater OT-specific automation
MANAGEMENT is more than assessment. Effective protection requires the ability to take actions on endpoints to secure them once the risk is identified. In OT environments, however, taking these actions often involves lengthy manual processes because much of the automation of these “IT Systems Management” actions does not exist in OT.
OT-specific remediation automation requires several elements:
- Operator/site controlled: One of the biggest risks is when administrators make changes (deploy patches, remove software or users, make configuration changes, etc.) without proper knowledge and testing by operators. We have seen organizations trip plants with mis-applied patches. Therefore, the automated actions need to be in the control of the operator/site with knowledge of the process. This is what we call “Act Local”.
- Centrally analyzed: Prioritization of risk and remediation strategies requires centralization for cost and effectiveness. It just is not possible to get consistent security with each plant or operation making its own security trade-offs. A centralized approach enables a scaled team to assess risks, define compensating controls, build playbooks for remediation, etc. This is what we call “Think Global”.
- Comprehensive execution: OT environments require simplification of endpoint management tools. The automation cannot occur through multiple different tools each with its own user interface. The more streamlined the process, the more effective and efficient the remediation.
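The split between central analysis and operator-controlled execution can be sketched as a queue of recommended actions that nothing executes until someone at the site approves it. The class and function names here are hypothetical, and the "execution" is only a flag standing in for a real deployment step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RemediationAction:
    asset: str
    description: str
    approved_by: str = ""  # site operator who approved; empty until approved
    executed: bool = False


@dataclass
class SiteQueue:
    site: str
    actions: List[RemediationAction] = field(default_factory=list)

    def approve(self, index: int, operator: str) -> None:
        """Act Local: only someone at the site releases an action."""
        self.actions[index].approved_by = operator

    def execute_approved(self) -> List[str]:
        """Run approved, not-yet-executed actions; return the assets touched."""
        done = []
        for action in self.actions:
            if action.approved_by and not action.executed:
                action.executed = True  # placeholder for the real deployment step
                done.append(action.asset)
        return done


def publish(playbook: List[Tuple[str, str]], queue: SiteQueue) -> None:
    """Think Global: the central team recommends actions; nothing runs yet."""
    for asset, description in playbook:
        queue.actions.append(RemediationAction(asset, description))


plant = SiteQueue("Plant A")
publish(
    [("HMI-01", "apply OS security update"), ("EWS-02", "remove dormant accounts")],
    plant,
)
print(plant.execute_approved())  # [] because nothing has been approved yet
plant.approve(0, operator="shift lead")
print(plant.execute_approved())  # ['HMI-01']
```

Keeping the approval step on the site side is the design choice that matters: the central team can build and distribute playbooks at scale, while the people who know the process decide when a change is safe to make.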
Organizations can save $600,000 - $1,000,000 per year in operating costs by deploying an OT-specific, automated remediation approach with selective patching on specific assets. This reduces expended efforts and allows time to focus on what matters.
Needless to say, an organization still needs processes to understand a patch, know where to apply it, and abide by change controls, but patching as a whole is a necessary activity that maintains or improves security. It is a People-Process-Technology concept from start to finish when you own assets, regardless of whether they are consumer, industrial, or critical infrastructure.
A 360-degree risk management approach enables robust protection for ICS/OT systems with great efficiency. This is due to targeting and automation of operator actions, and a layered selection of capabilities that promote risk reduction.