Over the past decade, IT Service Management (ITSM), leveraging COBIT, ITIL, and other standards, has become a proven, rigorous process in most large enterprises. Its basic components are designing, planning, operating, and controlling the IT services provided to users. In practice, this covers the way hardware is configured, patched, managed, and deployed; the way software is developed and deployed; the way teams respond to incidents; and so on. ITSM, together with ITIL, COBIT, and other standards, has been a foundational element in improving IT cyber security posture. In fact, according to the CyberSeek database of the National Initiative for Cybersecurity Education (NICE), 50%+ of the job openings in cyber security are related to operations & maintenance and provisioning – jobs often contained within ITSM functions.
Unfortunately, in most organizations Operational Technology (OT) assets (HMIs, servers, PLCs, relays, RTACs, and other intelligent electronic devices) are excluded from ITSM processes. For a variety of reasons, from organizational boundaries to IT personnel's lack of skills on OT systems to regulatory requirements, ITSM practices do not extend to these types of systems. Further, OT staff are already under headcount pressure as operations look to increase efficiency.
We believe that organizations should embrace the concept of OT Systems Management (OTSM), paralleling their ITSM practices but tailored to the unique environments of operational systems. Achieving a mature level of OTSM is critical not only for improving overall ROI from increasingly connected industrial systems, but also for ensuring the foundational elements of OT cyber security necessary to protect critical infrastructure from targeted and untargeted attacks.
Systems & Security Management is critical for cyber security and reliability
Rigorous systems management is a foundational element of secure and reliable systems. With almost every report of a major cyber incident, the analysis calls out the importance of keeping patches up to date, maintaining secure configurations, limiting access and privileges, updating anti-virus signatures, etc. None of these tasks grabs headlines like the advanced threat hunters and analysts who dig deep into systems to identify how the hackers got in or how they exfiltrated data. However, they are foundational elements without which our cyber security would be much less effective.
The National Initiative for Cybersecurity Education (NICE), a NIST initiative focused on cybersecurity workforce development, breaks US cybersecurity job openings into 7 types: Operations & Maintenance, Provisioning, Protect & Defend, Analyze, Oversee & Govern, Collect & Operate, and Investigate. Of the 350,000+ cybersecurity job openings in the US as of December 2018, 50% are in the first two categories, which largely consist of roles closely aligned with ITSM. Another 16% are in Protect & Defend, which includes management of infrastructure hardware & software as well as vulnerability management, both closely aligned with key ITSM categories.
These workers and the processes that they manage are the backbone of cybersecurity. They ensure that systems are provisioned securely when moved into production. They monitor for changes to configurations that do not align with secure baselines. They ensure passwords meet organization standards. They monitor and deploy the software patches necessary to maintain the security of systems in the field. This is not intended to understate the importance of the other roles, such as analyzing and investigating. But we often overlook these fundamental practices: reducing the attack surface, keeping up good ‘cyber hygiene’, and executing the most important asset-level protective functions.
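As one illustration of the configuration monitoring described above, the following minimal Python sketch compares the settings observed on an asset against a secure baseline and reports any drift. The setting names, their values, and the `find_drift` helper are hypothetical and chosen only for illustration; in practice both the baseline and the observed settings would come from an organization's own inventory or configuration management tooling.

```python
from typing import Any

# Hypothetical secure baseline for one asset class (names and values are illustrative).
SECURE_BASELINE: dict[str, Any] = {
    "min_password_length": 12,
    "remote_desktop_enabled": False,
    "antivirus_signatures_current": True,
}


def find_drift(
    baseline: dict[str, Any], observed: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {setting: (expected, actual)} for each setting that differs from the baseline.

    The comparison is an exact match. A setting that is missing from the
    observed data is reported with an actual value of None.
    """
    drift = {}
    for setting, expected in baseline.items():
        actual = observed.get(setting)
        if actual != expected:
            drift[setting] = (expected, actual)
    return drift


if __name__ == "__main__":
    # Hypothetical settings collected from an HMI workstation.
    observed_hmi = {
        "min_password_length": 8,
        "remote_desktop_enabled": False,
    }
    for setting, (expected, actual) in find_drift(SECURE_BASELINE, observed_hmi).items():
        print(f"{setting}: expected {expected!r}, found {actual!r}")
```

The sketch deliberately uses an exact-match comparison to stay short. A real check would likely need per-setting rules (for example, treating a longer minimum password length as acceptable), which is the kind of customization the OT environment tends to require.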
ITSM often does not extend to OT
In most organizations the procedures, policies, and service agreements used to manage IT systems do not extend to the Operational Technology environment. There are many reasons for this, as stated briefly above. The result is that the functions normally associated with ITSM, such as asset inventory, provisioning management, patch management, configuration management, disaster recovery, and incident response, are in most cases either not managed at all or applied at a local or business unit level without the same rigor, process, and consistency as in the IT realm.
To be clear, this is not a blanket statement. In some organizations we have seen IT absorb the OT function and employ similar systems management across both environments with the necessary customization to the OT requirements. In others, we have seen robust OT Systems Management, often as a result of regulatory compliance requirements such as medium and high impact assets within the NERC world. But, on the whole, we have found an ad hoc approach to OTSM, if any approach at all.
When we do see an ad hoc program, we often discover that responsibility for it falls to an instrument & controls technician who has tuned the DCS in the past, a “plant IT” representative, or a chemical engineer who runs the manufacturing system. In most cases, these individuals were not trained in systems management, or perhaps even on the IT equipment they are using. Similarly, most follow processes developed solely by operations engineering or locally for an individual plant, hospital, or facility. And in most cases they do not leverage the same toolkits as their IT counterparts, because IT tools are difficult or risky to deploy and access within the OT environment.
Now these same individuals are being asked to pick up new tasks, often on top of their day jobs: build an inventory and keep it up to date, patch systems on a regular basis, ensure password policies are enforced, ensure that firewall rules are properly configured, and so on, all without taking the plant offline in the process. Again, this is not true in every organization, but in our experience it is certainly the norm, not the exception.
Launching a new discipline called “OTSM”
We believe that there is a need for a new discipline of OTSM. Even if an organization integrates IT & OT, it will need to customize the policies, processes, tools, and likely the team responsible for OT so that they remain sensitive to the OT environment.
So how does an organization go about setting up a robust program? In our experience we have discovered that developing a mature OTSM process involves 4 key elements:
- Establish policies and procedures that match the specific OT environment of that organization. The great news is that most organizations have a base of IT policy and procedure templates to draw from, such as SANS, NIST, etc. The key is taking those and building the specific elements necessary for the unique OT environments. For instance, in a pharmaceutical company the patch management policy for a production line may differ significantly from that of the R&D lab where product is tested in small batches. Similarly, procedures for configuration changes need to reflect the different regulatory structures within each industry and geography. Additionally, DCS and SCADA deployments need separate consideration: the geographic proximity between the team/tools and the assets in scope makes for very different dynamics in the execution of OTSM functions.
- Talent/workforce development. In many cases, the personnel responsible for OTSM will be techs and local IT staff. In IT most systems management functions can be centralized and done remotely – and with the growth in the cloud this becomes even more true. However, in many OT environments, carrying out systems management will require local resources (or at least local representation/oversight) – from patching to changing configuration settings to incident response. The downside risk of a patch deployment taking a machine, and therefore the plant process, offline is often too great to do remotely. Similarly, a false alarm in a manufacturing facility is significant, and in most cases incident response will require a local or at least OT-trained staff member to evaluate potential risk and remediation steps.
As a result, workforce development around key OTSM concepts such as patching, configuration management, password management, etc. will be necessary. We certainly applaud the significant training available around cybersecurity analysis, investigation, threat hunting, etc., but at least that amount of focus needs to be placed on the “other 50%” of cybersecurity – i.e. the foundational elements of Systems Management.
- Relevant Tools & Automation. Plant operations and IT budgets are not increasing. Operations leadership is focused on using technology to reduce O&M budgets. OTSM must therefore deliver automation and simplification of tasks rather than increase the burden. We believe it can. By using ITSM, IT leadership has significantly reduced the cost of IT management. That same effect is available in OT. Based on our experience, well-managed operations and maintenance functions not only provide a proper foundation for cybersecurity, but also improve plant uptime, reliability, and throughput by identifying potential operational issues earlier and accelerating response to incidents as they occur.
- OTSM requires a significant change effort. Traditionally, industrial control systems have been seen as long-term capital investments that will last 15-20 years between major upgrades. OTSM requires regular management: updating, configuration management, access management, vulnerability management, etc. In many cases, this will require changes to the mindsets and behaviors of team members as well as the more functional training and procedural requirements. Senior leadership is key to making this change effective within already stretched operational organizations.
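To make the first two elements above more concrete, here is a minimal Python sketch of how environment-specific patch rules might be written down and checked before a deployment. Everything in it is an assumption made for illustration: the `PatchPolicy` fields, the two environment names, the policy values, and the `can_patch_now` helper are hypothetical and would be replaced by each organization's own procedures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchPolicy:
    """Hypothetical patch rules for one OT environment."""

    remote_deployment_allowed: bool  # may a patch be pushed with nobody on site?
    requires_outage_window: bool     # must patching wait for a planned outage?


# Illustrative values only; real policies come from the organization's procedures.
POLICIES = {
    "production_line": PatchPolicy(remote_deployment_allowed=False, requires_outage_window=True),
    "rd_lab": PatchPolicy(remote_deployment_allowed=True, requires_outage_window=False),
}


def can_patch_now(environment: str, technician_on_site: bool, in_outage_window: bool) -> bool:
    """Return True if a patch may be deployed under the environment's policy."""
    policy = POLICIES[environment]
    # Environments that forbid remote deployment need a local technician present.
    if not policy.remote_deployment_allowed and not technician_on_site:
        return False
    # Environments that require an outage window cannot be patched outside one.
    if policy.requires_outage_window and not in_outage_window:
        return False
    return True


if __name__ == "__main__":
    print(can_patch_now("production_line", technician_on_site=False, in_outage_window=True))
    print(can_patch_now("rd_lab", technician_on_site=False, in_outage_window=False))
```

Keeping the rules as data rather than burying them in scripts is one possible design choice: the same check can then serve a production line and an R&D lab while the policies themselves stay under local control.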
We believe that success in OT cybersecurity and reliability requires a new foundation in OTSM. Systems management is a critical element in ensuring that these connected systems are protected and managed appropriately. Based on our 25 years of experience in managing these industrial control systems, success will require both a change in mindset and skills and a new set of tools that allow for increased automation in a way tailored to the unique features of the OT environment.