Three easy steps towards a zero-trust strategy for OT networks

With the increasing sophistication of network breaches and malware attacks, coupled with more integration between IT and OT networks, there’s a growing consensus that a move towards a zero-trust network architecture has become warranted to mitigate threats, particularly against critical infrastructure.

Nearly every major organization and most sensitive government agencies are looking at adopting a zero-trust strategy within the next few years, at least across some network domains. But zero-trust remains an elusive goal and has proven challenging for existing infrastructure and cyber-physical systems.

It has often been said that companies can’t buy zero-trust off-the-shelf, that it’s not a product or a technology. Think of zero-trust as a security policy strategy that assumes attackers have already compromised some devices, so even internal systems can’t be trusted by default. Networks that used to trust every device on the internal network must shift to blocking all internal communications except those that are minimally required, which can be both risky and complex to implement.

In a traditional network, if malware or ransomware infected a machine, it could likely scan the rest of the network to find other vulnerable applications, services, or ports. Ransomware could also reach back out to external servers to complete the breach, then identify and encrypt the most sensitive assets across the network. Network segmentation, for example following the Purdue Model in OT, could hamper this to some extent, but catastrophic damage was often unavoidable.

On the other hand, a zero-trust architecture can severely restrict the extent to which malware or perimeter network breaches can propagate through the network, and so limit the damage. The challenge for security teams: they have to let devices connect to at least some other devices, otherwise there’s really not a network, and everything breaks. The default has to be to allow trust only when necessary, and subject to certain conditions or context. But how can we determine that efficiently?

Challenges to zero-trust

There is a range of technical approaches to restricting network connectivity and implementing zero-trust policies by default, including more granular segmentation or microsegmentation of the network. These can require endpoint agents or edge firewalls, which makes them almost impossible to deploy in a mission-critical OT process, especially if suddenly blocking an internal connection would disrupt an important process or recovery scenario.

A zero-trust approach also relies on explicit, a priori authorization of every connection, which really isn’t feasible: in complex and mission-critical environments, the legitimate connections are impossible to enumerate completely, so some of them will end up failing.

To make zero-trust work in cyber-physical environments and for industrial processes and devices, we need to incorporate context and intelligence into the policy decision. To make better decisions about connectivity, security teams need to understand what they’re trying to protect, and what the processes and applications need to accomplish.

Defining these zero-trust policies isn’t just about MAC and IP addresses or ports. We want to know the types of devices, their expected behaviors, and what hardware and software they use. It’s about knowing how the entire OT environment behaves. Which machines speak to which other machines? With what protocol? What payload gets exchanged? At what frequency? Can we allow a minimal connection using the protocols we anticipate, to known machines, at specific times in the process based on observed prior behavior, while restricting all other unexpected communications?

We need to make these context-related decisions in real-time, with a deeper understanding of the industrial process, devices involved, unique OT protocols, established communication patterns, and more. This kind of understanding implies an AI/ML-based system that can understand or learn some of these contextual patterns and apply decisions accordingly.
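To make the idea of policies based on observed prior behavior concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any product: the Flow record, the asset names, the learn_baseline and is_expected helpers, and the min_count threshold are all hypothetical, and a real system would also weigh payload, timing, and frequency rather than a simple count.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set


class Flow(NamedTuple):
    """One observed conversation (hypothetical, simplified record)."""
    src: str       # asset that initiated the connection
    dst: str       # asset that received it
    protocol: str  # e.g. "modbus" or "ssh"


def learn_baseline(observed: Iterable[Flow], min_count: int = 3) -> Set[Flow]:
    """Keep only flows seen at least min_count times during a learning window.

    min_count is an assumed threshold so that a one-off connection does not
    become part of the expected behavior.
    """
    counts = Counter(observed)
    return {flow for flow, seen in counts.items() if seen >= min_count}


def is_expected(flow: Flow, baseline: Set[Flow]) -> bool:
    """A flow is expected only if it matches the learned baseline exactly."""
    return flow in baseline


# Learning window: an HMI polls a PLC repeatedly; a laptop connects once.
observed = [Flow("hmi-1", "plc-1", "modbus")] * 5 + [Flow("laptop-7", "plc-1", "ssh")]
baseline = learn_baseline(observed)

print(is_expected(Flow("hmi-1", "plc-1", "modbus"), baseline))  # True
print(is_expected(Flow("laptop-7", "plc-1", "ssh"), baseline))  # False
```

The point of the design is that nobody has to enumerate the allowed connections by hand: the baseline comes from what the process actually does, and anything outside it is treated as unexpected.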

Building on these concepts, to achieve zero-trust policy objectives without affecting industrial processes, we suggest the following three steps:

Leverage knowledge about an asset’s cyber hygiene to allow connectivity: Grant assets access to resources based on their security hygiene. If patches have not been applied, or anti-virus is missing from the endpoint or has outdated signatures, prompt the user to update before moving on. In OT environments, it’s quite common for automation vendors to validate Windows updates before allowing them to be installed on HMIs. A pragmatic zero-trust deployment should take this into account and tolerate a certain gap in patches.
Data diodes can help with key zero-trust enforcement: Use data diodes as a practical way to implement one-way trust and protect the most critical assets. They require sensitive assets to initiate communication, or simply to send data one way towards pre-processors or IT applications. This can limit the exposure of critical assets and data and achieve objectives without disrupting processes.
Start with monitoring and alerts, and limit proactive enforcement: There’s no doubt zero-trust can disrupt established networks and processes. Security teams don’t want to enforce new policies without understanding the effects, and they may not have the time or resources to test policies or take the network offline for exhaustive research. For less disruption, passively observe network traffic, compare it against the defined policies, and alert on or log exceptions for a period of time.
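The first and third steps can be sketched together in a few lines of Python. This is an illustrative sketch only: the Asset fields, the allowed_patch_gap value, the flow tuples, and the evaluate function with its enforce flag are hypothetical names chosen for the example, not part of any specific product or standard.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# A flow is (source asset, destination asset, protocol); hypothetical format.
FlowKey = Tuple[str, str, str]


@dataclass
class Asset:
    name: str
    missing_patches: int         # patches released but not yet installed
    av_signatures_current: bool  # anti-virus present with up-to-date signatures


def hygiene_ok(asset: Asset, allowed_patch_gap: int = 2) -> bool:
    """Tolerate a small patch gap, e.g. while a vendor validates updates."""
    return asset.av_signatures_current and asset.missing_patches <= allowed_patch_gap


def evaluate(asset: Asset, flow: FlowKey, allowed: Set[FlowKey],
             enforce: bool = False) -> Tuple[str, str]:
    """Return (decision, reason).

    With enforce=False (monitor mode) violations only raise an alert, so the
    industrial process keeps running while the policy is being tuned.
    """
    if not hygiene_ok(asset):
        reason = "poor hygiene"
    elif flow not in allowed:
        reason = "unexpected flow"
    else:
        return ("allow", "ok")
    return ("block" if enforce else "alert", reason)


allowed_flows = {("hmi-1", "plc-1", "modbus")}
hmi = Asset("hmi-1", missing_patches=1, av_signatures_current=True)

print(evaluate(hmi, ("hmi-1", "plc-1", "modbus"), allowed_flows))             # ('allow', 'ok')
print(evaluate(hmi, ("hmi-1", "plc-2", "ssh"), allowed_flows))                # ('alert', 'unexpected flow')
print(evaluate(hmi, ("hmi-1", "plc-2", "ssh"), allowed_flows, enforce=True))  # ('block', 'unexpected flow')
```

Running with enforce=False for a period of time shows which legitimate connections a policy would have broken before anything is actually blocked.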

Think of zero-trust as a journey. These implementations will have to go down a path of increasing refinement, definition, and enforcement, especially for OT and industrial process environments. Build connectivity and security policies for these environments with more contextual understanding than is typical for IT applications in virtual data centers. That’s the best way to accomplish the desired security objectives with minimal disruption and reasonable project costs.

Moreno Carullo, co-founder and CTO, Nozomi Networks