In a recent roundtable discussion hosted by ICS Village, various industry experts, including Garland Technology CEO Chris Bihary, sat on a panel to discuss the ins and outs of operational technology (OT). They gave insights into how it differs from information technology (IT), the most common threats to OT, and best practices for managing and securing OT networks.
Saying that the world of OT management is complex is a massive understatement. Engineers must handle the stress of maintaining consistent uptime and upholding safety while dealing with emerging threats targeting their vital equipment. Additionally, they often work in plants built before the internet, under pressure to modernize for the sake of the business's bottom line and with minimal resources to do it.
As the stakes are high, here are the key takeaways you can apply to better understand and manage your OT network despite these adversities:
Before diving into OT management best practices, it's essential first to understand how OT and IT differ in terms of purpose, scope, and risk implications.
IT focuses on optimizing the business side of things. It's the software applications and data systems that enable a company to better develop, sell, and distribute their product or service, plus oversee administrative activities like human resources management, accounting, customer service, etc.
In contrast, OT is the technology that controls operational and physical processes. One example is the machinery used to manufacture the product being sold. It could also be a power generator or other equipment producing resources for a city, such as electricity, gas, or clean water.
IT is all about getting the information needed to optimize profits. To that end, an IT manager's focus is ensuring their systems follow the CIA triad of information security. Information systems must only be accessible to those authorized (confidentiality), the information used needs to be complete, accurate, and untampered with (integrity), and users must have their systems ready at all times (availability).
For OT systems, uptime is the top priority. OT and its industrial control systems (ICS) are significant components of our critical infrastructure. So if there's any downtime or slowdown caused by an environmental disaster, malfunction, or security breach, it doesn't just impact the business but the entire population it supports.
Cyber threats that can cause data loss or network shutdowns are the main risks to IT systems. Data centers must also consider temperature control to ensure the servers hosting the data and applications don't overheat.
The risks and consequences in OT are far more severe in a worst-case scenario. If an IT system goes down, people might get mad. If OT assets go down, people could die. There are many more safety considerations for OT because the equipment used can harm the individuals operating or maintaining it.
Since the scope of work in IT management and OT management is so distinct, each requires a unique set of skills and, by extension, a different set of job titles. IT sees roles like IT director and IT, network, or data center engineer.
OT personnel are more blue-collar by nature. They are electrical, chemical, and industrial engineers, as well as frontline operators who must wear steel-toed boots and handle the on-site activity.
What makes OT management tricky, at least compared to IT, is ownership. In IT management, it's much easier to track all the assets attached to a network and decipher who is responsible should anything go wrong. Aside from the network router, which the internet provider oversees, most of the system maintenance and security responsibility is on the organization, so they always have the right people and procedures ready to go.
OT management is far less convenient. A business could own certain machinery and outsource the rest, yet the burden of maintaining security and uptime still falls on that company. It's also common for personnel to be unaware of who has responsibility for which OT equipment. Then, when something goes wrong, you're stuck in the routine of "that's not my responsibility" or "we aren't the vendors for that machine."
Similarly, there's the issue of visibility. Many engineers couldn't even tell you where some of their OT systems and devices are located. Even scarier, they couldn't tell you where the IT systems stop and the OT systems start. For all they know, the two environments are intertwined, connected to the internet, and ready to be compromised by a threat actor.
A thorough assessment is the best place to start for OT asset owners looking to transform their environment. You must identify and document all critical processes, OT assets, and their dependencies within the operation. From there, you can assign ownership to the proper personnel. Robust OT management is a complete program of people, processes, and technology.
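As a rough illustration of that assessment step, the Python sketch below records assets, owners, and dependencies, then reports what is still unowned or undocumented. The `Asset` fields, the function names, and the sample inventory are all hypothetical; a real program would draw this data from site surveys and existing records.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    name: str
    owner: Optional[str] = None      # None means nobody has claimed it yet
    critical: bool = False
    depends_on: List[str] = field(default_factory=list)


def unowned_assets(inventory):
    """Return names of assets with no owner, critical ones first."""
    missing = [a for a in inventory if not a.owner]
    missing.sort(key=lambda a: (not a.critical, a.name))
    return [a.name for a in missing]


def undocumented_dependencies(inventory):
    """Return (asset, dependency) pairs where the dependency is not in the inventory."""
    known = {a.name for a in inventory}
    return [(a.name, d) for a in inventory for d in a.depends_on if d not in known]


# Hypothetical sample data for illustration only.
inventory = [
    Asset("plc-line-1", owner="controls-team", critical=True,
          depends_on=["hmi-1", "switch-a"]),
    Asset("hmi-1", critical=True),
    Asset("chiller-2", depends_on=["plc-line-1"]),
]

print(unowned_assets(inventory))             # ['hmi-1', 'chiller-2']
print(undocumented_dependencies(inventory))  # [('plc-line-1', 'switch-a')]
```

Sorting critical assets to the top reflects the advice to start with the most important processes; the second report catches equipment that other assets rely on but that nobody has documented.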
It doesn't do you any good just to buy a fancy new device and throw it in the production line. Train your engineers accordingly on the technology and ensure standardized procedures are set for maintaining, operating, and securing your OT assets. For security purposes and to preserve uptime, your ultimate goal is segmenting the OT assets from the IT systems and evolving the OT environment into a "turtle-like" state that can lock up when there's a threat and quickly open up when the danger has passed.
As previously mentioned, a breach in an OT network has catastrophic, life-threatening potential. It's not like IT, where financially motivated hackers deploy a smash-and-grab operation, such as ransomware, to make a quick buck. OT threat actors, often adversarial nation-states, are trying to cause significant damage to our critical infrastructure. A successful attack could affect large populations by shutting down the electric grid or poisoning the water supply.
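One simple way to check how well the two environments are separated is to classify observed traffic by zone and flag anything that crosses the boundary. The Python sketch below assumes a made-up address plan (`OT_SUBNETS`, `IT_SUBNETS`) and a hand-written list of flows; in practice the subnets would come from your own network records and the flows from captured traffic.

```python
import ipaddress

# Hypothetical address plan, for illustration only.
OT_SUBNETS = [ipaddress.ip_network("10.20.0.0/16")]
IT_SUBNETS = [ipaddress.ip_network("10.10.0.0/16")]


def zone_of(ip):
    """Classify an IP address as 'OT', 'IT', or 'unknown'."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in OT_SUBNETS):
        return "OT"
    if any(addr in net for net in IT_SUBNETS):
        return "IT"
    return "unknown"


def cross_zone_flows(flows):
    """Return flows whose endpoints are in different zones or an unknown zone."""
    flagged = []
    for src, dst in flows:
        zones = (zone_of(src), zone_of(dst))
        if zones[0] != zones[1] or "unknown" in zones:
            flagged.append((src, dst, zones))
    return flagged


# Hypothetical observed (source, destination) pairs.
flows = [
    ("10.20.1.5", "10.20.1.9"),   # OT to OT: expected
    ("10.10.3.2", "10.20.1.5"),   # IT reaching into OT: flagged
    ("10.20.1.9", "8.8.8.8"),     # OT reaching the internet: flagged
]

for src, dst, zones in cross_zone_flows(flows):
    print(f"{src} -> {dst} crosses {zones[0]} / {zones[1]}")
```

A flagged flow is not necessarily malicious, but each one is a path that would need to be closed, or explicitly justified, before the OT side can "lock up" on its own.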
Security is not something to take lightly in your OT network, and it starts at the physical layer. As our CEO Chris Bihary always says, "The truth is always in the packet." In other words, OT visibility is vital for monitoring purposes. You need sensors on your network, ideally starting with your most critical assets and processes, that can pull and transfer packet data to your security devices for analysis.
Leverage security frameworks like the SANS Five ICS Cybersecurity Critical Controls, Zero Trust, and the NIST Cybersecurity Framework to construct a blueprint for your OT security program. Always remember that one solution will not make you secure. Maintaining a strong posture takes many layered controls of people, standardized processes, and technology.
While you should never mix the two types of assets in the same environment, you should make friends with your colleagues in IT to create harmony between the two sides. Make time to understand their unique objectives and pain points so you can prepare for the worst. Cross-functional activities like incident response planning will involve both the business and operational stakeholders.
Key Definitions
If the inline security tool goes off-line, the TAP will bypass the tool and automatically keep the link flowing. The Bypass TAP does this by sending heartbeat packets to the inline security tool. As long as the inline security tool is on-line, the heartbeat packets will be returned to the TAP, and the link traffic will continue to flow through the inline security tool.
If the heartbeat packets are not returned to the TAP (indicating that the inline security tool has gone off-line), the TAP will automatically 'bypass' the inline security tool and keep the link traffic flowing. The TAP also removes the heartbeat packets before sending the network traffic back onto the critical link.
While the TAP is in bypass mode, it continues to send heartbeat packets out to the inline security tool. Once the tool is back on-line, it begins returning the heartbeat packets to the TAP, indicating that it is ready to go back to work. The TAP then directs the network traffic, along with the heartbeat packets, back through the inline security tool, placing the tool back inline.
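The heartbeat behavior described above can be modeled as a small state machine. The Python sketch below is a toy model, not any vendor's actual firmware: the `BypassTap` class, its method names, and the `miss_threshold` setting (how many consecutive lost heartbeats trigger bypass) are assumptions made for illustration, and the stripping of heartbeat packets from link traffic is left out.

```python
from enum import Enum


class Mode(Enum):
    INLINE = "inline"   # link traffic passes through the security tool
    BYPASS = "bypass"   # link traffic skips the tool; the link stays up


class BypassTap:
    """Toy model of the heartbeat logic described in the text."""

    def __init__(self, miss_threshold=3):
        # miss_threshold is a hypothetical setting: consecutive lost
        # heartbeats tolerated before the TAP switches to bypass.
        self.miss_threshold = miss_threshold
        self.missed = 0
        self.mode = Mode.INLINE

    def on_heartbeat_result(self, returned):
        """Record one heartbeat round trip and update the mode."""
        if returned:
            self.missed = 0
            self.mode = Mode.INLINE       # tool is healthy: place it inline
        else:
            self.missed += 1
            if self.missed >= self.miss_threshold:
                self.mode = Mode.BYPASS   # tool presumed off-line
        return self.mode

    def forward(self, packet, tool):
        """Deliver a link packet, via the tool only while inline."""
        if self.mode is Mode.INLINE:
            return tool(packet)
        return packet


tap = BypassTap(miss_threshold=2)
history = [tap.on_heartbeat_result(r).value
           for r in (True, False, False, False, True)]
print(history)  # ['inline', 'inline', 'bypass', 'bypass', 'inline']
```

Heartbeats keep being evaluated while in bypass, so a single returned heartbeat is enough to put the tool back inline, matching the recovery behavior described above.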
Some of you may have noticed a flaw in the logic behind this solution! You say, “What if the TAP should fail because it is also in-line? Then the link will also fail!” The TAP would now be considered a point of failure. That is a good catch – but in our blog on Bypass vs. Failsafe, I explained that if a TAP were to fail or lose power, it must provide failsafe protection to the link it is attached to. So our network TAP will go into Failsafe mode keeping the link flowing.
Single point of failure: a single component whose failure brings down a larger part of the network, or the entire system.
Heartbeat packet: a soft detection technology that monitors the health of inline appliances.
Critical link: a connection between two or more network devices or appliances whose failure disrupts the network.