The antidote to operational technology conservatism

The specifics of protecting and updating the OT infrastructure.

How to update and upgrade OT infrastructure so that it doesn’t lead to disruptions in the production process.

I’ve been saying it often – for years: antivirus is dead.

Such a statement might at first seem strange – especially from someone who’s been a mover and shaker since the very earliest days of all things viruses and anti-virus in the late eighties and early nineties. However, if you dig a little deeper into the AV (RIP) topic and consult some authoritative sources in the (former) field, then the statement quickly becomes quite logical: first, “antivirus” has turned into protective solutions “against everything”; second, viruses – as a particular species of malicious program – have died out. Almost. And it’s that seemingly harmless, negligible almost I just wrote there that causes problems for cybersecurity still to this day – at the back end of the year 2022! And that almost is the basis of this here blogpost today…

So. Viruses. Those Red-Listed last remaining few – where are they these days, and what are they up to?…

It turns out they tend to reside in… one of the most conservative sub-fields of industrial automation: that of operational technology (that’s OT – not to be confused with IT). OT is “hardware and software that detects or causes a change through the direct monitoring and/or control of industrial equipment, assets, processes and events” (– Wikipedia). Basically, OT relates to an industrial control systems (ICS) environment – sometimes referred to as “IT in the non-carpeted areas”. OT = specialized control systems in factories, power plants, transportation systems, the utilities sector, and the extraction, processing and other heavy industries. Yes – infrastructure; yes – often critical infrastructure. And yes again – it’s in this industrial/critical infrastructure where “dead” computer viruses are found today alive and kicking: around 3% of cyber incidents involving OT-computers these days are caused by this type of malware.

How so?

Actually, the answer was given above: OT – rather, its application in industry – is very conservative. If there were ever a field that firmly believes in the old axiom “if it ain’t broke, don’t fix it!”, it’s the field of OT. The main thing in OT is stability, not the latest bells and whistles. New versions, upgrades… even just updates (e.g., to software) are all looked upon with skepticism, if not scorn – if not fear! Indeed, operational technology in industrial control systems commonly features creaking old computers running… Windows 2000 (!) plus assorted other antique software full of vulnerabilities (there are also gigantic holes in security policies, and a whole load of other terrible nightmares for the IT security guy). But back to the “non-carpeted areas” imagery real quick: the IT kit in the carpeted areas (say, in the office – not the manufacturing shop floor or auxiliary/technical facilities) has long been inoculated against all viruses, since it’s updated, upgraded and overhauled in good time, while being fully protected by modern cybersecurity solutions. Meanwhile, in the non-carpeted areas (OT), everything’s the exact opposite; hence, viruses survive – and prosper.

Take a look at the Top-10 most widespread “old-school” malicious programs to be found in ICS computers in 2022:

[Chart: Top-10 most widespread “old-school” malicious programs on ICS computers, 2022]

Sality! Virut! Nimnul!

So what does that graph tell us?…

Actually, first, let me tell you that the percentages shown above relate to a “sleeping” phase for these old-school viruses. But from time to time such viruses might escape the bounds of a single infected system and spread across the whole network – leading to a serious local epidemic. And instead of full-fledged treatment, good old backups are usually resorted to – and they might not always be “clean”. Moreover, the infection can affect not only ICS computers, but also programmable logic controllers (PLCs). For example, long before the appearance of PLC-Blaster (a proof-of-concept worm able to infect PLCs) the Sality loader was already present on PLCs; well, almost: not in the firmware, but in the form of a script in the HTML files of the web interface.

So yes, Sality can make a real mess of automated production processes – but that’s not all. It can mess up memory through a malicious driver, and also infect applications’ files and memory – potentially leading to complete failure of an industrial control system within days. And in case of an active infection, the whole network can be brought down – as Sality has been using peer-to-peer communication for updating the list of active control centers since 2008. The manufacturers of ICS would hardly have written their code with such an aggressive working environment in mind.

Second, 0.14% in a month – doesn’t sound like much, but… it represents thousands of instances of critical infrastructure around the world. Such a shame when you think how such risk could be excluded fully, simply, and with the most fundamental of methods.

And third, given that factories’ cybersecurity is so sieve-like, it’s no wonder we often hear news about successful attacks on those factories by other types of malware – in particular ransomware (example: Snake vs Honda).

It’s clear why OT folks are conservative: the main thing for them is that the industrial processes they oversee stay uninterrupted, and bringing in new tech/updating/upgrading can bring interruptions. But what about the interruptions caused by old-school virus attacks permitted by staying behind the times? Indeed, and that’s the dilemma OT folks face – and usually they settle for staying behind the times, and thus we get the figures shown in the graph.

But guess what? That dilemma can be a thing of the past with our “pill”…

Ideally, there needs to be an ability to innovate, update, upgrade OT kit with no risk to the continuity of industrial processes. And last year we patented a system that ensures just that…

Briefly, it goes like this: before introducing something new into the processes that MUST keep running, you test it out on a mock-up of the real thing – a special stand that emulates the critical industrial functions.

The stand is made up of a configuration of the given OT network, which includes the same types of devices used in the industrial process (computers, PLCs, sensors, comms equipment, assorted IoT kit) and has them interact with one another to replicate the manufacturing or other industrial process. A sample of the software under test is fed into the stand’s input terminal, where a sandbox observes it: the sandbox records all the software’s actions, the network nodes’ responses, changes in their performance, the accessibility of connections, and many other atomic characteristics. The data gleaned like this permits building a model that describes the risks of the new software, in turn permitting informed decisions to be made as to whether or not to introduce it, and also what needs to be done to the OT to close the uncovered vulnerabilities.
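To make the idea a bit more tangible, here’s a tiny Python sketch of the “compare before and after” step. To be clear: this is not our patented system, just a toy illustration. The names (NodeObservation, risk_score, assess), the three metrics, the 50/50 weights, the 0.3 threshold and all the sample numbers are made up for the example; a real stand records far more characteristics and builds a far richer model.

```python
from dataclasses import dataclass


@dataclass
class NodeObservation:
    """Hypothetical metrics recorded for one node of the stand during a run."""
    response_ms: float  # average response time of the node
    cpu_load: float     # CPU load as a fraction, 0.0 to 1.0
    reachable: bool     # whether connections to the node still succeed


def risk_score(baseline: NodeObservation, trial: NodeObservation) -> float:
    """Turn the degradation between two runs into a score from 0.0 to 1.0.

    The equal weights are an arbitrary assumption made for this sketch.
    """
    if baseline.response_ms <= 0:
        raise ValueError("baseline response time must be positive")
    if not trial.reachable:
        return 1.0  # a node that drops off the network is the worst case
    slowdown = max(0.0, trial.response_ms / baseline.response_ms - 1.0)
    load_rise = max(0.0, trial.cpu_load - baseline.cpu_load)
    return min(1.0, 0.5 * slowdown + 0.5 * load_rise)


def assess(baseline, trial, threshold=0.3):
    """Score every node and list those at or above the (assumed) threshold."""
    scores = {name: risk_score(baseline[name], trial[name]) for name in baseline}
    risky = sorted(name for name, score in scores.items() if score >= threshold)
    return scores, risky


# Invented sample data: one run without and one run with the new software.
BASELINE = {
    "plc-1": NodeObservation(10.0, 0.20, True),
    "hmi-1": NodeObservation(50.0, 0.40, True),
    "sensor-gw": NodeObservation(5.0, 0.10, True),
}
TRIAL = {
    "plc-1": NodeObservation(12.0, 0.25, True),
    "hmi-1": NodeObservation(100.0, 0.60, True),
    "sensor-gw": NodeObservation(5.0, 0.10, False),
}

if __name__ == "__main__":
    scores, risky = assess(BASELINE, TRIAL)
    for name, score in scores.items():
        print(f"{name}: {score:.3f}")
    print("Needs a closer look before rollout:", risky)
```

With this invented data, the PLC degrades only slightly and stays under the threshold, while the HMI (twice as slow) and the sensor gateway (no longer reachable) get flagged. That’s the kind of signal that feeds the go/no-go decision described above.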

But wait – it gets more interesting…

You can test literally anything in the input terminal – not just new software and updates to be deployed. For example, you can test for resilience against malicious programs that get around external means of protection and penetrate a protected industrial network.

Such technology has plenty of potential in the field of insurance. Insurance companies will be able to judge cyber-risks better for more accurate calculations of insurance premiums, while the insured won’t be overpaying for no good reason. Also, manufacturers of industrial equipment will be able to use stand-testing for certification of software and hardware of third-party developers. Developing this concept further, such a scheme would also suit industry-specific accreditation centers. Then there’s the research potential in educational institutions!…

But for now, let’s return to our factory stand…

It should go without saying that no emulation can reproduce with 100% accuracy the full variety of processes in OT networks. However, thanks to the model we’ve built up from our vast experience, we kinda already know where the “surprises” can be expected after introducing new software. Moreover, we can reliably control the situation with other methods – for example with our anomaly early-warning system, MLAD (about which I wrote in detail here), which can pinpoint issues in particular sections of an industrial operation based on direct or even indirect correlations. Thus, millions, if not billions of dollars in losses from incidents can be avoided.

So what’s stopping OT folks racing to adopt this stand-model of ours?

Well, perhaps, so far – since they’re so conservative – they’re not actively looking for a solution like ours as they might not consider one necessary (!). We’ll do our best to promote our tech to save the industry millions, of course, but in the meantime I’ll add this: our stand-model, though complex, will pay for itself very quickly if adopted by a large industrial/infrastructural organization. And it’s not a subscription model or anything: it’s bought once, then keeps saving the day (minimizing regulatory, reputational and operational risks) for years without extra investment. Oh, and there’s one other thing it’ll keep saving: OT folks’ nerves… or sanity.
