As a precursor to my CyberShock speaking session, I wanted to share some thoughts regarding difficulties in network-based threat hunting for Industroyer in IEC104 substation environments.
We’ll start with a quick review of Industroyer…
Industroyer and Industroyer 2
Industroyer was released in 2016 to cut power in Ukraine. It remains one of the few instances where targeted, protocol-specific malware was used to disrupt critical infrastructure. In April of 2022, a new variant of Industroyer returned to the scene with new capabilities but without the same success as its predecessor.
A typical news cycle followed Industroyer 2 with security vendors, CERTs, and independent researchers providing their analysis and often Indicators of Compromise (IOCs).
After the initial wave of marketing interest eroded, the research into the malware and protocol returned to a fairly normal pattern with a small volume difference in queries for the protocol itself.
ICS Analysis: More Autopsy than Triage
For ICS events, security vendors act more like historians. You will rarely, if ever, see the same ICS attack twice, so intel and IOCs mostly serve as historical records, and they often rely on host-based detection.
This leaves ICS analysts in the difficult position of being more coroner than doctor: they have little in terms of PREDICTIVE material but a wealth of material on how to identify and autopsy a body.
Challenges of Threat Hunting in IEC104 Environments
Very few network IOCs exist for the protocol, and the ones that do cover commands sent to the controller at the end stage of the attack.
How are analysts supposed to threat hunt in these environments?
Because no analyst can examine every atomic event indefinitely, multiple tools exist that summarize IOCs and/or atomic events over different time ranges. This is often called “Baselining.” It requires the analyst to dictate what’s normal in the environment, and it is prone to multiple types of bias.
Ultimately, the analyst is either chaining atomic events over different time ranges to characterize a threat behavior, or interpreting the output of a security vendor’s tool that applies the same methodology.
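To make the baselining idea concrete, here is a minimal sketch that buckets packet timestamps into fixed windows and flags windows that stray from a baseline period. The function names, the z-score rule, and the sample timestamps are all assumptions for illustration, not the method of any particular tool.

```python
from collections import Counter
from statistics import mean, pstdev


def window_counts(timestamps, window_s):
    """Count events per fixed-width window, measured from the first timestamp."""
    if not timestamps:
        return []
    start = min(timestamps)
    buckets = Counter(int((t - start) // window_s) for t in timestamps)
    return [buckets.get(i, 0) for i in range(max(buckets) + 1)]


def deviations(counts, baseline_len, z_threshold=3.0):
    """Return indexes of post-baseline windows that sit far from the baseline mean."""
    baseline = counts[:baseline_len]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1.0  # assumption: treat a flat baseline as sigma = 1
    return [
        i
        for i, c in enumerate(counts)
        if i >= baseline_len and abs(c - mu) / sigma > z_threshold
    ]


# Hypothetical capture: one packet per second, plus a 50-packet burst near the end.
timestamps = [float(i) for i in range(60)] + [55.1 + 0.01 * k for k in range(50)]
counts_10s = window_counts(timestamps, 10)
flagged = deviations(counts_10s, baseline_len=5)
```

With 10-second windows the burst lands in the last window and is flagged. Re-running `window_counts` with a 60-second window collapses the whole capture into a single bucket and the burst disappears, which is the window-selection bias described above.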
Why is this difficult?
The following picture is a long-running collection of all packets in a substation. It’s a mix of different assets speaking different protocols in an environment that’s supposed to be static outside of maintenance windows or a process upset.
The traffic is often referred to as “Bursty,” so an analyst starts sizing the time windows around the bursts and filtering out protocols that may have contributed to the spike based on system context.
What if all I care about is IEC104 for PLC/RTU enumeration?
The picture below shows IEC104 payloads in red and the TCP stack in black.
The IEC104 payloads show a fairly repetitive pattern, but the transport layer carrying the IEC104 payload is “bursty.” Unlike DNP3, IEC104 leverages the TCP layer as its main transportation method, so the two patterns should not differ widely, and variations in the TCP layer are often labeled as “Potential Port Scanning.”
Bursts in the payloads should produce correlating bursts in the TCP layer, but that relationship appears to be inconsistent as well.
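One way to put a number on that expectation is to compare per-window counts of IEC104 payloads against per-window counts of TCP segments. The sketch below is an assumption-laden illustration: the function names, the ratio limit, and the sample counts are hypothetical, and Pearson correlation is just one possible measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length series of window counts."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a flat series carries no correlation information
    return cov / sqrt(vx * vy)


def decoupled_windows(payload_counts, tcp_counts, ratio_limit):
    """Windows where TCP segments outnumber IEC104 payloads by more than ratio_limit."""
    return [
        i
        for i, (p, t) in enumerate(zip(payload_counts, tcp_counts))
        if t > ratio_limit * max(p, 1)
    ]


# Hypothetical per-window counts: TCP tracks the payloads, then bursts on its own.
payloads = [10, 12, 10, 30, 10]
tcp_tracking = [20, 24, 20, 60, 20]
tcp_bursty = [20, 24, 20, 60, 200]

r_tracking = pearson(payloads, tcp_tracking)
suspect = decoupled_windows(payloads, tcp_bursty, ratio_limit=5)
```

When TCP volume follows the payload volume, the correlation stays close to 1. A window where TCP activity rises without a matching rise in payloads is the kind of spot that would deserve a closer look, although it does not prove scanning on its own.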
A data science dilemma presents itself where the analyst either waits for more data to identify repeating patterns, or they dig in and try to identify patterns on a smaller scale.
The picture above shows IEC104 payloads with various types in a smaller window and presents a whole new set of questions on the patterns, volume, and correlations between the two.
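Separating payloads by type starts with the fixed six-byte APCI header of each IEC104 APDU: a 0x68 start byte, a length byte, and four control octets whose low bits distinguish I-format (information), S-format (supervisory), and U-format (unnumbered) frames. The following is a minimal sketch with hypothetical helper names; it assumes one complete APDU per input and does no TCP stream reassembly.

```python
def apci_format(apdu: bytes) -> str:
    """Classify an IEC104 APDU as 'I', 'S', or 'U' from its first control octet."""
    if len(apdu) < 6 or apdu[0] != 0x68:
        raise ValueError("not an IEC104 APDU")
    ctrl1 = apdu[2]
    if ctrl1 & 0x01 == 0:
        return "I"  # information transfer, carries an ASDU
    if ctrl1 & 0x03 == 0x01:
        return "S"  # supervisory acknowledgement
    return "U"      # unnumbered control (STARTDT, STOPDT, TESTFR)


def asdu_type_id(apdu: bytes):
    """Return the ASDU type identification of an I-format frame, else None."""
    if apci_format(apdu) != "I" or len(apdu) < 7:
        return None
    return apdu[6]  # first octet after the six-byte APCI


# Hand-built sample frames for illustration only.
testfr_act = bytes.fromhex("680443000000")
s_frame = bytes.fromhex("680401000200")
interrogation = bytes.fromhex("680e0000000064010600010000000014")

formats = [apci_format(f) for f in (testfr_act, s_frame, interrogation)]
type_id = asdu_type_id(interrogation)  # 100 is the general interrogation command
```

Counting frames per format and per type ID in each window gives the per-type series that the smaller-scale view calls for.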
If the analyst is convinced that they have determined the right period to dictate normal traffic patterns and find a baseline deviation, a whole new set of questions arises to confirm the hypothesis, often leaving them feeling like Neo staring down the opportunity to jump into a rabbit hole.
Where can analysts learn?
Econometrics is the use of statistical methods to develop theories or test existing hypotheses in economics or finance. Financial models are selected and then tested by measuring the differences between PREDICTIONS and reality.
(More details in this article: https://www.econlib.org/library/Enc/ForecastingandEconometricModels.html)
After a model is selected, multiple statistical methods are applied to test its prediction accuracy.
How is this related to substation threat hunting?
At the heart of all industrial control systems live feedforward and feedback control loops.
Setpoints are provided by operators, violations of setpoints are communicated to HMIs and alarm servers, and the system corrects itself to maintain a steady-state environment. Engineers have been maintaining steady state in these environments through a veritable smorgasbord of statistical methods to improve the systems, as seen below.
What’s the output of testing this hypothesis?
“Stock market volatility models can detect both network scanning and process data scanning in substations using IEC104.”
With the following assumptions:
- Large changes to system state are caused by seasonal changes in human activity, which cause the largest disruption to the system state after
- Any human behavior for maintenance purposes would cause a large shift in activity
- Disturbances to control loops cause further small changes to the system in a predictable, automated fashion -> alarm values are sent to the HMI/historian
- Scanning activity for networks or process data fits neither of the models
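As a rough illustration of what applying a volatility model to traffic could look like, the sketch below runs an exponentially weighted moving average (EWMA) of squared changes over per-window packet counts and flags changes that are far larger than the forecast volatility. This post does not name a specific model, so EWMA is a stand-in assumption here, and the function names, decay factor, threshold, and sample series are all hypothetical.

```python
from math import sqrt


def ewma_volatility(series, lam=0.94):
    """Return (changes, forecasts): each forecast is made before its change is seen."""
    changes = [b - a for a, b in zip(series, series[1:])]
    if not changes:
        return [], []
    var = changes[0] ** 2  # seed the variance with the first squared change
    forecasts = []
    for c in changes[1:]:
        forecasts.append(sqrt(var))            # volatility expected for this step
        var = lam * var + (1 - lam) * c ** 2   # then update with what happened
    return changes[1:], forecasts


def surprises(series, lam=0.94, k=4.0, floor=1.0):
    """Indexes into series where the change exceeds k times the forecast volatility."""
    changes, forecasts = ewma_volatility(series, lam)
    # changes[j] is series[j + 2] - series[j + 1], hence the offset of 2
    return [
        j + 2
        for j, (c, v) in enumerate(zip(changes, forecasts))
        if abs(c) > k * max(v, floor)
    ]


# Hypothetical per-window IEC104 packet counts with one abrupt jump.
counts = [10, 11, 10, 11, 10, 11, 10, 60, 61]
flagged = surprises(counts)
```

The small, regular wobble stays within the forecast, while the abrupt jump does not. Whether a model of this family can separate scanning from seasonal, maintenance, and control-loop activity is the hypothesis itself, not something this sketch demonstrates.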
Come find out more at my talk during CyberShock 2022!
View the agenda and learn more about the conference at https://hubs.ly/Q01nh78S0