The state of cyber resiliency metrics on embedded systems

The ability of an embedded system to identify, prevent, and respond to cyberattacks intended to disrupt its operational capabilities is defined by measuring its level of cybersecurity and cyber resiliency. The term cyberattack can refer either to electronic warfare (EW), such as signal jamming, or to cyberwarfare, such as sending malformed packets to disrupt the system. How do the metrics used to measure a system's cyber resiliency relate specifically to embedded systems, and what are the special considerations that embedded-system designers must undertake when applying cyber resiliency metrics?

First, it’s important to understand that metrics are a means to an end. Because there are costs associated with obtaining, measuring, and evaluating resiliency metrics, those costs must be offset by the benefits gained from the data the metrics deliver. To optimize cybersecurity, the resulting metrics should enable decision-makers to perform the cost/benefit analysis needed to define a system’s cyber resiliency requirements. Metrics should also enable the designer to make comparisons between cyber resiliency capabilities and perform appropriate risk assessments for cyber resilient systems.

The difficulty of measuring cyber resiliency

Measuring a system’s cyber resiliency is not simple. Many aspects of cyber resiliency are hard to quantify, even when examined individually. Further complicating the evaluation is that the same metric might be prioritized differently depending on the system or program requirements.

For this discussion, let us start with the agreement that at its core, cyber resiliency is the ability of the system to continue operating exactly as intended, even in the face of cyberattack. Among other aspects, cyber resiliency may include ensuring the confidentiality of data.

There are multiple potential ways to measure a system’s capability to ensure continued operation as intended. Examples include measuring the system’s ability to identify anomalous behavior, measuring its ability to respond when anomalous behavior is detected, and measuring the ability of the system to prevent anomalous behavior from occurring in the first place. All of these measurements are extremely difficult to quantify, however, for any sufficiently advanced technology.

Modern processing systems have numerous interrelated subsystems that must work together in exactly the right way to maintain their operational state. This reality makes it infeasible to quantify the entire set of possible operational states. Even if the entire set of correct operational states could be identified, defined, and enumerated, it would still be impossible to relate that set to the larger set of possible anomalous states with all possible transition paths defined. This means that cyber resiliency metrics are unlikely to become a single number that identifies and quantifies the cyber resiliency of a particular system. Instead, individual elements of cyber resiliency must be analyzed to determine how they relate to a specific system’s operational environment.

Furthermore, cyberattacks are constantly evolving. Approaches for cybersecurity that might have been deemed sufficient and proper at one point in time may later be revealed to be vulnerable to attack. Metrics for cyber resiliency can only be as good as the current state of the art in assessment capabilities. Because they are based on a contemporary understanding of possible attacks, these metrics must be constantly reevaluated and reassessed as new vulnerabilities and threats become understood.

Already defined frameworks and metrics

The state of metrics and assessment for cybersecurity and resiliency is a work in progress. Some very useful work has been published by the National Institute of Standards and Technology (NIST) related to the security engineering process and the defining of frameworks to allow for the evaluation of the cyber resiliency of systems. Important documents published by NIST include the Risk Management Framework (RMF), NIST Special Publication 800-37, and the associated publications 800-53 and 800-53A, which define security and privacy controls and how to assess those controls for federal information systems.

These NIST documents help define the RMF process, lay out which security controls to apply to systems, and determine how to assess those security controls. While the RMF process does not define metrics per se, it does create a framework for designing, categorizing, and assessing the security of information systems. With some tailoring for embedded systems, it can provide a workable framework in which to define the metrics of cyber resiliency for a system. Some of the controls within RMF contain language that mandates how metrics need to be defined and provides a broad outline of what each metric is tracking. (Figure 1.) Although the RMF does not define these metrics in detail, it does help to define the types of metrics that are important. Some of the RMF controls and associated referenced metrics include:

  • CA-7 Continuous Monitoring: Organizations develop their own metrics to be continuously monitored
  • CP-2 Contingency Plan: Objective metrics for recovering from a cyber incident
  • CP-10 Information System Recovery and Reconstitution: Metrics for returning to operational status
  • IR-8 Incident Response Plan: Metrics to measure incident response ability
  • PM-6 Information Security Measures of Performance: Metrics to assess effectiveness of security controls
  • SA-15 Development Process, Standards, and Tools: Metrics to assess development quality

These RMF-defined metrics provide insight into some aspects of cyber resiliency that are most important to measure, such as the ability to monitor, respond, recover, and restore operational status, along with the ability to measure effectiveness and development quality. Unfortunately, effectiveness – which is probably the most interesting metric at the moment – is the least clearly defined in the RMF.
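
As a rough illustration of what objective response and recovery metrics (in the spirit of IR-8 and CP-10) might look like in practice, the sketch below computes mean time to respond and mean time to restore from a small incident log. The log format, the timestamps, and the names `incidents` and `mean_minutes` are assumptions made for this example; the RMF does not prescribe any of them.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident log: (detected, responded, restored) timestamps.
# All values are made up for illustration.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 12), datetime(2024, 1, 5, 11, 0)),
    (datetime(2024, 2, 9, 8, 30), datetime(2024, 2, 9, 8, 38), datetime(2024, 2, 9, 9, 0)),
]


def mean_minutes(pairs):
    """Average elapsed time, in minutes, over (start, end) timestamp pairs."""
    return mean((end - start).total_seconds() / 60 for start, end in pairs)


# Detection to first response (an incident-response style metric).
mean_time_to_respond = mean_minutes((d, r) for d, r, _ in incidents)
# Detection to restored operational status (a recovery style metric).
mean_time_to_restore = mean_minutes((d, x) for d, _, x in incidents)

print(f"mean time to respond: {mean_time_to_respond:.1f} min")
print(f"mean time to restore: {mean_time_to_restore:.1f} min")
```

For an unmanned embedded system, the "responded" timestamp would more plausibly come from an automated reaction than from an operator, which is one reason such a metric would need tailoring per program.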

Figure 1: This Risk Management Framework for measuring cyber resiliency is based on the NIST model. Illustration courtesy Curtiss-Wright.

The measure of a cyber resilient system’s effectiveness should quantify how well that system can resist any proposed cyberattack. Because of the constantly changing nature of cyberattacks, and the almost limitless number of possible attack scenarios, the metric for effectiveness will almost certainly need to be defined through analysis. Such an analysis requires enumerating all attack scenarios, along with probabilities of success against a given system. The bad news is that today, this type of analysis is both difficult and costly.
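
To make the shape of such an analysis concrete, here is a minimal sketch that combines a list of attack scenarios with estimated probabilities of success. The scenario names, the probabilities, the 0-to-10 impact scale, and the independence assumption are all illustrative assumptions, not part of any NIST publication; the hard part in practice is producing credible inputs, not the arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackScenario:
    name: str         # hypothetical scenario label
    p_success: float  # estimated probability the attack succeeds (0..1)
    impact: float     # estimated mission impact on an arbitrary 0..10 scale


def residual_risk(scenarios):
    """Sum of probability-weighted impact across the listed scenarios."""
    return sum(s.p_success * s.impact for s in scenarios)


def p_any_success(scenarios):
    """Probability that at least one attack succeeds.

    Assumes the scenarios are statistically independent, which is a
    simplification that real attack campaigns often violate.
    """
    return 1.0 - math.prod(1.0 - s.p_success for s in scenarios)


# Made-up numbers for illustration only.
scenarios = [
    AttackScenario("malformed packets", 0.10, 8.0),
    AttackScenario("signal jamming", 0.25, 4.0),
    AttackScenario("physical tampering", 0.05, 10.0),
]

risk = residual_risk(scenarios)
p_any = p_any_success(scenarios)
print(f"residual risk: {risk:.2f}, P(any success): {p_any:.3f}")
```

Because the list of scenarios is never truly complete, a figure like this is only meaningful for comparing systems evaluated against the same scenario set.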

Drilling down

The RMF provides a high-level framework in which cyber resiliency can be designed and assessed for a system. In contrast, NIST publishes Security Technical Implementation Guides (STIGs) that operate at the lowest level, with each STIG defining a set of specific controls for a specific system in order to harden it against cyberattack. The document NIST SP 800-70 provides guidance on developing and using STIGs; the set of individual STIGs can be obtained from the NIST and DISA [Defense Information Systems Agency] websites.

STIGs do not address metrics directly but instead provide guidance on how to properly configure a system to maximize cybersecurity. But applying STIGs to a system does provide an opportunity to measure cyber resiliency. As STIGs get applied to a particular system, there may be controls/guidance that cannot be applied or followed because of system functional requirements or because of the system’s technical limitations. A cyber resiliency metric used to measure system “hardness” could include the number of successfully applied STIGs, as well as the number of unapplied controls based on the control severity.
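
One way such a "hardness" metric could be computed is sketched below: count the applied controls, tally the unapplied ones by severity, and derive a severity-weighted score. STIG findings are categorized as CAT I, CAT II, or CAT III, but the weights, the control identifiers, and the names `ControlResult` and `hardness_summary` are assumptions invented for this example, not values from any standard.

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative weights only: CAT I findings are the most severe.
SEVERITY_WEIGHTS = {"CAT I": 10, "CAT II": 4, "CAT III": 1}


@dataclass(frozen=True)
class ControlResult:
    control_id: str  # hypothetical identifier
    severity: str    # "CAT I", "CAT II", or "CAT III"
    applied: bool    # False if the control could not be applied


def hardness_summary(results):
    """Summarize applied/unapplied controls and a weighted score in [0, 1]."""
    applied = sum(1 for r in results if r.applied)
    unapplied = Counter(r.severity for r in results if not r.applied)
    total_weight = sum(SEVERITY_WEIGHTS[r.severity] for r in results)
    applied_weight = sum(SEVERITY_WEIGHTS[r.severity] for r in results if r.applied)
    score = applied_weight / total_weight if total_weight else 0.0
    return {
        "applied": applied,
        "unapplied_by_severity": dict(unapplied),
        "weighted_score": score,
    }


# Made-up results for a hypothetical system.
results = [
    ControlResult("V-0001", "CAT I", True),
    ControlResult("V-0002", "CAT II", True),
    ControlResult("V-0003", "CAT II", False),
    ControlResult("V-0004", "CAT III", False),
]

summary = hardness_summary(results)
print(summary)
```

Reporting the unapplied counts alongside the score matters: a single unapplied CAT I control may deserve more attention than a slightly lower overall number.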

While frameworks currently exist to help define the areas for which cyber resiliency metrics should be developed, and controls exist that enable the hardening of assets that are critical to a protected system, there remains a lack of clearly defined metrics that can be applied consistently across systems to enable decision-makers to analyze the cyber resiliency capabilities of their systems.

Differences in assessing cyber resiliency of embedded systems

The RMF was designed with regular (IT) systems in mind, not embedded systems. While RMF can be applied to embedded systems, some of the controls it defines can seem out of place or difficult to implement in deployed embedded environments. This situation can lead to misapplication or confusion when trying to apply security controls developed for IT infrastructures to embedded platforms. There are many reasons for the mismatch, but most stem from assumptions about the operational environment that do not apply to embedded systems. For example, the IT infrastructure normally resides in a building with physical security controls. Embedded systems, on the other hand, typically operate out in the field and may not be manned at all. IT systems are most often multi-user, while many embedded systems provide no multi-user login or support only one user. Another example: IT systems are often used as general computing/network resources, whereas embedded systems are often purpose-built to perform a single function. Embedded systems, unlike IT systems, often require additional integration to ensure continued operational capabilities after any update.

Figure 2: Cyber resiliency for embedded systems that may be used in or by the warfighter often must address such issues as operation in harsh conditions, use by multiple people, and integration with end uses following updates. Illustration courtesy Curtiss-Wright.

One of the biggest differences between IT infrastructure and embedded systems lies in the assumptions made about their operational lifetimes and update frequencies. IT infrastructure normally has a much shorter deployed lifespan than embedded systems. For most embedded systems, the combined challenges of the defense acquisition process, safety-certification requirements, and limited physical accessibility all conspire to increase the difficulty and cost of performing a system update when compared to a functionally similar piece of IT infrastructure.

These challenges mean that any consistent approach for evaluating the cyber resiliency of embedded systems must consider the unique operational characteristics of embedded systems. A failure to consider the attack vectors, mitigating environment, and challenges unique to embedded systems when applying cyber resiliency metrics will result in confusion and the misapplication of security controls.

Possible paths forward

There are currently some efforts underway to develop new RMF overlays and additions designed specifically for embedded systems. As these resources become more developed, they will likely be available for wider distribution. Moreover, discussions are ongoing at DISA about tailoring STIGs in order to more easily align them with the unique characteristics of embedded systems. This effort, if successful, should help reduce the analysis and documentation efforts currently needed to exclude those controls that are largely not applicable to embedded systems.

While these efforts promise to help streamline the frameworks that embedded system cyber resiliency often operates in, they do not directly address the ongoing need for metrics that will enable the comparison of cyber resiliency across products. There is no standardization today within the marketplace for measuring the cybersecurity effectiveness of embedded products; instead, frameworks call on the program or vendor to define their own measure of success. This approach can go two ways: Either each program must expend precious resources to do complex evaluations of individual products, or the program must trust that all vendors are using similar evaluation methodologies, which is unlikely without some guidance.

In order to help develop effective cyber resiliency metrics for embedded systems, Curtiss-Wright is participating in government-led activities tasked with providing more rigor in this area. While these efforts are still in the very early definitional stages, the goal is to provide more structural guidance in order to enable suppliers of commercial off-the-shelf (COTS) parts to better define their cyber resiliency metrics. As these metrics emerge, decision-makers will finally have the means to perform “apples-to-apples” comparisons to measure the cyber resiliency and cybersecurity effectiveness of embedded products from different vendors.

David Sheets, Senior Principal Security Architect, joined Curtiss-Wright in January 2018. In this role, he helps guide technology development and strategy on antitamper and cyber resiliency. David possesses 18 years of embedded engineering experience, including 10 years working on multiple U.S. Department of Defense programs architecting, implementing, and integrating security solutions. David has a Master of Science in computer science from Johns Hopkins University.

Curtiss-Wright Defense Solutions www.curtisswrightds.com