by Joe Weiss, Applied Control Solutions
Securing electric utility control systems supports a utility’s mission to produce electricity, deliver it or both, reliably and safely.
It is impossible to secure control systems fully; however, these systems can be made more secure to minimize unintentional incidents and “less than nation-state” cyberattacks that could cost hundreds of millions of dollars and lives.
Cybersecurity Inadequately Addressed
The premise for grid reliability is the N-1 criterion, which describes the impact if one element in the system fails or goes out of service. It is meant to address mechanical or electrical failures of equipment or facilities. The N-1 criterion, however, does not address acts of nature, malicious incidents or common cause failures that could affect multiple facilities. A cyberincident is a common cause failure that can affect facilities across large geographic areas and can be more severe than acts of nature. Consequently, neighboring utilities might be unable to help because they could be affected, too. In addition, a major cyberattack could affect enough equipment that there would not be adequate replacements, meaning long-term outages of nine to 18 months.
Cybersecurity often is treated as a high-impact, low-frequency (HILF) event, which has led risk assessments to minimize the need to address cyber.
Cyberthreats and cyberattacks, however, are becoming more common. Consequently, the low-frequency aspect of HILF might be invalid. Free Metasploit modules (cyberattack software) for the industrial control systems (ICSs) that electric utilities use are available on the Web. You no longer need to be a nation-state to craft a Stuxnet-type attack.
Electric Industry Cyber Regulations
From the early 2000s, when the North American Electric Reliability Corp. (NERC) critical infrastructure protection (CIP) process began (it wasn’t called CIP then), the intent of NERC and the Federal Energy Regulatory Commission (FERC) was to maintain the reliability of the bulk electric system.
The current version of the NERC CIPs, however, reduces rather than maintains electric grid reliability and can affect nuclear plant safety.
Until 2006, NERC was an industry organization focused on electric grid reliability. It had no formal authority to levy fines or other punishment for impacting grid reliability. NERC also was an American National Standards Institute (ANSI)-accredited organization, meaning it had a majority voting process of its constituency.
In 2006, FERC designated NERC as the electric reliability organization (ERO), which gave it quasi-regulatory status and the ability to levy fines. Given that NERC was now the ERO, it could have been expected that the standards’ voting process would have changed because it doesn’t make sense to have the regulatees vote on their own regulation. The voting process, however, didn’t change, which created at least the perception of a conflict of interest.
The consensus process extends the time needed to create a standard. That delay can be acceptable when dealing with vegetation management but is unacceptable when dealing with rapidly changing threats such as Stuxnet. The process also has led to the quagmire of what is considered a critical asset. NERC CIP allowed utilities to self-define their critical assets. If an asset is not considered critical, it does not require a physical audit, cybersecurity assessment or both. These activities are meant to lead to remediation and further audits and assessments.
I recently returned from arguably the most comprehensive technical cyber assessment of any facility: an international power plant that the utility wanted assessed for Stuxnet. The results were almost the opposite of what is found in facilities under NERC CIP purview. The assessment identified nearly all systems as critical, and nearly all critical systems needed some form of assessment (there were even systems that could not be secured). This should be expected. Why install new control systems if they aren’t critical? And because control systems weren’t designed for security, they should need some form of remediation. Yet, this is almost the opposite of what occurs under NERC CIP: Very few assets are identified as critical, and few assets are identified as needing remediation.
Unintended NERC CIP Effects
NERC CIP version 4 introduced the “bright line” concept, which sets a minimum threshold for power plant size and substation voltage for an asset to be considered critical. This is based on the traditional utility requirement of meeting the N-1 criterion: The grid is designed to lose an asset and remain functional. Statistical analysis has shown that losing more than one node is unlikely unless a natural, not malicious, disaster ensues.
This is where the N-1 approach falls short. Cyber is a common cause failure that can impact multiple facilities from multiple organizations; it is not a single-node event. Examples such as Slammer and Blaster demonstrate that point. Cyber is a malicious event that methodologies such as N-1 and statistical (probabilistic) assessments were not meant to address. There is a need to develop risk methodologies applicable to control system cyber that provide a measure of control system reliability and security.
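The gap between an N-1 check and a common cause failure can be shown with a minimal sketch. The unit names, capacities and peak load below are hypothetical values chosen only for illustration; they do not describe any real system or any NERC methodology.

```python
# Hypothetical toy system: unit name -> capacity in MW (illustrative values only).
UNITS = {"A": 400, "B": 300, "C": 300, "D": 200}
PEAK_LOAD_MW = 800  # assumed peak demand


def survives(lost_units):
    """Return True if the remaining capacity still covers the peak load."""
    remaining = sum(mw for name, mw in UNITS.items() if name not in lost_units)
    return remaining >= PEAK_LOAD_MW


def n_minus_1_ok():
    """N-1 check: the system must survive the loss of any single unit."""
    return all(survives({name}) for name in UNITS)


def common_cause_ok(shared_vulnerability):
    """Common cause check: every unit sharing the vulnerability fails together."""
    return survives(set(shared_vulnerability))


print(n_minus_1_ok())               # independent single failures
print(common_cause_ok({"A", "B"}))  # two units hit by the same cyber event
```

In this toy case the system passes the N-1 check yet cannot meet load when two units fail together, which is the situation a single-contingency criterion was never meant to cover.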
Cyber is a communication issue. Consequently, the size of a facility is not critical, but whether it communicates with other facilities is relevant. A small (less than 50-MW) generator with compromised communication packets, if dispatched by the independent system operator (ISO), can bring down the entire regional grid controlled by the ISO. The conditions in Southern California associated with the unavailability of the San Onofre Nuclear Generating Station make all available generation critical for grid reliability, not just those above an arbitrary megawatt threshold.
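The difference between a size threshold and a communication-based view can be sketched as follows. The threshold value, the asset names and the `talks_to` field are assumptions made for illustration; they are not the actual NERC CIP bright line values or any utility’s data.

```python
from dataclasses import dataclass

# Hypothetical threshold for illustration; not the actual NERC CIP value.
BRIGHT_LINE_MW = 1500


@dataclass
class Asset:
    name: str
    capacity_mw: float
    talks_to: tuple = ()  # names of other facilities it exchanges data with


def critical_by_size(asset):
    """Size-only view: critical if at or above the megawatt threshold."""
    return asset.capacity_mw >= BRIGHT_LINE_MW


def critical_by_connectivity(asset):
    """Communication view: critical if it exchanges data with another facility."""
    return len(asset.talks_to) > 0


assets = [
    Asset("large_plant", 2000, ("iso_control_center",)),
    Asset("small_generator", 45, ("iso_control_center",)),
    Asset("isolated_unit", 30),
]

# Assets a size-only rule ignores even though they communicate with others.
missed = [a.name for a in assets
          if critical_by_connectivity(a) and not critical_by_size(a)]
print(missed)
```

Here the small generator dispatched through the control center falls below the size threshold but is still flagged once communication paths are considered.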
The utilities used the bright line criteria to further reduce FERC’s authority over some facilities that were considered critical assets under version 3 of the CIP standards. The bright line criteria exclude some 70 percent of the generation capacity in North America, 88 percent of transmission assets, 30 percent of control centers and all distribution. This provides a clear road map for a hacker.
Many utilities have delegated NERC CIP compliance to their compliance organizations and have minimized the participation of control system experts. Excluding domain experts means attempting to protect systems without domain expertise, which is dangerous. Often this leaves diligent engineers unable to do what they consider right because internal utility compliance organizations will not allow it.
Although serial communications are the most prevalent type in substations and power plants and can be cyber-vulnerable, the NERC CIP process excluded them from assessment. As a result, many utilities have converted Internet Protocol communications to serial. The NERC CIPs therefore exclude a significant number of cyber-vulnerable systems, including intelligent electronic devices (IEDs) in substations and programmable logic controllers (PLCs) in power plants.
Black start facilities, those used to restart the system after complete blackouts, were classified as critical assets in CIP versions 1 through 4, which meant they were required to meet the NERC CIPs and were auditable. (In draft version 5, black start facilities have gone from medium priority to low.) To avoid the compliance issues, many utilities no longer identify black start units in their restoration plans. Nuclear plants that rely on these black start units can be at risk for Fukushima-type incidents after extended blackouts. As an example, one large utility had 20 units that provided black start capability. After the issuance of NERC CIP version 4, the utility has only one unit identified as black start. This utility has nuclear plants that need some form of black start capability, so this seems problematic. One can only wonder what transmission planners think about the lack of black start capability.
The recent NERC Cyber Attack Task Force (CATF) report did not address Stuxnet or Aurora, both of which are known, demonstrated vulnerabilities that can lead to significant equipment damage and extended outages. Aurora exploits a gap in grid protection and violates a basic engineering tenet familiar to any first-year electrical engineering student: Don’t start AC equipment out-of-phase. Aurora can be addressed only by hardware remediation because it is a physical process. NERC formed a task force to consider Aurora but has issued no opinions, recommendations or mandates to utilities to resolve the threat. Five years after the Aurora test demonstration, utilities need only send paperwork demonstrating compliance with current requirements. Aurora, however, makes a utility’s substations a threat to its customers’ facilities, including DOD facilities. How would a utility executive explain why his utility is directly responsible for the destruction of its customers’ facilities? For Stuxnet, information and source code are publicly available. To believe modified malware will not be forthcoming is naive. What are utilities doing to protect themselves from other known issues such as Flame, Duqu and sKyWIper?
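The engineering tenet behind Aurora, not connecting AC equipment out-of-phase, can be illustrated with a minimal synchronism check. This is a sketch of the principle only: the tolerance values and the `Side` structure are assumptions for illustration, real settings are site-specific, and a software check of this kind is not the hardware remediation Aurora requires.

```python
from dataclasses import dataclass


@dataclass
class Side:
    voltage_pu: float    # voltage magnitude, per unit
    frequency_hz: float  # frequency in hertz
    phase_deg: float     # phase angle in degrees


# Assumed tolerances for illustration only; real settings are site-specific.
MAX_ANGLE_DEG = 10.0
MAX_SLIP_HZ = 0.1
MAX_VOLTAGE_DIFF_PU = 0.05


def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def safe_to_close(grid, machine):
    """Return True only if angle, frequency and voltage are all within tolerance."""
    return (
        angle_difference(grid.phase_deg, machine.phase_deg) <= MAX_ANGLE_DEG
        and abs(grid.frequency_hz - machine.frequency_hz) <= MAX_SLIP_HZ
        and abs(grid.voltage_pu - machine.voltage_pu) <= MAX_VOLTAGE_DIFF_PU
    )


grid = Side(voltage_pu=1.0, frequency_hz=60.0, phase_deg=0.0)
in_phase = Side(voltage_pu=1.0, frequency_hz=60.0, phase_deg=5.0)
out_of_phase = Side(voltage_pu=1.0, frequency_hz=60.0, phase_deg=120.0)

print(safe_to_close(grid, in_phase))
print(safe_to_close(grid, out_of_phase))
```

A breaker that is opened and reclosed while the two sides are 120 degrees apart fails this check, which is the out-of-phase condition the tenet warns against.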
Recently, a paper on Stuxnet and Anti-Virus by an engineer in Iran was published in Control On-line. It demonstrates Iran’s knowledge of Stuxnet. What was the rationale for the NERC CATF excluding these vulnerabilities?
The Smart Grid Security Acceleration Group is working on substation automation; however, Aurora is not being considered. Substation automation cannot be considered secure when a gap in grid protection is not addressed.
Most important, senior management must provide resources and support. When a utility’s billing system is more cybersecure than any substation or power plant—including nuclear—something is seriously wrong. There is no simple, technological silver bullet. Implementing the same level of accepted information technology security and practices to protect the Windows-based HMI, however, should be expected. Control system cybersecurity policies must be developed because information technology security policies have affected ICSs and have not prevented some major ICS cyberincidents. Programmatically, the government should mandate NIST SP800-53, Appendix I.
NIST SP800-53 is mandatory for all federal agencies, including utilities such as the Tennessee Valley Authority and Bonneville Power Administration. How can the nonfederal utilities be held to a lesser standard, especially when they interconnect with federal utilities and create a potential vulnerability? For ICS field devices such as programmable logic controllers and remote terminal units, ICS cybersecurity technology must be developed as part of the initial ICS design. For existing legacy ICSs, security backfits must be developed. Most of all, utilities must view cyber as a reliability threat and address it with the same vigor as other reliability threats.
Joe Weiss is a cybersecurity expert with Applied Control Solutions. He is working with a utility and several major control system suppliers to determine what is needed to secure legacy control systems for reliability considerations—a first in the industry. Reach him at firstname.lastname@example.org.