Anomaly detection at Palo Verde nuclear plant

22 December 2020

A data analytics platform developed for the Palo Verde nuclear plant has been able to detect distinct anomalies and would have provided early warning of equipment failures. By Greg Alder and James Herzau

FROM THE TIME NUCLEAR PLANTS were first able to digitally collect and archive plant process and condition monitoring data, operators wanted to improve plant performance and reliability by using those data for real-time assessment rather than only for after-the-fact reporting.

In response, Curtiss-Wright pioneered a similarity-based pattern recognition algorithm, the System State Analyzer (SSA), in the late 1980s, proving its utility with Argonne National Laboratory at the Experimental Breeder Reactor II (EBR II). Computer processing capabilities at that time limited the feasibility of deploying applications such as SSA across nuclear fleets, but improvements in information technology and high-speed computing have since fostered the deployment of new and more effective advanced condition monitoring applications.

One feature, anomaly detection, is a major element in most advanced condition monitoring platforms. Anomaly detection provides early warning of equipment degradation, by identifying deviations in behaviour between real-time data from system sensors and the expected values produced by a predictive model.

The predicted values for these models are generated by comparing the most recent snapshot of data against a collection of normal historical behaviour datasets, and then using the most similar datasets to determine the expected value for each sensor in the model. These datasets can come from any period in a system’s history, including steady operation or while equipment is ramping up or down. These types of similarity-based models (SBMs) have been providing effective early warnings of developing faults for over 15 years now and their effectiveness is well understood.
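As a rough illustration of the similarity-based idea described above, the following Python sketch estimates each sensor's expected value as a similarity-weighted average of the closest historical snapshots, then reports the deviation of the actual readings from those estimates. The function names, the Euclidean distance, the inverse-distance weighting and the choice of k are all assumptions made for illustration; the article does not specify how SSA or any commercial SBM measures similarity.

```python
import math

def sbm_predict(snapshot, history, k=3):
    """Estimate expected sensor values from the k most similar historical snapshots.

    snapshot: list of current sensor readings.
    history:  list of historical snapshots representing normal behaviour.
    """
    # Distance between the current snapshot and every historical snapshot.
    scored = []
    for past in history:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(snapshot, past)))
        scored.append((dist, past))
    # Keep the k most similar (smallest distance) snapshots.
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [1.0 / (dist + 1e-9) for dist, _ in nearest]
    total = sum(weights)
    return [
        sum(w * past[i] for w, (_, past) in zip(weights, nearest)) / total
        for i in range(len(snapshot))
    ]

def residuals(snapshot, expected):
    """Deviation of each actual reading from its expected value."""
    return [a - e for a, e in zip(snapshot, expected)]
```

In this sketch, a reading that drifts away from the pattern seen in the historical data produces a large residual for that sensor, which is the kind of deviation an anomaly detection layer would then flag.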

Anomaly detection

Any SBM for an asset consists of a group of (primarily analogue) sensors that define the behaviour and condition of the asset, and an algorithm trained with historical data representing the normal behaviour of those sensors. Selecting the sensors and the normal data used to train the model requires subject matter expertise, as do the ongoing tasks of interpreting the anomalies identified and categorising them as real issues or model tuning needs. Acquiring and selecting good normal data, across all the sensors in a model, that represent the full breadth of operating and ambient influences can be a challenge for those charged with building and maintaining the models. Reviewing the real-time results can be an equal challenge. As such, good SBMs tend to be relatively small (20 to 30 sensors) to accommodate the human interaction required.

Unless additional specifically targeted models or training sets are configured, the nature of the SBM algorithms does not lend itself well to addressing transient behaviour like startups, shutdowns or rapid operating changes. This is because there is no set relationship between the datasets selected to make predictions and the point in time in the operating cycle from which they came. Algorithms are geared towards analogue sensor data, not digital readings, so useful information derived from open/close, on/off or other functions is not regularly applied in these models.

Despite these factors, SBMs are still highly effective in accurately highlighting changes in equipment behaviour to provide early warning of degradation. As computing and monitoring technology advances, these models have become more robust, using new systems and ideas such as machine learning and artificial intelligence to reduce the potential for errors.

Nuclear power plants such as Arizona Public Service’s (APS’s) Palo Verde continue to seek ways to improve efficiency and reduce maintenance.

In 2017, Palo Verde embarked on a collaboration with Curtiss-Wright to develop a data analytics platform that would provide an alternative to SBM systems. The platform was to be less dependent on subject matter experts (SMEs) to construct and maintain models, make use of more information from the increasing number of sensors providing plant data (including digital sensors), and better address transient plant conditions. The project, subsequently dubbed equipment anomaly detection (EAD), sought a platform that is sensitive to degradation with fewer false-positive indications, while at the same time making more effective use of experts at the plant.

The result was an EAD platform that includes:

A system for detecting equipment anomalies;
A dashboard for engineering and operations to review and categorise these equipment anomalies as Acknowledge, Watch, Condition Report, and Ignore;
An integration layer for retrieving contextual data (such as CMMS and logs) and updating them according to the anomaly disposition.

Advances in model training techniques have combined with processor advances to finally make the promise of neural network-based solutions achievable.

Palo Verde’s solution uses a deep recurrent auto-encoder, based on a long short-term memory (LSTM) neural network, with time series outlier detection to simplify the identification of anomalies. An LSTM neural network produces predicted values for its collection of sensors based not only on the current dataset, but also on the data from a configured number of time steps into the past. The LSTM is trained on historical data to establish weights and tuned parameters for each neuron’s inputs and a memory of past time steps.
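To illustrate what "a configured number of time steps into the past" means in practice, the sketch below slices a multi-sensor time series into overlapping windows of the kind that could be fed to a recurrent auto-encoder. It covers only the input preparation, not the network itself; the function name and the window length are hypothetical and are not details of the Palo Verde implementation.

```python
def make_windows(series, n_steps):
    """Split a time series into overlapping windows of n_steps rows.

    series:  list of rows, one row of sensor readings per time step.
    n_steps: number of consecutive time steps in each window.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    # Window i covers rows i .. i + n_steps - 1; its last row is the "current" one.
    # If the series is shorter than n_steps, no window can be formed.
    return [series[i:i + n_steps] for i in range(len(series) - n_steps + 1)]
```

Each window pairs the current reading with its recent history, so a model trained on such windows can learn how sensors normally evolve over time rather than only how they relate at a single instant.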

The nature of LSTM training makes the model effectiveness less sensitive to sensor selection and mitigates the impact of small amounts of anomalous data in the historical data used to train models. This reduces the need for SME involvement in building and maintaining the models. SMEs can shift to reviewing and categorising the anomalies identified, generating feedback that can be used to automatically adjust parameters used in outlier detection.

EAD models can be refreshed with new data at regular intervals, and can perform routine retraining automatically. The software performs hierarchical clustering on individual sensor-based anomalies, then groups them into related sets for presentation to the end users on a web-based dashboard.
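The grouping step described above can be pictured with a small agglomerative (bottom-up hierarchical) clustering sketch. Here sensors whose anomalies occur at overlapping times are merged into the same group. The Jaccard distance, the single-linkage rule, the stopping distance and the sensor names are all assumptions chosen for illustration; the article does not say which distance measure or linkage the EAD software uses.

```python
def jaccard_distance(a, b):
    """One minus the overlap ratio of two sets of anomalous time steps."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def cluster_sensors(anomaly_times, max_distance=0.5):
    """Single-linkage agglomerative clustering of sensors.

    anomaly_times: dict mapping sensor name to the set of time steps flagged anomalous.
    max_distance:  merging stops once the closest pair of clusters is farther apart.
    """
    # Start with every sensor in its own cluster.
    clusters = [[name] for name in sorted(anomaly_times)]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(jaccard_distance(anomaly_times[a], anomaly_times[b])
                   for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break
        # Merge cluster j into cluster i.
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters
```

With this approach, two feedwater heater temperature sensors that go anomalous over nearly the same period would be presented as one related set, while an unrelated sensor stays separate.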

As end users, SMEs review the anomalies, view related plant operating experience (OE) documents and then categorise the anomalies. They can take a range of actions, from generating a condition report to providing feedback that the detected anomaly is not a concern, which triggers the analytic model to desensitise identification of that type of anomalous behaviour for the cluster of sensors identified.
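One simple way to picture this feedback loop is a detector that keeps a separate threshold for each cluster of sensors and loosens it when an SME marks an anomaly as not a concern. The sketch below is a hypothetical illustration only: the class, its parameters and the multiplicative adjustment are assumptions, and the article does not describe how the EAD software actually adjusts its outlier detection. The disposition names are taken from the dashboard categories listed earlier.

```python
class OutlierDetector:
    """Flags residuals above a per-cluster threshold and adapts to SME feedback."""

    def __init__(self, base_threshold=3.0, desensitise_factor=1.25):
        self.base_threshold = base_threshold
        self.desensitise_factor = desensitise_factor
        self.thresholds = {}  # cluster id mapped to its current threshold

    def threshold(self, cluster_id):
        # Clusters without feedback use the base threshold.
        return self.thresholds.get(cluster_id, self.base_threshold)

    def is_anomaly(self, cluster_id, residual):
        return abs(residual) > self.threshold(cluster_id)

    def feedback(self, cluster_id, disposition):
        # Only an "Ignore" disposition loosens detection, and only for that cluster.
        if disposition == "Ignore":
            self.thresholds[cluster_id] = (
                self.threshold(cluster_id) * self.desensitise_factor
            )
```

Keeping the adjustment local to one cluster means that dismissing a nuisance indication on one group of sensors does not reduce sensitivity elsewhere in the plant.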

“This technology, developed by APS employees at Palo Verde, demonstrates the ingenuity and innovation our people bring to the workplace every day,” said Jack Cadogan, APS senior vice president, site operations for Palo Verde. “The value of this technology is that it identifies potential issues in important components so they can be addressed before a problem arises. It cuts costs, increases efficiency and brings value to any operation that depends on sophisticated technology.”

Two case studies from Palo Verde highlight the utility of the EAD as a diagnostic tool.

The Palo Verde thermal performance engineer worked with the EAD developers to build and apply a feedwater system model. The model included tube side inlet and outlet temperatures, drain temperatures, extraction steam pressures and other feedwater heater instrumentation. Using historical data containing periods of known equipment issues, the EAD identified two distinct anomalies and would have provided early warning had a live system been in place at the time of the occurrences.

The benefits of the EAD platform go beyond refining anomaly detection; it improves system monitoring and management on almost every level.

At the base level, SME time is a valuable commodity, so reducing the domain expertise required to identify equipment degradation not only better utilises SME time, it also makes training of new models a simpler, more streamlined process. The automatic clustering of anomalous behaviour, as opposed to building the complex rules required for older models, saves valuable SME time as well. These clustered anomalies are easier to review and categorise into faults that can be used to identify future instances. This can even be done on historical data to seed a library of faults. In addition, the data clustered may come from disparate areas within the cycle or plant, helping SMEs make valuable connections across their systems.

EAD also employs valuable data that were previously hard or impossible to capture. EAD handles digital signals (such as on/off or open/closed) so that various modes of operation (one pump, two pump, valve line-up, etc.) can also be used to make predictions for the expected state. It can tie in contextual information from work management systems and operators’ logs to provide a more comprehensive picture of plant health. Being time-based, EAD can also present data from maintenance events, operational changes, notes and other activities alongside the clustered anomaly for analysis, such as a pump experiencing issues the day after having the bearing greased.

It can also show a preceding anomaly that is matched to the maintenance work performed, helping to identify the failure mechanism.

The fact that it is time-series based also makes it more amenable to startup, shutdown and transient modelling: EAD can handle hot, warm and cold starts without any rules or special attention to training. This makes EAD valuable for equipment within the nuclear sphere, but also for gas turbines, wind turbines and coal plants as they do more cycling and day shifting.

Finally, this type of modelling can improve system-level monitoring, for example by modelling at a substation level rather than at individual transformers.

Recognising the value of EAD to the nuclear industry and beyond, Curtiss-Wright Corporation has signed an exclusive agreement with Arizona Public Service Company to commercialise APS’s EAD technology.

Curtiss-Wright will add the EAD technology to its FAMOS suite of thermal plant performance and condition monitoring platforms, which will enhance Curtiss-Wright’s portfolio of digital capabilities and product offerings that support plant modernisation and digitalisation. Curtiss-Wright expects to deliver a commercialised digital offering in 2021.

Anomaly One: In late May 2017, Palo Verde 2 down-powered because an air hose disconnected from a feedwater heater air-operated level control valve. The EAD model processed three weeks of training data. The resultant plot below shows the saturation temperature for this feedwater heater for the time period, with the blue line indicating the EAD predicted value and the magenta line showing the actual value. Such a deviation would have led to an investigation and early identification of the fault.

Anomaly Two: In mid-summer 2017, Palo Verde 1 down-powered to 80% after a tube leak on the 1A feedwater heater. The following plot shows the comparison between actual and EAD-predicted data using a back-tested EAD model. If the EAD had been in place prior to the actual event, the system would have detected multiple anomalies in the weeks prior to the down-power.