Leveraging Machine Learning to Enhance Anomaly Detection in Rotating Machinery

Published on: 
Turbomachinery Magazine, September/October 2024, Volume 65, Issue 5

Machine learning techniques offer real-time anomaly detection that proactively identifies potential failures in turbomachinery.

Monitoring rotating machinery is essential to ensuring operational reliability and performance. However, traditional real-time monitoring methods are often limited in detecting anomalies that can lead to failures. They either rely on predefined thresholds or rules that serve as “alarm” or “trip” values, or they require human operators to manually review data and identify potential problems. These time-consuming and error-prone methods can eventually lead to unforeseen downtime and costly maintenance interventions.

VIBRATION ANALYSIS

Vibration analysis is a powerful tool in rotating machinery condition monitoring. Equipment vibrations are captured and processed with techniques such as the Fast Fourier Transform (FFT) to identify how the vibration is distributed over a range of frequencies related to specific faults. This approach allows early detection of faults such as imbalance, misalignment, and bearing wear, so that corrective actions can be implemented before catastrophic failures occur.
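As a rough illustration of the FFT step described above, the following Python sketch converts a time waveform into a single-sided amplitude spectrum. The signal is synthetic, and the sample rate, the 60 Hz running speed, and the function name `amplitude_spectrum` are assumptions made for the example rather than details from any plant system. The 1x running-speed component is the one commonly associated with imbalance, and a 2x component is often examined when misalignment is suspected.

```python
import numpy as np

def amplitude_spectrum(signal, sample_rate_hz):
    """Return (frequencies in Hz, single-sided peak-amplitude spectrum)."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate_hz)
    amps = 2.0 * np.abs(spectrum) / n   # fold negative frequencies into positive ones
    amps[0] /= 2.0                      # the DC bin has no negative-frequency twin
    if n % 2 == 0:
        amps[-1] /= 2.0                 # neither does the Nyquist bin
    return freqs, amps

# Synthetic waveform (all values assumed): a 1x component at 60 Hz
# plus a smaller 2x component at 120 Hz, sampled for one second.
sample_rate_hz = 2000.0
n_samples = 2000
t = np.arange(n_samples) / sample_rate_hz
signal = 1.0 * np.sin(2 * np.pi * 60.0 * t) + 0.4 * np.sin(2 * np.pi * 120.0 * t)

freqs, amps = amplitude_spectrum(signal, sample_rate_hz)
dominant_hz = freqs[np.argmax(amps)]
print(f"Dominant component: {dominant_hz:.1f} Hz")
```

An overall amplitude reading would collapse these two components into a single number, which is why the spectrum carries more diagnostic information than the amplitude alone.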

However, the vibrations displayed on the operators’ screens in processing plants are usually amplitude values. The vibration amplitude is useful for monitoring overall trends or setting alarm and shutdown setpoints; however, it lacks diagnostic depth, as crucial information remains obscured within the frequency domain [1]. Predictive maintenance (PdM) strategies have consequently been developed to incorporate periodic vibration routes, in which PdM technicians periodically collect vibration data using specialized in-depth analyzers.

HOW ARE ANOMALIES DETECTED?

Early detection of vibration anomalies significantly aids in avoiding catastrophic failures. Understanding the distinction between vibration amplitude and frequencies enables the detection of abnormal conditions through various methods:

  1. Predefined setpoints (amplitude-related): Specific values configured in the control system that, when reached, trigger alerts for the distributed control system (DCS) operator or immediately trip the machine through the emergency shutdown system.
  2. Trending (amplitude-related): Experienced operators, familiar with their equipment, can discern when vibrations are increasing beyond normal levels. Trending the vibration values is useful to monitor the overall condition.
  3. Periodic vibration routes (amplitude- and frequency-related): As part of a PdM strategy, vibration technicians periodically collect vibration data using specialized devices for analysis. These routes help monitor equipment health and detect anomalies before they escalate into critical failures.
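The first method in the list, predefined setpoints, can be sketched in a few lines of Python. The alarm and trip values and the readings below are hypothetical and chosen only for illustration; real setpoints are configured in the control and emergency shutdown systems.

```python
def classify_vibration(amplitude_mils, alarm_mils, trip_mils):
    """Classify one amplitude reading against fixed setpoints."""
    if amplitude_mils >= trip_mils:
        return "trip"
    if amplitude_mils >= alarm_mils:
        return "alarm"
    return "normal"

# Hypothetical setpoints and a slowly rising series of readings (mils).
ALARM_MILS = 2.5
TRIP_MILS = 3.6
readings = [1.0, 1.2, 1.5, 1.9, 2.4, 3.0, 3.7]

states = [classify_vibration(r, ALARM_MILS, TRIP_MILS) for r in readings]
print(states)
```

In this sketch the first five readings are all reported as "normal" even though the amplitude more than doubles, which is the blind spot that trending and vibration routes are meant to cover.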

CASE STUDIES

While these methods complement each other, drawing from both human observations and system interventions, there's always room for improvement. It is essential to thoroughly evaluate these methods to uncover enhancement opportunities. Below are two failure cases that shed light on how conventional anomaly detection methods failed.

Steam Turbine Bearing Failure

A single-stage steam turbine, driving a transfer pump, underwent an overhaul. Following the overhaul, baseline vibration readings on the inboard bearing were at 1.36 mils and 0.9 mils (X & Y probes) at full load, which are considered acceptable. Two months later, however, during warming up at 900 rpm, the turbine tripped on a high-high setpoint of 3.6 mils. Could this have been prevented?

FIGURE 1 shows the increasing linear vibration trend for this equipment. Considering the three detection methods mentioned earlier: the alarm and shutdown setpoints were not reached until the event, the operator did not observe the linear increase, and the failure happened between vibration routes (the next route was scheduled a month later). Such cases, albeit rare, tell us that a change in the monitoring strategy is needed to make it less human-dependent and less error-prone.
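A linear trend like the one described above can, in principle, be picked up automatically by fitting a line to recent readings and extrapolating to the setpoint. The Python sketch below does this with ordinary least squares. The 1.36 mil starting value and the 3.6 mil setpoint come from the case study, but the daily readings, the 0.03 mils/day growth rate, and the ripple are invented for illustration, and the sketch ignores the fact that the real readings were taken under different operating conditions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def day_setpoint_reached(slope, intercept, setpoint):
    """Extrapolate the fitted line to the setpoint; None if not rising."""
    if slope <= 0:
        return None
    return (setpoint - intercept) / slope

# Synthetic daily readings (assumed): linear growth plus a small ripple.
days = list(range(30))
readings = [1.36 + 0.03 * d + 0.05 * (-1) ** d for d in days]

slope, intercept = fit_line(days, readings)
eta_day = day_setpoint_reached(slope, intercept, setpoint=3.6)
print(f"Slope: {slope:.4f} mils/day, setpoint reached near day {eta_day:.0f}")
```

With these made-up numbers, one month of data is enough to project the crossing of the setpoint several weeks ahead, well before any alarm value would be reached.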

High-Cycle Impeller Failure of a Centrifugal Compressor

A second case study involves a centrifugal compressor failure [2]. Two metal pieces were liberated from the impeller trailing edge at the hub side, with consequential damage to the diffuser vanes. The root cause was high-cycle fatigue related to operation in choke. What did the vibration tell us?

FIGURE 2 shows that the vibration levels (amplitude) suddenly increased by a factor of 2 and, seven months later, by a factor of 3. The alarm setpoints were not reached, no operator observed the increase, and the changes were detected during a vibration route eight months after the first increase. Although the detection approach was somewhat proactive, since the change was found during a vibration route, it can be argued that it was late. Eight months of relatively high vibration without any observation or action raises concerns about the condition-monitoring strategy. Would detecting the increase earlier have helped? Perhaps. However, the key point is that failures may develop and go unnoticed, and as a result, they can worsen and cause prolonged downtime.
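A sudden step of this kind, where the level stays below the alarm setpoint, can be flagged by comparing a short rolling average against a baseline average. The Python sketch below is one simple way to do that. The window sizes, the 1.5x ratio, and the readings are assumptions for illustration and are not taken from the compressor data.

```python
def first_step_change(values, baseline_n, window_n, ratio_threshold):
    """Return the index of the first sample at which the rolling mean of the
    last `window_n` samples reaches `ratio_threshold` times the mean of the
    first `baseline_n` samples, or None if that never happens."""
    baseline = sum(values[:baseline_n]) / baseline_n
    for end in range(baseline_n + window_n, len(values) + 1):
        recent = values[end - window_n:end]
        if sum(recent) / window_n >= ratio_threshold * baseline:
            return end - 1
    return None

# Synthetic readings (assumed): the level doubles at index 30.
values = [0.5] * 30 + [1.0] * 30

detected_index = first_step_change(values, baseline_n=20, window_n=5,
                                   ratio_threshold=1.5)
print(f"Step flagged at sample {detected_index}")
```

Because the check is relative to the machine's own baseline rather than to a fixed alarm value, it responds within a few samples of the step instead of waiting for the next vibration route.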

AN IDENTIFIED GAP

These case studies demonstrate that conventional anomaly detection methods have potential issues that could jeopardize their effectiveness in identifying developing failures:

  1. Late detection: By the time a setpoint is reached, it might be too late, because permanent damage might already have occurred. In the first case study, the bearing had to be replaced and the bearing area sustained minor damage; in the second, the compressor impellers were damaged.
  2. Experience variation: When relying on operators to detect vibration anomalies, experience variation plays a big role. Inexperienced operators might not be familiar with the equipment’s history and trends.
  3. Manpower limitations: As the number of machines increases, collecting vibration data on vibration routes becomes tedious.

USING MACHINE LEARNING TO DETECT DEVELOPING FAILURES

With the widespread adoption of artificial intelligence, maintenance organizations have increasingly turned their attention to applying it in their fields. Numerous models have been proposed to achieve zero-defect operation [3, 4]. Machine learning can provide valuable insights into equipment status because of the huge volume of operational data at industrial plants [5]. It can address the limitations present in conventional condition-monitoring strategies, making them less error-prone and more autonomous.

A data-driven approach to PdM is illustrated in FIGURE 3 [6]. The methodology can be broadly divided into four stages. At its core, a regression model is used to create a "predicted" signal, which is then compared against real-time data. Alerts are triggered when the deviation between the real and predicted signals exceeds a certain setpoint.
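The predicted-versus-real comparison described above can be sketched as follows. The article does not say which regression model or input features were used, so this Python example assumes a plain linear least-squares model with two hypothetical features (speed and load), trained on synthetic "healthy" data. The 0.2 mil deviation setpoint and all function names are likewise assumptions.

```python
import numpy as np

def fit_linear_model(features, target):
    """Least-squares fit of target ~ intercept + features."""
    X = np.column_stack([np.ones(len(features)), features])
    coeffs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coeffs

def predict(coeffs, features):
    """Predicted vibration for the given operating conditions."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ coeffs

def deviation_alerts(actual, predicted, deviation_setpoint):
    """True wherever |actual - predicted| exceeds the deviation setpoint."""
    return np.abs(actual - predicted) > deviation_setpoint

rng = np.random.default_rng(0)

def simulate(n):
    """Synthetic healthy behavior: vibration depends on speed and load."""
    speed_rpm = rng.uniform(3000.0, 3600.0, n)
    load_pct = rng.uniform(50.0, 100.0, n)
    vibration = 0.2 + 1e-4 * speed_rpm + 4e-3 * load_pct + rng.normal(0.0, 0.01, n)
    return np.column_stack([speed_rpm, load_pct]), vibration

# Train on a healthy period, then score a later period in which a
# slowly growing fault is injected into the second half.
train_X, train_y = simulate(200)
live_X, live_y = simulate(100)
live_y[50:] += np.linspace(0.0, 0.5, 50)

coeffs = fit_linear_model(train_X, train_y)
predicted = predict(coeffs, live_X)
alerts = deviation_alerts(live_y, predicted, deviation_setpoint=0.2)
first_alert = int(np.argmax(alerts))   # index of the first True value
print(f"First alert at live sample {first_alert}")
```

The design point is that the alert depends on the gap between the measured value and what the model expects for the current operating conditions, not on the absolute amplitude, so a deviation can be flagged while the reading is still well below the alarm setpoint.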

Applying machine-learning algorithms to detect anomalies in Abqaiq Plants showed positive results. So far, real-time models have been developed for 100+ rotating machines. Below are some case studies.

Backtesting a Previous Trip

To validate its performance, the model was back-tested against a previous steam turbine trip that occurred on March 17. In FIGURE 4, the green trace represents the raw vibration signal as read by the sensor, and the pink trace is the predicted signal. From left to right, the signals matched at around 0.9 mils until Dec. 5, when the real signal deviated and an alert was generated. Three months later, the trip occurred. This case demonstrates that the developed model is capable of detecting vibration anomalies even when they are far below the alarm setpoint.

Detecting a Developing Anomaly at Early Stages

A developing abnormal condition was detected at an early stage. In FIGURE 5, the noisy signal is the real vibration signal, whereas the stable one is the predicted signal. From left to right, the predicted and real signals almost matched at 0.5 mils until preventive maintenance was conducted during a shutdown period. Following that, the real signal started deviating, reaching 1 mil, and an alert was generated on July 2. The area’s team was informed, and the issue was fixed during a planned shutdown window on Aug. 28. This shows the benefit of using machine learning to detect slight increases in vibration that are far below the alarm level, so that corrective action can be scheduled in a way that meets both operational needs and equipment health.

Detecting an Old Developing Anomaly

An old developing anomaly was also discovered. When the model was applied to this equipment, a shipper centrifugal pump, high deviations were found between the real and predicted signals. In FIGURE 6, the noisy signal is the raw data, whereas the stable one is the model prediction. The raw data show an increasing average over the year, indicating an actual issue, and the discrepancies between the predicted and real signals are worth investigating. Generally, such discrepancies can stem from the choice of model features or from physical issues with the asset.

Looking at the longer-term trend in FIGURE 7, the two signals matched until two years ago, when one probe experienced a step change, followed by a step change on the other probe. The vibrations then increased linearly, albeit slowly, and remained below the alarm level. Operations were informed and a suitable shutdown window was selected for inspection. Had this not been detected, a sudden trip could have caused unplanned downtime.

CONCLUSION

These case studies shed light on the limitations of traditional anomaly detection methods in rotating machinery condition monitoring. These methods may fall short of promptly detecting failures despite the integration of predefined setpoints, operator experience, and periodic vibration routes. This shortcoming could lead to prolonged downtime and operational disruptions.

The introduction of machine learning algorithms offers a promising solution to enhance PdM strategies and mitigate the shortcomings of conventional monitoring approaches. By leveraging real-time operational data and advanced analytics, machine learning models can accurately identify subtle deviations in vibration patterns, even below alarm thresholds, enabling early detection of developing anomalies.

AUTHOR: Abdullah Sofiany is an Associate Engineer of Reliability & Rotating Equipment at Abqaiq Plants, Aramco.

REFERENCES:

[1] Marwala, T. (2012). Condition monitoring using computational intelligence methods: applications in mechanical and electrical systems. Springer Science & Business Media.

[2] Moyroud, F., Alas, P., & Libeyre, F. (2018). Impeller High cycle fatigue failure on a Natural Gas Pipeline Compressor Following Choked Flow Operation. Turbomachinery Laboratory, Texas A&M Engineering Experiment Station. Available electronically from https://hdl.handle.net/1969.1/175075.

[3] Wang, K. (2016). Intelligent predictive maintenance (IPdM) system—Industry 4.0 scenario. WIT Transactions on Engineering Sciences, 113, 259-268.

[4] Zhou, D. H., Chen, M. Y., & Xu, Z. G. (2013). Reliability Prediction and Optimal Maintenance Technology. Univ. Sci. Technol. China Press.

[5] Costello, J. J. A., West, G. M., & Mcarthur, S. D. J. (2017). Machine learning model for event-based prognostics in gas circulator condition monitoring. IEEE Transactions on Reliability, 66(4), 1048-1057.

[6] Zhang, W., Yang, D., & Wang, H. (2019). Data-Driven Methods for Predictive Maintenance of Industrial Equipment: A Survey. IEEE Systems Journal, 13(3), 2213-2227. doi:10.1109/JSYST.2019.2905565.