Filtering Strategy and Tuning: Balancing Conflicting Needs
This page examines filtering strategy and tuning. A balance must be struck between noise rejection and speed of response, while still allowing detection of important transients and of symptoms that manifest as high-frequency noise. This is part of the section on Filtering, which is part of A Guide to Fault Detection and Diagnosis.
Noisy measurements interfere with fault detection and diagnosis. To a large extent, this must be addressed by filtering and smoothing the continuous variables over time, or by requiring a combination of evidence so that transient values in one variable do not affect the result. Filtering is usually needed on analog inputs to reduce sensitivity to noise, either as a separate function or as part of event detection based on historical data (as in SPC tests).
Even digital inputs may need to be filtered, especially if they are generated in some remote system based ultimately on noisy analog input. An example from electrical engineering is “debouncing”, almost always used for detecting pressing of keys and buttons. The actual signal can be quite noisy due to erratic contact with oxidized (insulated) areas. One time-based approach to debouncing is “latching”: the output of the latching filter is held constant for some period of time for either the “on” result, the “off” result, or both.
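The latching idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the class name LatchingFilter and the parameters hold_on and hold_off are hypothetical, and time is passed in explicitly in whatever units the caller uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatchingFilter:
    """Hold a boolean output for a minimum time after it changes (hypothetical sketch)."""
    hold_on: float = 0.0    # minimum time the output stays True once it switches on
    hold_off: float = 0.0   # minimum time the output stays False once it switches off
    output: bool = False
    _last_change: Optional[float] = None

    def update(self, raw: bool, t: float) -> bool:
        """Feed one raw sample taken at time t and return the latched output."""
        if raw != self.output:
            # The hold time that applies is the one for the state currently held.
            hold = self.hold_on if self.output else self.hold_off
            if self._last_change is None or t - self._last_change >= hold:
                self.output = raw
                self._last_change = t
        return self.output
```

With hold_on set longer than the expected contact bounce, the first "on" sample is passed through at once and the chatter that follows is ignored until the hold time has expired.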
Another approach is to introduce hysteresis in event detection, also called a “dead band”. As the input analog signal increases, the output transitions from false to true at a much higher value than the transition from true to false as the input signal drops. This is commonly available as part of the alarm functionality of process control systems. For electronic systems, including button debouncing, this can be accomplished in hardware using a “Schmitt trigger”. Hysteresis tuning is based on limits for the input values rather than directly on time.
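A dead band of this kind might look like the following sketch. The class name HysteresisDetector and the limits on_above and off_below are assumptions made for illustration, and are not taken from any particular control system.

```python
class HysteresisDetector:
    """Threshold detector with a dead band (hypothetical sketch)."""

    def __init__(self, on_above: float, off_below: float, state: bool = False):
        if off_below > on_above:
            raise ValueError("off_below must not exceed on_above")
        self.on_above = on_above    # output becomes True at or above this value
        self.off_below = off_below  # output becomes False at or below this value
        self.state = state

    def update(self, x: float) -> bool:
        """Feed one analog sample and return the event state."""
        if not self.state and x >= self.on_above:
            self.state = True
        elif self.state and x <= self.off_below:
            self.state = False
        # Inside the dead band the previous state is kept.
        return self.state
```

Tuning here means choosing the two limits: a wider gap between them rejects more noise, but the event then takes a larger excursion of the input to clear.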
Diagnostic results can be filtered as well. Conclusions such as a diagnosed fault are generally binary variables, and these can be filtered through latching. Configuring time delays for the transitions from off to on, and from on to off, involves a tradeoff between the time to detect and isolate the fault and the time to detect recovery from the fault.
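One possible reading of these transition delays is a persistence check: the raw conclusion must disagree with the current output continuously for a configured time before the output follows it. The sketch below assumes that interpretation. The names DelayedConclusion, on_delay, and off_delay are hypothetical.

```python
class DelayedConclusion:
    """Apply separate on and off delays to a binary diagnostic result (hypothetical sketch)."""

    def __init__(self, on_delay: float, off_delay: float):
        self.on_delay = on_delay      # time a fault must persist before it is reported
        self.off_delay = off_delay    # time a fault must be absent before it is cleared
        self.output = False
        self._pending_since = None    # time at which raw first disagreed with output

    def update(self, raw: bool, t: float) -> bool:
        """Feed the raw conclusion at time t and return the filtered conclusion."""
        if raw == self.output:
            self._pending_since = None          # disagreement ended, restart the timer
            return self.output
        if self._pending_since is None:
            self._pending_since = t
        delay = self.on_delay if raw else self.off_delay
        if t - self._pending_since >= delay:
            self.output = raw
            self._pending_since = None
        return self.output
```

A long on_delay suppresses false alarms at the cost of slower detection, while a long off_delay keeps an intermittent fault visible at the cost of slower recognition of recovery.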
Excessive filtering slows the recognition and isolation of faults, and of recovery from them. Too little filtering results in conclusions that change too rapidly and are sensitive to noise. A separate white paper will detail the special needs for filtering for fault detection and diagnosis.
Any particular input signal may need more than one filter associated with it. One reason is that different faults are associated with different time scales. The symptoms for detecting some problems are based on the presence of high-frequency noise. For example, flashing (turbulence caused by generation of vapor bubbles in a liquid stream) at a pump input or across the orifice plate of a flow sensor can be a primary symptom for detecting problems with incorrect chemical compositions, partial pipe blockage, or low pressure. The diagnostic system needs to look at essentially unfiltered data to see this high-frequency noise: a high standard deviation of an unfiltered flow or pressure signal becomes the symptom. Heavy filtering (at fast sample rates) would destroy the ability to even see that noise. But another failure mode detected with the same flow or pressure sensor could be poor controller tuning. For that analysis, the high-frequency noise is not relevant, and the same signal filtered with a first-order filter with a time constant of, say, 2 to 10 seconds is appropriate. Heavier filtering, or averaging over, say, 10 minutes, would destroy the ability to see that fault. Other failure modes on large downstream equipment could be detected and diagnosed with much heavier filtering, and would benefit from the noise reduction.
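The two views of one signal described above can be sketched as follows: a rolling standard deviation of the raw samples for the noise symptom, and a first-order filter for the slower analysis. The class names, the window length, and the discretization of the first-order filter (an exponential filter with alpha = 1 - exp(-dt/tau)) are assumptions chosen for illustration.

```python
import math
from collections import deque
from statistics import pstdev


class FirstOrderFilter:
    """Discrete first-order (exponential) filter with time constant tau (hypothetical sketch)."""

    def __init__(self, tau: float, dt: float):
        if tau <= 0 or dt <= 0:
            raise ValueError("tau and dt must be positive")
        self.alpha = 1.0 - math.exp(-dt / tau)  # weight given to each new sample
        self.y = None

    def update(self, x: float) -> float:
        # The first sample initializes the filter; later samples move it toward x.
        self.y = x if self.y is None else self.y + self.alpha * (x - self.y)
        return self.y


class RollingStd:
    """Standard deviation of the most recent raw samples (hypothetical sketch)."""

    def __init__(self, window: int):
        self.buf = deque(maxlen=window)

    def update(self, x: float) -> float:
        self.buf.append(x)
        return pstdev(self.buf)
```

Both objects would be fed the same raw sample on every scan. The rolling standard deviation serves the noise-based symptom, while a filter with tau in the 2 to 10 second range serves the controller-tuning analysis.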
Heavy filtering also reduces sensitivity to “spikes” (occasional rapid, high-magnitude changes in value). Specialized spike filtering is often a part of digital filtering as well: for example, ignoring the first rapid change and accepting a radical change in value only after it has been held for several sample cycles, and so represents a true step change. This benefits most subsequent analysis.
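A spike filter of the kind described might be sketched as below. The names SpikeFilter, max_step, and confirm_samples are hypothetical, and the rule used to decide that a new level "has been held" (successive samples staying within max_step of the first sample at the new level) is an assumption.

```python
class SpikeFilter:
    """Reject isolated spikes but accept a sustained step change (hypothetical sketch)."""

    def __init__(self, max_step: float, confirm_samples: int):
        self.max_step = max_step                # largest change accepted immediately
        self.confirm_samples = confirm_samples  # samples a new level must be held
        self.accepted = None                    # last accepted value
        self._candidate = None                  # first sample seen at a possible new level
        self._count = 0

    def update(self, x: float) -> float:
        if self.accepted is None or abs(x - self.accepted) <= self.max_step:
            # Ordinary change: accept it and forget any pending candidate.
            self.accepted, self._candidate, self._count = x, None, 0
            return self.accepted
        if self._candidate is None or abs(x - self._candidate) > self.max_step:
            self._candidate, self._count = x, 1  # start counting at a new level
        else:
            self._count += 1                     # the new level is being held
        if self._count >= self.confirm_samples:
            # Held long enough: treat it as a true step change.
            self.accepted, self._candidate, self._count = x, None, 0
        return self.accepted
```

A single outlier leaves the output at the last accepted value, while a genuine step appears at the output after confirm_samples scans.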
However, the presence of spikes is a symptom needed for some fault detection and diagnosis. Spikes can be a symptom of faults such as loose connections, electrical interference, or power supply problems as equipment is started up or shut down. They also occur when pumps repeatedly trip and automatically restart due to certain faults. This information would be lost in the presence of heavy filtering or spike filtering. Again, different filters for the same sensor are needed for detecting and diagnosing different faults. But care must be taken in this case with quantitative techniques that assume independent sensor errors, because the outputs of these different filters are derived from the same measurement and have serial correlation (correlation over time).
Some diagnosis is based on the rate of change of variables rather than the absolute values. For instance, in many process plants, the average values of variables change with operating mode, feedstock, optimization schemes, or other causes, so setting absolute thresholds for event detection is difficult. People may find it easiest to say that “an increase in an input value X should result in a change of an output value Y” for normal operation or in describing a particular fault.
The rate of change information is lost with heavy filtering. When calculating rate of change, the raw data should generally be used, preferably over multiple previous data points to achieve some high-frequency noise filtering. But if filtered values are already available at different time scales, the filter outputs can be compared as an indicator of variable change. If a variable feeds filters with light and heavy filtering, the heavily filtered value can be subtracted from the lightly filtered value. A positive result indicates that the variable has been increasing over a time period, while a negative result indicates that it has been decreasing. This is a technique borrowed from, of all places, stock market trading. The difference of the filter results (with some additional filtering) is called the MACD indicator, for “Moving Average Convergence-Divergence”.
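Both approaches can be sketched briefly: a least-squares slope over several raw points, and the difference between a lightly and a heavily filtered copy of the same variable. The function names and the default filter weights are hypothetical, and exponential filters are assumed for the light and heavy filters. The extra smoothing used in the full MACD indicator is omitted.

```python
def least_squares_slope(values, dt=1.0):
    """Rate of change from a straight-line fit to equally spaced raw samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) * dt / 2.0
    x_mean = sum(values) / n
    num = sum((i * dt - t_mean) * (x - x_mean) for i, x in enumerate(values))
    den = sum((i * dt - t_mean) ** 2 for i in range(n))
    return num / den


def exponential_filter(values, alpha):
    """Exponential filter; a larger alpha means lighter filtering."""
    out, y = [], None
    for x in values:
        y = x if y is None else y + alpha * (x - y)
        out.append(y)
    return out


def filter_difference(values, alpha_light=0.3, alpha_heavy=0.05):
    """Lightly filtered value minus heavily filtered value at each sample."""
    light = exponential_filter(values, alpha_light)
    heavy = exponential_filter(values, alpha_heavy)
    return [lo - hi for lo, hi in zip(light, heavy)]
```

For a steadily rising input the heavily filtered value lags further behind than the lightly filtered one, so the difference is positive. For a falling input it is negative, and for a constant input it is zero.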
Filters provide a mechanism for “synchronizing” data in otherwise static models, as discussed in the section on causal modeling. However, it must also be recognized that filters and threshold values both introduce timing errors. Where a variable is filtered and then thresholded to generate an event, heavy filtering introduces additional time lag after a failure before the filter output finally crosses the threshold to generate the event. Having a threshold far outside of normal operation has the same effect: it delays the generation of a symptom.
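The size of this lag can be estimated for one simple case. Assume, purely for illustration, a first-order filter with time constant \tau, an input that steps from x_0 to x_1 at t = 0, and a threshold T between the two values. The symbols are introduced here and are not part of the original text. The filter output y(t) and the detection delay t_d are then:

```latex
y(t) = x_1 - (x_1 - x_0)\, e^{-t/\tau},
\qquad
t_d = \tau \ln\!\left( \frac{x_1 - x_0}{x_1 - T} \right),
\quad x_0 < T < x_1
```

As a worked example under these assumptions, with \tau = 10 s, x_0 = 50, and x_1 = 100, a threshold at T = 60 gives t_d = 10 ln(50/40), about 2.2 s, while a threshold at T = 90 gives t_d = 10 ln(50/10), about 16.1 s. Doubling \tau doubles either delay, so the filter and the threshold contribute to the lag in the way the paragraph above describes.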
Copyright 2010-2020, Greg Stanley