Time Delays and Lags in Causal Models
This page examines time delays and lags in causal models. It is part of the section on Causal Models, within the section on Model Based Reasoning, in the white paper A Guide to Fault Detection and Diagnosis.
Causality in reality implies time delays and lags
In physical systems, causality always involves some time delay or lag between cause and effect. This has to happen because mass or energy has to move, overcoming inertia, thermal inertia, inductance, or other physical forms of resistance.
The terminology of delay vs. lag comes from control theory. A signal passing through a pure time delay is shifted by a fixed time period, with no change in shape between the input signal and the output signal. An example of a pure time delay occurs when monitoring the temperature or composition of a liquid flowing in a pipe: it takes time for a unit of fluid to travel from one point to another. With a lag, the response starts right away, but the complete response is spread out over a time period, so the output signal has a different shape from the input. Examples of this include the pressure of gas in a sphere in response to input flow changes, or the voltage across a circuit with a parallel resistor and capacitor.
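The distinction can be illustrated with a small discrete-time sketch. It assumes a fixed sampling interval and approximates the lag with a simple first-order filter; the function names, the step input, and the parameter values are illustrative choices, not taken from any particular control package.

```python
from collections import deque

def pure_delay(signal, delay_steps, initial=0.0):
    """Shift the signal by delay_steps samples; its shape is unchanged."""
    buffer = deque([initial] * delay_steps)
    out = []
    for x in signal:
        buffer.append(x)
        out.append(buffer.popleft())
    return out

def first_order_lag(signal, time_constant, dt=1.0, initial=0.0):
    """Discrete first-order lag: each step, y moves part of the way toward x."""
    alpha = dt / (time_constant + dt)  # assumed backward-Euler discretization
    y = initial
    out = []
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

# Step input: 0 for two samples, then 1.
step = [0.0] * 2 + [1.0] * 8

delayed = pure_delay(step, delay_steps=3)         # same step, 3 samples later
lagged = first_order_lag(step, time_constant=3.0)  # starts at once, rises gradually
```

In this sketch the delayed output stays at zero until three samples after the step and then jumps to the full value, while the lagged output starts moving at the same sample as the input but has not reached the full value by the end of the window.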
Time delays and lags are also introduced in the processing of data. For instance, most data is filtered to remove noise. A control system includes filters that introduce lags. Sampling periods introduce time delays. Data processing intervals introduce additional delays. Averaging filters such as moving averages, or averages over predefined times like hours, shifts, days or months, introduce significant lags. So, for instance, the delays and lags between low-level plant problems and high-level KPIs (Key Performance Indicators) can be very large.
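As a rough illustration of how much lag an averaging filter adds, the sketch below applies a trailing moving average to a steadily rising signal. The window length and the ramp input are assumptions chosen to make the arithmetic easy to follow.

```python
def moving_average(signal, window):
    """Trailing moving average; uses a shorter window until enough samples exist."""
    out = []
    for k in range(len(signal)):
        start = max(0, k - window + 1)
        chunk = signal[start:k + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# A ramp that rises by one unit per sample.
ramp = [float(k) for k in range(20)]
avg = moving_average(ramp, window=5)

# Once the window is full, the average trails the ramp by (window - 1) / 2 samples.
lag_in_samples = ramp[-1] - avg[-1]  # 2.0 for a 5-sample window
```

The same reasoning scales up: an average over an hour, a shift, or a month trails a steadily changing input by roughly half the averaging period.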
A basic cause/effect model (such as a fault propagation model as described earlier) is a static model, where time delays and lags are not explicitly shown. Ignoring the dynamics is often a convenient approximation. That approximation is most appropriate when the interval for repeating an analysis is much longer than the time delays in the monitored system.
An alternative is to use filtering or delayed values to “synchronize” the variables, so that part of the model is in effect hidden inside the filtering or delays. In the example at the beginning of the causal model section, if a 10-minute delay is expected between a change in C1 and a change in E, and E is true, in theory we are comparing the value of C1 delayed by 10 minutes vs. the current value of E when doing diagnosis. In practice, as long as a problem at C1 is still present, we will still arrive at the correct diagnosis even using the current values. We will have an error, though, when C1 returns to normal (“false”), because E will remain true for roughly another 10 minutes with no currently active cause to explain it. If the main concern is initial diagnosis rather than pinpointing recovery, this may be acceptable.
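The effect of synchronizing can be sketched with two sampled true/false series. The sketch assumes one sample per minute, a pure 10-minute delay from C1 to E, and an arbitrary fault window; the helper `delayed_value` is hypothetical.

```python
def delayed_value(history, t, delay):
    """Value of a sampled series `delay` samples before t (False before the start)."""
    idx = t - delay
    return history[idx] if idx >= 0 else False

DELAY_MIN = 10   # assumed C1 -> E delay, in one-minute samples
N = 40

# Assumed scenario: the problem at C1 is present from minute 5 up to minute 20.
c1 = [5 <= t < 20 for t in range(N)]
# E simply follows C1 after the delay.
e = [delayed_value(c1, t, DELAY_MIN) for t in range(N)]

# For every minute where E is true, is the cause C1 seen as true?
explained_current = [c1[t] for t in range(N) if e[t]]
explained_synced = [delayed_value(c1, t, DELAY_MIN) for t in range(N) if e[t]]

missed_with_current = explained_current.count(False)  # minutes after C1 recovers
missed_with_synced = explained_synced.count(False)
```

With current values, the diagnosis is correct while the problem at C1 persists and fails only in the minutes after C1 has recovered but E has not; with the delayed value of C1, the two series line up throughout. Real delays are not this exact, which is why the approximation with current values is often accepted.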
It is rarely worth a lot of effort to try to accurately synchronize the variables in time. One reason is that the time delays and lags vary with operating conditions. Although they may be in large part inversely proportional to flow rates in the process industries, they still may be difficult to estimate exactly. Another reason is that for events based ultimately on continuous variables, you might not have complete control over the sampling period, filtering, or averaging of those variables. The sampling, filtering or averaging introduces delays or lags.
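The dependence on flow rate can be made concrete for the simplest case. The sketch below assumes ideal plug flow through a pipe of known volume, so the transport delay is volume divided by volumetric flow; the numbers are made up for illustration, and real equipment with mixing or varying holdup will deviate from this.

```python
def transport_delay_s(volume_m3, flow_m3_per_s):
    """Plug-flow transport delay in seconds (an idealized assumption)."""
    if flow_m3_per_s <= 0:
        raise ValueError("flow must be positive")
    return volume_m3 / flow_m3_per_s

# Hypothetical pipe section holding 2 cubic meters of liquid.
delay_low_flow = transport_delay_s(2.0, 0.01)   # about 200 s
delay_high_flow = transport_delay_s(2.0, 0.02)  # doubling the flow halves the delay
```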
One purpose of this section is to serve as a reminder that there will be transient errors in diagnosis due to timing effects. Qualitative systems are sensitive to this when the underlying system is mainly based on continuous variables. The conversion from raw data to a crisp true/false value depends on the amount of filtering and on the threshold settings. These transient errors are difficult to avoid, because the modeling of timing effects will not be perfect. The problem is partly mitigated when using fuzzy or probabilistic approaches with values that range from 0 through 1, rather than pure binary values. But the problem does not go away completely. Instead, consider latching the conclusions presented to the users.
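One way to picture latching is sketched below. The class name, the threshold, the sample readings, and the explicit reset are all assumptions for illustration; the text does not prescribe a particular latching or reset policy.

```python
class LatchedConclusion:
    """Holds a conclusion as true once it has been seen, until explicitly reset."""

    def __init__(self):
        self.latched = False

    def update(self, raw_conclusion):
        # Once true, stay true even if the raw conclusion flickers back to false.
        self.latched = self.latched or raw_conclusion
        return self.latched

    def reset(self):
        # Assumed policy: an operator or a recovery check clears the latch.
        self.latched = False

THRESHOLD = 0.5  # hypothetical threshold on a filtered measurement
readings = [0.2, 0.6, 0.4, 0.7, 0.3, 0.2]

raw = [r > THRESHOLD for r in readings]          # chatters around the threshold
latch = LatchedConclusion()
shown_to_user = [latch.update(r) for r in raw]   # stays true after first detection
```

Here the raw conclusion switches on and off as the reading crosses the threshold, while the latched conclusion shown to the user turns on once and stays on until someone resets it.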
The time lags introduced by filtering and threshold selection for conversion to binary form can also make diagnosis based on a sequence of events problematic. The filtering, and the initial values before a failure, can change the order in which the diagnostic system sees the events. Unless the sequence occurs on a much slower time scale than the event filtering, it is better instead to query for related events over a time period without specifying the exact order. (See the section on diagnosis by event query).
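An order-independent query of this kind might look like the following sketch. The event log layout (timestamp, name), the event names, and the function name are hypothetical; this is not the query mechanism described in the section on diagnosis by event query.

```python
def related_events_present(event_log, required, window):
    """True if all required events occur within `window` time units, in any order."""
    required = set(required)
    # Keep only the events of interest, sorted by timestamp.
    events = sorted(ev for ev in event_log if ev[1] in required)
    for i, (start, _) in enumerate(events):
        seen = {name for t, name in events[i:] if t - start <= window}
        if seen >= required:
            return True
    return False

# Hypothetical log of (time in seconds, event name).
log = [(100, "high_pressure"), (103, "low_flow"), (250, "high_temp")]

match = related_events_present(log, {"low_flow", "high_pressure"}, window=5)
```

Because the check only asks whether the events fall within the same window, it gives the same answer whether filtering made "low_flow" appear before or after "high_pressure".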
A good operations support system will use causal models to predict an upcoming problem in downstream effects as soon as any problems upstream are recognized. That way, the system will be “proactive” rather than just “reactive”. This is especially important when one overall model spans very short time spans (such as low-level plant equipment or operating problems) through very long time spans (such as weekly KPIs). The predicted impact can help prioritize problem solving. You want to take action as soon as diagnosis of a low-level failure is complete, not wait until a monthly KPI shows high energy usage.
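A minimal sketch of such a prediction is shown below, assuming the causal model is available as a directed graph with a nominal delay on each cause/effect link. The graph, node names, and delays (in minutes) are invented for illustration, and taking the earliest arrival time over all paths is one possible convention, not a method stated in this guide.

```python
import heapq

def predict_downstream(graph, source, t_detected=0.0):
    """Earliest expected time at which each downstream effect could appear."""
    eta = {source: t_detected}
    queue = [(t_detected, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > eta.get(node, float("inf")):
            continue  # a shorter path to this node was already processed
        for effect, delay in graph.get(node, []):
            t_new = t + delay
            if t_new < eta.get(effect, float("inf")):
                eta[effect] = t_new
                heapq.heappush(queue, (t_new, effect))
    del eta[source]
    return eta

# Hypothetical causal graph: cause -> [(effect, nominal delay in minutes)].
graph = {
    "pump_fouling": [("low_flow", 5), ("high_hourly_energy", 60)],
    "low_flow": [("off_spec_product", 30)],
    "high_hourly_energy": [("high_monthly_energy_kpi", 43200)],
}

predictions = predict_downstream(graph, "pump_fouling")
```

In this made-up example, the product quality problem is predicted about half an hour after the fouling is diagnosed, while the monthly KPI would not reflect it for weeks, which is the argument for acting on the low-level diagnosis.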
Copyright 2010 - 2013, Greg Stanley