A Guide to Fault Detection and Diagnosis.
Some fault detection and diagnosis is handled by creating procedures for making decisions based on the observed data. This is direct modeling of the decision process rather than modeling the system being diagnosed. If software products beyond basic programming languages are used, the portions implementing procedural reasoning are effectively limited forms of workflow engines.
Decision trees are an example of this. Although they contain no explicit model of the system being diagnosed, in practice the trees are often built from “mental models” in the heads of the application developers (or generated automatically from other models).
In many applications, these procedures require human input, where diagnosis is considered to be decision support rather than a fully automated process going directly from sensor data to conclusions. Their origin is often in operations manuals that included troubleshooting guides, which summarized years of experience or engineering guidance. They generally suffer from being incomplete, but because of their heritage, they usually find the most common problems and also document corrective actions. They often have been extensively reviewed by engineering, safety committees and others. They are easy to understand and explain, because they directly represent the processes followed by people.
Simply automating these procedures or (more commonly) providing computerized decision support to prompt manual actions is common, even if not part of an elegant theory of diagnosis. Procedural approaches are common in industries with large operations centers, as found in the process industries and in network & systems management. They are also a natural fit for call centers for customer support, where a workflow engine for implementation may already be in use.
People often think of procedures as being coded in computer languages, and hence hard to understand and visible only to programmers. But procedures can also be explicit and visible when put in graphical form, as is done for diagnostics in products like GDA that include a graphical language. Workflow engines also usually provide a graphical representation of procedures, so these visible representations have become common. Another example of fairly transparent procedures outside of diagnostics is project management software, which usually offers several variations in graphical presentation. In a sense a graphical model is built, but it models the procedure followed as a series of tasks, rather than the behavior of the monitored system.
Methods of representing procedures / workflows are useful as part of the overall fault management infrastructure, because tests, mitigating actions, and corrective actions often require a series of steps that must be managed, whether manually or automatically. But, for reasons discussed below, procedural approaches for diagnosis (like decision trees) should generally be avoided.
Comparisons with other techniques
One major risk with many procedural approaches is that they can be “brittle” - not flexible in handling unusual situations or configuration changes. A major problem is that if data is missing or unknown at the time, a decision tree or similar sequential procedure will get stuck. A more flexible model-based system that plans tests automatically will simply search for other data, reasoning with “unknown” values as well as possible. Similarly, conflicting data can be handled in many model-based approaches, but may not even be recognized by a procedural approach.
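The difference can be sketched in a few lines of Python. This is a minimal illustration, not taken from any product: the tree layout, the symptom names (`pressure_high`, `valve_open`) and the fault names are all hypothetical. A strict traversal stops at the first missing answer, while a tolerant traversal treats "unknown" as a value and follows both branches to collect every fault that is still possible.

```python
from typing import Optional, Union

# Hypothetical tree: an inner node is (question, yes_branch, no_branch);
# a leaf is a fault name. None of these names come from a real system.
Tree = Union[str, tuple]

TREE: Tree = ("pressure_high",
              ("valve_open", "pump_overspeed", "blocked_outlet"),
              ("valve_open", "sensor_drift", "no_fault"))


def strict_diagnose(tree: Tree, data: dict) -> Optional[str]:
    """Classic sequential traversal: returns None (stuck) when an answer is missing."""
    while not isinstance(tree, str):
        question, yes_branch, no_branch = tree
        answer = data.get(question)
        if answer is None:
            return None  # the procedure cannot continue past this node
        tree = yes_branch if answer else no_branch
    return tree


def tolerant_diagnose(tree: Tree, data: dict) -> set:
    """Treats a missing answer as 'unknown' and explores both branches."""
    if isinstance(tree, str):
        return {tree}
    question, yes_branch, no_branch = tree
    answer = data.get(question)
    if answer is None:
        return tolerant_diagnose(yes_branch, data) | tolerant_diagnose(no_branch, data)
    return tolerant_diagnose(yes_branch if answer else no_branch, data)


observed = {"valve_open": True}  # the pressure reading is unavailable
print(strict_diagnose(TREE, observed))            # None
print(sorted(tolerant_diagnose(TREE, observed)))  # ['pump_overspeed', 'sensor_drift']
```

The tolerant version is still a procedural tree, only a more forgiving one. A model-based system would go further, for example by planning which other test to run in order to separate the remaining candidates.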
Another problem is the sensitivity to measurement errors when decision points are based on numerical values. For instance, if a decision is based on x < 50, consider what happens when the measured value of x is close to 50. Depending on the standard deviation of the measurement of x, there is a significant probability that x is less than 50, and a significant probability that it is greater than 50. In reality, that variable shouldn’t be a major factor in the diagnosis when it is measured near 50: the “yes” and “no” paths are nearly equally likely, and both should be considered. But decision trees or similar procedural approaches will generally follow just a single path. This problem of forcing a crisp decision from approximate measurements near a threshold arises for other techniques as well, for instance, many rule-based systems that don’t allow for evidence combination or probability calculations.
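A worked example shows the size of the effect. Assuming, purely for illustration, that the measurement error is Gaussian with a known standard deviation, the probability that the true value lies below the threshold follows from the normal cumulative distribution function. The function name `prob_below` and the numbers used here are hypothetical.

```python
import math


def prob_below(measured: float, threshold: float, sigma: float) -> float:
    """P(true x < threshold), assuming Gaussian error with standard
    deviation sigma centred on the measured value (an assumption)."""
    z = (threshold - measured) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


for measured in (40.0, 49.0, 50.0, 51.0, 60.0):
    p = prob_below(measured, threshold=50.0, sigma=2.0)
    print(f"x = {measured:5.1f}  P(x < 50) = {p:.3f}")
```

With a standard deviation of 2, a reading of 49 gives roughly a 69% chance that the true value is below 50 and a 31% chance that it is not, yet a crisp test on x < 50 would commit entirely to the “yes” path. Far from the threshold (40 or 60) the probability is close to 1 or 0 and the crisp decision does no harm.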
Hard-coded systems using text-based computer languages should usually be avoided. The structure is neither visible to nor understandable by most of the actual users of the system. It may be difficult even for the application developers to see all the effects of their changes. Systems where the procedure is represented as a directed graph or other visual representation are much easier for the end users to understand.
Procedural approaches usually will not have estimates of the uncertainty of their results (unlike, say, Bayesian approaches or methods based on distances between observed symptom vectors and fault signatures).
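As a rough illustration of the distance-based alternative mentioned above, the sketch below compares an observed symptom vector with a stored signature for each fault and reports every distance, not only the winner. The signatures, the fault names and the choice of Euclidean distance are assumptions made for the example; a real application would choose its own metric and scaling.

```python
import math

# Hypothetical fault signatures: expected symptom values (1 = present, 0 = absent).
SIGNATURES = {
    "pump_failure": (1.0, 1.0, 0.0),
    "blocked_line": (1.0, 0.0, 1.0),
    "sensor_fault": (0.0, 0.0, 1.0),
}


def rank_faults(observed):
    """Return (fault, distance) pairs sorted from closest to farthest signature."""
    distances = {fault: math.dist(observed, signature)
                 for fault, signature in SIGNATURES.items()}
    return sorted(distances.items(), key=lambda item: item[1])


for fault, distance in rank_faults((0.9, 0.6, 0.5)):
    print(f"{fault:13s} distance = {distance:.3f}")
```

The gap between the two closest distances gives a crude indication of how clear-cut the result is. A decision tree that follows a single path produces no comparable figure.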
Procedural approaches based on queries of the event history and analysis of the relationships between objects in the monitored system can overcome many of these problems, because they can specify their tolerance for uncertainty in the queries, and will adapt to structure changes based on the object model of the monitored system.
From the standpoint of an application developer, one major objection to decision trees and especially hard-coded procedural approaches is that it is difficult to see the big picture, so that maintenance and changes over time become difficult. For instance, the entire decision tree or procedural code needs to be reexamined when adding new faults or symptoms/variables, to be sure that each diagnosis is still correct. This deficiency is shared with basic rule representation, unless extensive rule browsing capability is included. Tools based on logic diagrams may be easier to work with, if care is taken to represent each variable in only one place. Causal models emphasize the links between every variable and conclusion, so they can be easier to work with, depending on the quality of the browsers and visualization for models and rules.
Commercial products and applications
Tools such as GDA, and the OPAC language within the Gensym Integrity product line, provided a graphical language for procedural reasoning (among other approaches) for diagnostics. (GDA was limited to specific diagrams for each piece of monitored equipment, whereas OPAC allowed for generic representation for classes of equipment, relationships between objects, and generic event-based reasoning as well.)
For applications that mainly need simple procedures that can be shown in a tree format, products such as MicroMuse NetCool (used in network management) support procedural reasoning. NetCool has the advantage of a simple GUI using standard tree browsers like those found in PC operating systems.
Copyright 2010 - 2013, Greg Stanley