The remainder of this page summarizes the published CDG/SymCure technical papers, and provides links to them.
A Generic Fault Propagation Modeling Approach to On-Line Diagnosis and Event Correlation
The first published CDG technical paper, by Stanley and Vaidhyanathan, A Generic Fault Propagation Modeling Approach to On-Line Diagnosis and Event Correlation (pdf) or (HTML version), was originally presented at the 3rd IFAC Workshop on On-Line Fault Detection and Supervision in the Chemical Process Industries (IFAC is the International Federation of Automatic Control). That paper was expanded into a tutorial in the Integrity Reasoner White Paper (pdf). The abstract of the original paper is:
CDG (Causal Directed Graph) provides a methodology and framework for real-time fault management in large-scale systems, addressing the full life cycle of problem identification based on symptoms, diagnostic testing, and fault isolation, through recovery, as well as protecting the operator from “alarm flooding”. It is based on generic fault propagation models, tied to an object-oriented domain representation and scalable algorithms. CDG combines the generality of FMEA models with on-line, asynchronous event correlation and diagnosis. The architecture of CDG is described and the modeling approach is discussed with examples. Event correlation and interactive diagnosis using CDG is illustrated through a nitric acid cooling system example.
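As a rough illustration of the idea of a fault propagation model, the following minimal Python sketch represents causes and effects as a directed graph and correlates observed symptoms back to candidate root causes. It is not the CDG or SymCure algorithm: the node names, the graph, and the functions `downstream`, `root_nodes`, and `candidate_root_causes` are all hypothetical, invented only for this example, and the sketch ignores testing, timing, and recovery.

```python
from collections import deque

# Hypothetical cause -> effects edges; every node name here is invented
# for illustration and is not taken from the paper's example.
CAUSES = {
    "pump_failure": ["low_coolant_flow"],
    "valve_stuck_closed": ["low_coolant_flow"],
    "low_coolant_flow": ["high_outlet_temperature"],
    "fouled_exchanger": ["high_outlet_temperature"],
    "high_outlet_temperature": ["high_temperature_alarm"],
}


def downstream(graph, start):
    """Return every node reachable from start (its predicted effects)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for effect in graph.get(node, []):
            if effect not in seen:
                seen.add(effect)
                queue.append(effect)
    return seen


def root_nodes(graph):
    """Return nodes that appear as causes but never as effects."""
    effects = {e for targets in graph.values() for e in targets}
    return sorted(n for n in graph if n not in effects)


def candidate_root_causes(graph, observed):
    """Return root nodes whose predicted effects cover all observed symptoms."""
    observed = set(observed)
    return [r for r in root_nodes(graph) if observed <= downstream(graph, r)]


if __name__ == "__main__":
    # A single alarm is consistent with every root cause in this toy graph.
    print(candidate_root_causes(CAUSES, ["high_temperature_alarm"]))
    # Adding a second symptom narrows the candidates.
    print(candidate_root_causes(
        CAUSES, ["low_coolant_flow", "high_temperature_alarm"]))
```

In this toy graph, several symptoms collapse into a short list of candidate faults, which is the intuition behind correlating events rather than presenting each alarm to the operator separately.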
Applications for Abnormal Condition Management (ACM)
A white paper by Mark Allen of Gensym, Optegrity for Abnormal Condition Management (pdf), describes the application of the technology to Abnormal Condition Management in process control.
A technical paper by Noureldin and Roveta, Using Expert System and Object Technology for Abnormal Condition Management (pdf), was presented at the BIAS conference in Milano, Italy. The abstract for the paper is:
This paper discusses the problem of Abnormal Condition Management (ACM), defines the requirements for addressing this problem, and presents an application, developed using Gensym’s Optegrity platform, which provides generic objects for managing abnormalities on heaters. The goal of this application is to sustain operational performance and maintain continuous availability by detecting and resolving abnormal process conditions early – before they impact operations. The heater models developed for the first application can be easily reused and adapted to other heating devices by customizing the objects with graphical tools. The first application of these “generic heaters” has been installed in a refinery in the Middle East, and it is currently in the process of being deployed at other sites. A total of 80 preconfigured faults have been included for identifying the root cause of various heater problems. The application includes almost 240 messages that can be presented to operators for assisting with the diagnosis of problems and for providing guidance to quickly return to normal operation. As part of the justification of this project, a return on investment analysis was completed. The payback period was estimated to be in the range of 3 to 8 months, depending on the type of heater, the application of the heater and the existing operating conditions.
A white paper by the ARC Advisory Group, Abnormal Condition Management (pdf), summarizes the incentives for using tools like Optegrity for Abnormal Condition Management (ACM).
Applications at BMC for systems management
White papers on the application of this technology to systems management (Microsoft Exchange Servers and Windows 2000 Servers) can be found on the BMC white paper page.
Real world model-based fault management
The technical paper Real World Model-based Fault Management (pdf), by Kapadia, Stanley and Walker, was presented at the 18th International Workshop on the Principles of Diagnosis, Nashville, TN (2007). It summarizes more recent advances and experience with applications of the SymCure product. The abstract is:
Real world fault management applications encompass a number of diagnostic activities such as symptom monitoring, root cause analysis, impact prediction, testing, and recovery. They motivate powerful knowledge representation schemes to capture domain expertise and the development of intelligent algorithms that can exploit this knowledge. There are vast opportunities for the application of state-of-the-art fault management in commercial settings and, with billions of dollars at stake, industries are eager to embrace intelligent knowledge based solutions. Over the past decade, we have developed an object-oriented model-based domain-independent methodology for real world fault management, called SymCure. In this paper, we use this experience to generalize a set of requirements for real world fault management. We present an overview of the architecture and the modeling language of SymCure. We review a sample of projects where we have applied this approach, and share the motivations, challenges, successes and failures that have been our companions along this memorable journey.