COKE: Causal Discovery with Chronological Order and Expert Knowledge in High Proportion of Missing Manufacturing Data
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 16.07.2024 |
Summary: | Understanding causal relationships between machines is crucial for fault
diagnosis and optimization in manufacturing processes. Real-world datasets
frequently exhibit up to 90% missing data and high dimensionality from hundreds
of sensors. These datasets also include domain-specific expert knowledge and
chronological order information, reflecting the recording order across
different machines, which is pivotal for discerning causal relationships within
the manufacturing data. However, previous methods for handling missing data in
scenarios akin to real-world conditions have not been able to effectively
utilize expert knowledge. Conversely, prior methods that can incorporate expert
knowledge struggle with datasets that exhibit missing values. Therefore, we
propose COKE to construct causal graphs in manufacturing datasets by leveraging
expert knowledge and chronological order among sensors without imputing missing
data. Utilizing the characteristics of the recipe, we maximize the use of
samples with missing values, derive embeddings from intersections with an
initial graph that incorporates expert knowledge and chronological order, and
create a sensor ordering graph. The graph-generating process is optimized with
an actor-critic architecture to obtain the final graph with the maximum
reward. Experimental evaluations across diverse settings of sensor counts and
missing proportions show that our approach improves the F1-score by 39.9% on
average over the benchmark methods. Moreover, the F1-score improvement can
reach 62.6% in configurations similar to real-world datasets, and 85.0% on
real-world semiconductor datasets. The
source code is available at https://github.com/OuTingYun/COKE. |
---|---|
DOI: | 10.48550/arxiv.2407.12254 |
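
The summary mentions an initial graph built from expert knowledge and chronological order, and an evaluation by F1-score. The sketch below illustrates one plausible reading of those two ideas and is not taken from the COKE source code. The function names (`initial_graph`, `edge_f1`), the sensor and machine names, and the rules that sensors on the same machine may influence each other and that expert-required edges override the ordering are all assumptions made for illustration.

```python
from itertools import product


def initial_graph(machine_of, machine_order, forbidden=(), required=()):
    """Return the set of candidate directed edges (cause, effect).

    Assumption: a sensor can only influence sensors on the same machine
    or on a machine that records later in the chronological order.
    Expert knowledge is modelled as forbidden and required edges.
    """
    rank = {machine: i for i, machine in enumerate(machine_order)}
    sensors = list(machine_of)
    candidates = {
        (a, b)
        for a, b in product(sensors, sensors)
        if a != b and rank[machine_of[a]] <= rank[machine_of[b]]
    }
    candidates -= set(forbidden)  # edges the expert rules out
    candidates |= set(required)   # edges the expert asserts (assumed to win)
    return candidates


def edge_f1(predicted, truth):
    """F1-score of predicted directed edges against ground-truth edges."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: two sensors on an "etch" machine, one on "deposit".
machine_of = {"s1": "etch", "s2": "etch", "s3": "deposit"}
graph = initial_graph(
    machine_of,
    machine_order=["etch", "deposit"],
    forbidden=[("s2", "s1")],
)
score = edge_f1(
    predicted=[("s1", "s3"), ("s1", "s2")],
    truth=[("s1", "s3"), ("s2", "s3")],
)
```

Here the chronological order removes every edge from `s3` back to the etch sensors, and the single forbidden edge removes `s2 -> s1`, so a search procedure would only need to consider the remaining three candidate edges.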