Causal Knowledge in Data Fusion Subject to Latent Confounding and Measurement Error
Published in | IEEE/SICE/RSJ International Conference on Multisensor Fusion and Integration for Intelligent Systems pp. 1 - 8 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 04.09.2024 |
Summary: | Data fusion integrates data from multiple sources to produce more accurate and reliable information. In real-world scenarios, data are often subject to latent confounding and measurement error. In this paper, we evaluate fusion strategies that incorporate different levels of causal knowledge for quality prediction under varied conditions of latent confounding and measurement error. We show that the machine learning-based fusion strategy achieves the best prediction quality when data are independent and identically distributed (i.i.d.). In the presence of latent confounding, however, the causality-based fusion strategy makes prediction models more robust against severe distribution shifts. Measurement error in the data also affects the out-of-distribution (OOD) generalizability of prediction models. If causal knowledge must be inferred from data with causal discovery methods, we demonstrate that measurement error can impair causal discovery. We advise caution when using standard causal discovery methods if the circumstances under which the data were generated are unknown. |
---|---|
ISSN: | 2767-9357 |
DOI: | 10.1109/MFI62651.2024.10705789 |