Visual Commonsense Causal Reasoning From a Still Image

Bibliographic Details
Published in: IEEE Access, Vol. 13, pp. 85084-85097
Main Authors: Wu, Xiaojing; Guo, Rui; Li, Qin; Zhu, Ning
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
Summary: Even from a still image, humans exhibit the ability to ratiocinate diverse visual cause-and-effect relationships of events preceding, succeeding, and extending beyond the given image scope. Previous work on commonsense causal reasoning (CCR) aimed at understanding general causal dependencies among common events in natural language descriptions. However, in real-world scenarios, CCR is fundamentally a multisensory task and is more susceptible to spurious correlations, given that commonsense causal relationships manifest in various modalities and involve multiple sources of confounders. In this work, to the best of our knowledge, we present the first comprehensive study focusing on visual commonsense causal reasoning (VCCR) within the potential outcomes framework. By drawing parallels between vision-language data and human subjects in the observational study, we tailor a foundational framework, VCC-Reasoner, for detecting implicit visual commonsense causation. It combines inverse propensity score weighting and outcome regression, offering dual robust estimates of the average treatment effect. Empirical evidence underscores the efficacy and superiority of VCC-Reasoner, showcasing its outstanding VCCR capabilities.
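For readers unfamiliar with the estimation strategy named in the abstract, the sketch below illustrates the general idea of combining inverse propensity score weighting with outcome regression to obtain a doubly robust estimate of the average treatment effect (the AIPW estimator). It is a minimal, generic illustration on synthetic tabular data; the function name, models, and data are assumptions for exposition and do not reproduce the VCC-Reasoner implementation described in the paper.

```python
# Minimal AIPW (doubly robust) ATE sketch: combines inverse propensity score
# weighting with outcome regression, so the estimate remains consistent if
# either the propensity model or the outcome model is correctly specified.
# Illustrative only; not the paper's VCC-Reasoner implementation.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression


def doubly_robust_ate(X, t, y):
    """Estimate the average treatment effect from covariates X,
    binary treatment t, and outcome y using augmented IPW."""
    # Propensity model: P(T = 1 | X), clipped to avoid extreme weights
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    e = np.clip(e, 0.01, 0.99)

    # Outcome regression models fit separately on treated and control units
    mu1 = LinearRegression().fit(X[t == 1], y[t == 1]).predict(X)
    mu0 = LinearRegression().fit(X[t == 0], y[t == 0]).predict(X)

    # AIPW estimator: outcome-regression prediction plus an
    # inverse-propensity-weighted residual correction
    return np.mean(
        mu1 - mu0
        + t * (y - mu1) / e
        - (1 - t) * (y - mu0) / (1 - e)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5000
    X = rng.normal(size=(n, 3))
    propensity = 1 / (1 + np.exp(-X[:, 0]))  # confounded treatment assignment
    t = rng.binomial(1, propensity)
    y = 2.0 * t + X @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=n)
    print(f"Estimated ATE: {doubly_robust_ate(X, t, y):.3f} (true effect = 2.0)")
```

The appeal of this construction, and presumably the reason the authors adopt a dual estimator, is its double robustness: the residual-correction term cancels bias from a misspecified outcome model whenever the propensity model is correct, and vice versa.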
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2025.3558429