Concentrated Reasoning and Unified Reconstruction for Multi-Modal Media Manipulation

Bibliographic Details
Published in	ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8190-8194
Main Authors	Zhao, Weichen; Lu, Yuxing; Jiao, Ge; Yang, Yuan
Format	Conference Proceeding
Language	English
Published	IEEE 14.04.2024
Subjects	DeepFake Detection; Feature extraction; Grounding; Mask Signal Modeling; Media; Multi-Modal Media Manipulation; Signal processing; Technological innovation; Transformers; Visualization
Summary:	Detecting and Grounding Multi-Modal Media Manipulation (DGM⁴) is an emerging task that aims to identify and locate manipulated elements in both textual and visual media. Given the complexity of this task, the model requires more sophisticated reasoning capabilities to align multi-modal features and capture forgery traces. To this end, we propose a Concentrated reasoning and Unified reconstruction framework (CrUr) for DGM⁴. Instead of adhering to traditional hierarchical reasoning paradigms, we directly carry out all inference tasks using integrated multi-modal features. Specifically, we extract and align features at a finer granularity, capturing subtle differences that may indicate manipulation by leveraging advanced mask signal modeling. Moreover, to adapt to fine-grained reasoning tasks, we design a transformer-based Reconstruction Harmonizer to facilitate more complex interactions among the reconstructed features, ultimately obtaining integrated features. Experimental results on the DGM⁴ datasets show that our method achieves state-of-the-art performance.
ISSN:	2379-190X
DOI:	10.1109/ICASSP48485.2024.10447651