Appearance difference makes relationship: A new visual relationships inference mechanism

Bibliographic Details
Published in: IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC ...) (Online), Vol. 4, pp. 599-605
Main Authors: Yu, Nie; Siyu, Zhu; Guiping, Su; Yuxin, Guo
Format: Conference Proceeding
Language: English
Published: IEEE, 18.06.2021
Summary: To understand visual information better, a machine must move beyond object recognition to a higher level: understanding the relationships between objects. Despite recent advances driven by deep learning, detecting and grounding visual relationships remains a difficult task. In this work, we propose a relational attention model that incorporates appearance differences, aiming to mitigate the long-tail distribution problem of data-driven methods. The appearance difference highlights how an entity in the image differs from other entities of the same category, a distinction that determines the relationship. The model is a customized transformer-based structure that accounts for the influence of the subject, the object, and their co-occurrence on the relationship. Our representation-generation method, built on multi-head attention, effectively models relationships and addresses their multi-label nature. Compared with other state-of-the-art approaches, we achieve an absolute mean improvement in performance on the Visual Genome dataset.
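The summary describes fusing subject, object, and co-occurrence features with multi-head attention to form a relation representation. The paper's actual architecture is not reproduced here; the following is a minimal NumPy sketch of that general idea, in which the "appearance difference" is approximated as a subtraction of feature vectors and all projection matrices are random placeholders for learned parameters (the function names and dimensions are illustrative assumptions, not the authors' code):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(tokens, num_heads, rng):
    """One multi-head self-attention layer over a small token set.

    tokens: (T, d) array -- here T = 3 for [subject, object, difference].
    Projection weights are random stand-ins for learned parameters.
    """
    T, d = tokens.shape
    assert d % num_heads == 0
    dh = d // num_heads
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    # Split each projection into heads: (num_heads, T, dh).
    split = lambda M: M.reshape(T, num_heads, dh).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    # Scaled dot-product attention per head, then merge heads back to (T, d).
    attn = softmax(Qh @ Kh.transpose(0, 2, 1) / np.sqrt(dh), axis=-1)
    out = (attn @ Vh).transpose(1, 0, 2).reshape(T, d)
    return out @ Wo

def relation_representation(subj, obj, num_heads=4, seed=0):
    """Fuse subject, object, and an appearance-difference feature into
    a single relation vector via multi-head self-attention."""
    rng = np.random.default_rng(seed)
    diff = subj - obj                      # crude stand-in for appearance difference
    tokens = np.stack([subj, obj, diff])   # (3, d)
    fused = multi_head_attention(tokens, num_heads, rng)
    return fused.mean(axis=0)              # pooled relation representation

rng = np.random.default_rng(1)
rel = relation_representation(rng.standard_normal(32), rng.standard_normal(32))
print(rel.shape)  # (32,)
```

In a trained model the pooled vector would feed a multi-label predicate classifier (e.g. per-predicate sigmoids), which is one common way to handle the multi-label aspect the summary mentions.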
ISSN: 2693-2776
DOI: 10.1109/IMCEC51613.2021.9482321