Modelling relations with prototypes for visual relation detection
Published in | Multimedia Tools and Applications, Vol. 80, No. 15, pp. 22465–22486 |
---|---|
Main Authors | |
Format | Journal Article |
Language | English |
Published | New York: Springer US, 01.06.2021 (Springer Nature B.V.; Springer Verlag) |
Summary: Relations between objects drive our understanding of images. Modelling them poses several challenges due to the combinatorial nature of the problem and the complex structure of natural language. This paper tackles the task of predicting relationships in the form of (subject, relation, object) triplets from still images. To address these challenges, the authors propose a framework for learning relation prototypes that aims to capture the complex nature of relation distributions. Concurrently, a network is trained to define a space in which relationship triplets with similar spatial layouts, interacting objects and relations are clustered together. Finally, the network is compared against two models that explicitly tackle synonymy among relations. Two well-known scene-graph labelling benchmarks are used for training and testing: VRD and Visual Genome. Predicting relations by distance to prototype significantly increases the diversity of predicted relations, improving the average relation recall from 40.3% to 41.7% on VRD and from 31.3% to 35.4% on Visual Genome.
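The summary describes classifying a relation by the distance from a learned triplet embedding to per-relation prototype vectors. The sketch below illustrates that nearest-prototype idea only; it is not the paper's implementation, and all names (`predict_relation`, `prototypes`) and the random tensors standing in for a trained network's outputs are hypothetical.

```python
import torch
import torch.nn.functional as F

def predict_relation(triplet_embedding: torch.Tensor,
                     prototypes: torch.Tensor) -> int:
    """Return the index of the relation whose prototype is nearest.

    triplet_embedding: (d,) embedding of one (subject, object) pair,
                       assumed to come from a trained network
    prototypes:        (num_relations, d) learned prototype vectors
    """
    # Euclidean distance from the embedding to every prototype;
    # torch.cdist expects 2-D inputs, hence the unsqueeze/squeeze.
    dists = torch.cdist(triplet_embedding.unsqueeze(0), prototypes).squeeze(0)
    return int(torch.argmin(dists))

# Toy usage: random tensors stand in for real model outputs.
d, num_relations = 128, 70          # 70 relation classes, as in the VRD benchmark
prototypes = F.normalize(torch.randn(num_relations, d), dim=1)
embedding = F.normalize(torch.randn(d), dim=0)
print(predict_relation(embedding, prototypes))
```

Predicting by distance to a prototype, rather than with a flat classifier, is what lets triplets near the same prototype share a relation label even when the ground-truth annotations use different synonymous predicates.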
ISSN: 1380-7501 (print); 1573-7721 (electronic)
DOI: 10.1007/s11042-020-09001-6