Scene graph generation based on lightweight entity pair object detection and relation classification ensemble

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 637, p. 130130
Main Authors: Hu, Hong-Xiang; Yang, Xu-Hua; Zhao, Yu-Yong
Format: Journal Article
Language: English
Published: Elsevier B.V., 07.07.2025

Summary: Scene Graph Generation (SGG) aims to automatically generate a semantic graph structure that enables deeper understanding of and reasoning about visual scenes, and it is widely used in scenarios such as autonomous driving, virtual reality, and smart cities. Existing SGG models typically suffer from complex structures, large numbers of parameters, and difficulty of deployment; in addition, the bias in predicate prediction caused by the long-tail distribution of the training data degrades overall SGG performance. To address these limitations, we first propose a lightweight entity-pair object detection method that directly and efficiently decodes subject-object pairs that may hold predicate relations from images. Next, to mitigate the long-tail distribution, we propose a predicate relation classifier optimization method based on ensemble learning (PRCE): the original training set is used to generate multiple new training sets under different sampling rules, a distinct predicate relation classifier is trained on each, and the resulting classifiers are integrated for the final predicate prediction. Experiments on the Visual Genome (VG) and OpenImageV6 datasets show that the proposed model achieves very competitive performance compared with well-known methods. The code is available at: https://github.com/xhonghu/match-prce.
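
The ensemble idea behind PRCE, as described in the summary, can be illustrated with a small sketch: build several training sets from the original one under different sampling rules that counteract the long-tailed predicate distribution, train one classifier per set, and average their predictions. The sketch below only illustrates that general scheme under assumed inputs (pre-extracted entity-pair features and predicate labels) and an assumed classifier (scikit-learn's LogisticRegression); the sampling rules and all names here are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.linear_model import LogisticRegression

def resample_indices(labels, rule, rng):
    """Draw a resampled index set for one sampling rule (long-tail rebalancing)."""
    classes, counts = np.unique(labels, return_counts=True)
    if rule == "original":        # keep the natural long-tailed distribution
        weights = counts.astype(float)
    elif rule == "balanced":      # equal sampling probability per predicate class
        weights = np.ones_like(counts, dtype=float)
    elif rule == "sqrt":          # soften, but do not remove, the imbalance
        weights = np.sqrt(counts.astype(float))
    else:
        raise ValueError(f"unknown rule: {rule}")
    per_class = np.maximum((weights / weights.sum() * len(labels)).astype(int), 1)
    parts = [rng.choice(np.flatnonzero(labels == c), size=n, replace=True)
             for c, n in zip(classes, per_class)]
    return np.concatenate(parts)

def train_ensemble(features, labels, rules=("original", "balanced", "sqrt"), seed=0):
    """Train one predicate classifier per sampling rule on its own resampled set."""
    rng = np.random.default_rng(seed)
    models = []
    for rule in rules:
        idx = resample_indices(labels, rule, rng)
        models.append(LogisticRegression(max_iter=1000).fit(features[idx], labels[idx]))
    return models

def predict_ensemble(models, features):
    """Integrate the member classifiers by averaging their class probabilities."""
    probs = np.mean([m.predict_proba(features) for m in models], axis=0)
    return models[0].classes_[probs.argmax(axis=1)]

# Example usage with stand-in data (a heavily imbalanced two-class toy set):
#   X = np.random.randn(1000, 16)
#   y = np.r_[np.zeros(950, dtype=int), np.ones(50, dtype=int)]
#   preds = predict_ensemble(train_ensemble(X, y), X)

Averaging class probabilities is one simple way to integrate the member classifiers; the paper may combine them differently, and the sampling rules used here (original, class-balanced, square-root) are common long-tail rebalancing choices rather than the specific rules described in the article.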
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2025.130130