Object-Level Scene Context Prediction

Contextual information plays an important role in solving various image and scene understanding tasks. Prior works have focused on the extraction of contextual information from an image and use it to infer the properties of some object(s) in the image or understand the scene behind the image, e.g.,...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on pattern analysis and machine intelligence Vol. 44; no. 9; pp. 5280 - 5292
Main Authors Qiao, Xiaotian, Zheng, Quanlong, Cao, Ying, Lau, Rynson W.H.
Format Journal Article
LanguageEnglish
Published United States IEEE 01.09.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text
ISSN0162-8828
1939-3539
2160-9292
1939-3539
DOI10.1109/TPAMI.2021.3075676

Cover

More Information
Summary:Contextual information plays an important role in solving various image and scene understanding tasks. Prior works have focused on the extraction of contextual information from an image and use it to infer the properties of some object(s) in the image or understand the scene behind the image, e.g., context-based object detection, recognition and semantic segmentation. In this paper, we consider an inverse problem, i.e., how to hallucinate the missing contextual information from the properties of standalone objects. We refer to it as object-level scene context prediction. This problem is difficult, as it requires extensive knowledge of the complex and diverse relationships among objects in the scene. We propose a deep neural network, which takes as input the properties (i.e., category, shape, and position) of a few standalone objects to predict an object-level scene layout that compactly encodes the semantics and structure of the scene context where the given objects are. Quantitative experiments and user studies demonstrate that our model can generate more plausible scene contexts than the baselines. Our model also enables the synthesis of realistic scene images from partial scene layouts. Finally, we validate that our model internally learns useful features for scene recognition and fake scene detection.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:0162-8828
1939-3539
2160-9292
1939-3539
DOI:10.1109/TPAMI.2021.3075676