Decomposed Prototype Learning for Few-Shot Scene Graph Generation
Main Authors | , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 20.03.2023 |
Subjects | |
DOI | 10.48550/arxiv.2303.10863 |
Summary: | Today's scene graph generation (SGG) models typically require abundant manual
annotations to learn new predicate types. Therefore, it is difficult to apply
them to real-world applications with massive uncommon predicate categories
whose annotations are hard to collect. In this paper, we focus on Few-Shot SGG
(FSSGG), which encourages SGG models to be able to quickly transfer previous
knowledge and recognize unseen predicates well with only a few examples.
However, current methods for FSSGG are hindered by the high intra-class
variance of predicate categories in SGG: On one hand, each predicate category
commonly has multiple semantic meanings under different contexts. On the other
hand, the visual appearance of relation triplets with the same predicate
differs greatly under different subject-object compositions. This high
variance of inputs makes it hard to learn a generalizable representation for each
predicate category with current few-shot learning (FSL) methods. However, we
found that this intra-class variance of predicates is highly related to the
composed subjects and objects. To model the intra-class variance of predicates
with subject-object context, we propose a novel Decomposed Prototype Learning
(DPL) model for FSSGG. Specifically, we first construct a decomposable
prototype space to capture diverse semantics and visual patterns of subjects
and objects for predicates by decomposing them into multiple prototypes.
Afterwards, we integrate these prototypes with different weights to generate
a query-adaptive predicate representation with more reliable semantics for each
query sample. We conduct extensive experiments and compare with various
baseline methods to show the effectiveness of our method. |
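
The summary describes combining several prototypes per predicate, with different weights, into a query-adaptive representation. The sketch below illustrates that general idea only. The softmax attention over dot products, the `temperature` parameter, the cosine-similarity scoring, and all function names are assumptions made for illustration; the record does not specify how DPL computes its weights or scores.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot(a, b) / (na * nb)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_adaptive_representation(query, prototypes, temperature=1.0):
    """Weight one predicate's prototypes by their similarity to the query.

    Assumption: weights come from a softmax over query-prototype dot
    products. This is a stand-in, not the weighting used in the paper.
    """
    weights = softmax([dot(query, p) / temperature for p in prototypes])
    rep = [
        sum(w * p[i] for w, p in zip(weights, prototypes))
        for i in range(len(query))
    ]
    return rep, weights


def classify(query, prototype_space, temperature=1.0):
    """Pick the predicate whose query-adaptive representation best matches.

    Assumption: matching is scored with cosine similarity.
    """
    scores = {}
    for predicate, prototypes in prototype_space.items():
        rep, _ = query_adaptive_representation(query, prototypes, temperature)
        scores[predicate] = cosine(query, rep)
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two predicates, two prototypes each.
prototype_space = {
    "on": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "riding": [[0.0, 0.0, 1.0], [0.5, 0.0, 0.5]],
}
label, scores = classify([0.9, 0.1, 0.0], prototype_space)
print(label, scores)
```

Because the weights are recomputed for every query, the same predicate can yield a different representation for each sample, which is the property the summary calls query-adaptive.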