Improving Visual Reasoning Through Semantic Representation

Bibliographic Details
Published in: IEEE Access, Vol. 9, pp. 91476-91486
Main Authors: Zheng, Wenfeng; Liu, Xiangjun; Ni, Xubin; Yin, Lirong; Yang, Bo
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: In visual reasoning, deep learning has significantly improved the accuracy of results. Image features are typically used as the input from which answers are derived. However, image features are too redundant for a model to learn accurate characterizations within limited complexity and time. In human reasoning, by contrast, an abstract description of an image is usually used to avoid irrelevant details. Inspired by this, a higher-level representation, named semantic representation, is introduced. In this paper, a detailed visual reasoning model is proposed. The new model contains an image understanding model based on semantic representation, a feature extraction and processing model refined with the watershed and u-distance methods, a feature vector learning model using pyramidal pooling and a residual network, and a question understanding model combining a problem embedding coding method with a machine translation decoding method. The feature vector can better represent the whole image instead of focusing excessively on specific characteristics. The model that takes semantic representation as input confirms that more accurate results can be obtained by introducing a high-level semantic representation. The results also show that it is feasible and effective to introduce high-level, abstract forms of knowledge representation into deep learning tasks. This study lays a theoretical and experimental foundation for introducing different levels of knowledge representation into deep learning in the future.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3074937