Learning to capture dependencies between global features of different convolution layers
Published in | Journal of visual communication and image representation Vol. 81; p. 103360 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Inc, 01.11.2021 |
Subjects | |
Summary: | • Proposed a network that can capture global feature dependencies between different convolution layers. • The proposed network can be embedded in the entire network or only in specific layers. • The proposed network maintains variable input sizes and can be easily combined with other operations. • Proposed RELU-Dot-product to measure the relationship between global features of different layers.
NLNet is considered a milestone in the study of capturing long-range dependencies. Many recent studies modify the internal structure of NLNet directly and apply it to video object detection and semantic segmentation tasks. The dependencies between local and global features have been well studied, but the dependencies between global features of different convolution layers are rarely considered. Because convolution is a local operation, the global features of different convolution layers cannot be directly related, so the dependencies between them are lost. To address this gap, this study designs a network that can efficiently capture the dependencies between the global features of different convolution layers, potentially further improving accuracy. Furthermore, for the calculation of the dependency matrix, we build on the Dot-product used in NLNet and propose RELU-Dot-product, which can achieve higher accuracy. We evaluate the proposed method on image classification and object detection tasks. The data sets involved are CIFAR10, CIFAR100, Tiny-ImageNet, VOC2007, VOC2012 and MS COCO. Experiments show that our method can significantly improve network performance while introducing only a few parameters. |
---|---|
ISSN: | 1047-3203 1095-9076 |
DOI: | 10.1016/j.jvcir.2021.103360 |
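
The summary names a dependency matrix computed with RELU-Dot-product but does not give its definition. The following is a minimal sketch of one possible reading, not the authors' formulation: each layer's feature map is reduced to a global descriptor by average pooling, the pairwise dot products between descriptors are passed through ReLU, and the result is scaled by the number of layers as in the NLNet dot-product variant. All function names are hypothetical, and the sketch assumes every layer has the same channel count (in practice a learned projection would be needed to align them).

```python
# Illustrative sketch only. The ReLU placement, the 1/N scaling, and all
# names below are assumptions; this record does not define the method.

def global_average_pool(feature_map):
    """Reduce a C x H x W nested list to a length-C global descriptor."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def relu_dot_dependencies(descriptors):
    """Assumed dependency matrix: D[i][j] = ReLU(g_i . g_j) / N.

    descriptors: one global descriptor per layer, all of equal length.
    """
    n = len(descriptors)
    return [[max(0.0, dot(gi, gj)) / n for gj in descriptors]
            for gi in descriptors]


def aggregate(descriptors, dep):
    """Mix layer descriptors using the dependency matrix: y_i = sum_j D[i][j] * g_j."""
    n, dim = len(descriptors), len(descriptors[0])
    return [[sum(dep[i][j] * descriptors[j][k] for j in range(n))
             for k in range(dim)]
            for i in range(n)]


if __name__ == "__main__":
    # Two toy "layers", each already pooled to a 2-channel descriptor.
    layers = [[1.0, 0.0], [-1.0, 0.0]]
    dep = relu_dot_dependencies(layers)
    print(dep)
    print(aggregate(layers, dep))
```

Under this reading, ReLU zeroes out negatively correlated layer pairs, so only positively related global features contribute to the aggregated descriptor.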