Not-So-CLEVR: learning same–different relations strains feedforward neural networks

Bibliographic Details
Published in	Interface Focus, Vol. 8, no. 4, p. 20180011
Main Authors	Kim, Junkyung; Ricci, Matthew; Serre, Thomas
Format	Journal Article
Language	English
Published	England: The Royal Society, 06.08.2018
Subjects	Convolutional Neural Networks; Deep Learning; Perceptual Grouping; Visual Attention; Visual Relations; Working Memory

Summary:	The advent of deep learning has recently led to great successes in various engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural network, now approach human accuracy on visual recognition tasks like image classification and face recognition. However, here we show that feedforward neural networks struggle to learn abstract visual relations that are effortlessly recognized by non-human primates, birds, rodents and even insects. We systematically study the ability of feedforward neural networks to learn to recognize a variety of visual relations and demonstrate that same–different visual relations pose a particular strain on these networks. Networks fail to learn same–different visual relations when stimulus variability makes rote memorization difficult. Further, we show that learning same–different problems becomes trivial for a feedforward network that is fed with perceptually grouped stimuli. This demonstration and the comparative success of biological vision in learning visual relations suggest that feedback mechanisms such as attention, working memory and perceptual grouping may be the key components underlying human-level abstract visual reasoning.
Bibliography:	One contribution of 12 to a theme issue ‘Understanding images in biological and computer vision’, organised by Andrew Schofield, Aleš Leonardis, Marina Bloj, Iain D Gilchrist and Nicola Bellotto. These authors contributed equally to this study.
ISSN:	2042-8898; 2042-8901
DOI:	10.1098/rsfs.2018.0011