Objects and scenes classification with selective use of central and peripheral image content

Bibliographic Details
Published in	Journal of visual communication and image representation Vol. 66; p. 102698
Main Authors	Alameer, Ali; Degenaar, Patrick; Nazarpour, Kianoush
Format	Journal Article
Language	English
Published	Elsevier Inc 01.01.2020
Subjects	Biological visual-systems; Image understanding; Scene analysis; Visual perception; Visual recognition; Visual-data reduction
Summary:	• En-HMAX adopts a relative order of importance depending on the image category.
• Central vision corresponds to object recognition.
• Peripheral vision corresponds to scene recognition.
• Deep learning models exhibit similar trends for object and scene images.
• Only 50% of the relevant image data is needed to achieve 90% object classification accuracy.

The human visual recognition system is more efficient than any current robotic vision setting. One reason for this superiority is that humans utilize different fields of vision, depending on the recognition task. For instance, experiments on human subjects show that peripheral vision is more useful than central vision in recognizing scenes. We tested our recently developed model, the elastic net-regularized hierarchical MAX (En-HMAX), on object and scene recognition. In various experimental conditions, images were occluded with windows and scotomas of varying sizes. With this model, classification accuracies of up to 90% for objects and scenes were possible. Modelling human experiments, window and scotoma analysis with the En-HMAX model revealed that object recognition is sensitive to the availability of data in the centre of the image, whereas scene recognition is sensitive to the availability of data in the periphery. Similarly, results of deep learning models have shown that the classification accuracy diminishes dramatically in the absence of peripheral vision. In a second study, these differences led us to further analyse the performance of the En-HMAX model with the parafoveal versus peripheral areas of vision. Results of the second study show that approximately 50% of the visual field would be sufficient to achieve 96% accuracy in the classification of unseen images. The En-HMAX model adopts a relative order of importance, similar to the human visual system, depending on the image category. We showed that utilizing the relevant regions of vision can significantly reduce the image processing time and size.
ISSN:	1047-3203 1095-9076
DOI:	10.1016/j.jvcir.2019.102698
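
The window and scotoma conditions mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names (`disc_mask`, `apply_window`, `apply_scotoma`), the circular mask shape, the pixel-based radius, and the fill value are all assumptions made for illustration, since the record does not give the occluder geometry or sizes. A window keeps only the central region of an image, and a scotoma removes it.

```python
import math

def disc_mask(height, width, radius):
    """Boolean grid that is True inside a disc centred on the image.

    `radius` is in pixels (an illustrative choice, not taken from the paper).
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [[math.hypot(y - cy, x - cx) <= radius for x in range(width)]
            for y in range(height)]

def apply_window(image, radius, fill=0):
    """Window condition: keep the centre, replace the periphery with `fill`."""
    h, w = len(image), len(image[0])
    mask = disc_mask(h, w, radius)
    return [[image[y][x] if mask[y][x] else fill for x in range(w)]
            for y in range(h)]

def apply_scotoma(image, radius, fill=0):
    """Scotoma condition: replace the centre with `fill`, keep the periphery."""
    h, w = len(image), len(image[0])
    mask = disc_mask(h, w, radius)
    return [[fill if mask[y][x] else image[y][x] for x in range(w)]
            for y in range(h)]

# Usage: a small all-ones grayscale image stored as a list of rows.
image = [[1] * 5 for _ in range(5)]
central_only = apply_window(image, radius=1)
peripheral_only = apply_scotoma(image, radius=1)
```

For the same radius the two outputs are complementary, so every pixel of the original survives in exactly one of them. Sweeping the radius and classifying each occluded image would be one way to reproduce the kind of centre-versus-periphery comparison the summary describes.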