Scene variability and perception constancy in the visual system: a model of pre-processing before data analysis and learning
Published in | 2009 IEEE International Workshop on Machine Learning for Signal Processing pp. 1 - 12 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.09.2009 |
Summary: | Hell in data analysis is paved (at least) with variability and noise. Is there some lost garden of Eden? Is there some way to approach it? In this paper, we deal with human visual perception and show how our visual system manages to process visual information so efficiently that it is able to categorize images or scenes within 100-150 ms, independently of variability and noise. In fact, before any high-level recognition task, the visual system unfolds a series of pre-processing stages to reduce the variability that disturbs an image: (1) In the retina: a first adaptive process, to global and local illuminant intensity and color, and a second, to local contrasts, preprocess the visual signal so that an equal quantity of information is available across the whole image. The retinal spatio-temporal filter whitens the image spectrum so that all spatial frequencies are equally represented. (2) In the primary visual cortex: a bank of cortical-like filters decomposes the power frequency spectrum of the visual signal; this is equivalent to estimating the local power frequency spectrum, which is relatively insensitive to image translations. Moreover, such a decomposition offers a log-polar representation of the power spectrum, which can be useful for resolving zoom and rotation variability and also for estimating local perspective. (3) In the cortical area V4: a further Fourier transform of the log-polar spectrum provides insensitivity to zooms and rotations, as well as to perspective transformations. In this research, an image is represented by a high-dimensional vector: first the image is preprocessed by the retina, then the power spectrum of the preprocessed image is decomposed by a bank of filters, and the energy output of each filter is taken as one component of the vector. Taking advantage of all the processing stages, data variability can be expected to be optimally reduced for comparison purposes and/or for categorization tasks. 
In the last part of this paper, we give an example of image categorization by means of CCA, a self-organizing neural network which, after noise reduction, reduces dimensionality and unfolds the manifold in which the data are embedded. It is also shown that the network, when taught by human subjects performing the categorization task, improves its results by providing a better separation of the semantic categories. |
---|---|
ISBN: | 1424449472 9781424449477 |
ISSN: | 1551-2541 |
DOI: | 10.1109/MLSP.2009.5306254 |
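
The invariance argument in the summary (stages 2 and 3) can be illustrated with a minimal NumPy sketch. It is not the authors' model: it uses a plain global 2-D FFT instead of a bank of cortical-like filters, circular shifts instead of real image motion, and a random array as a hypothetical stand-in for a log-polar spectrum (rows as log-radius, columns as orientation). The names `power_spectrum` and `shift_invariant_signature` are illustrative only.

```python
import numpy as np


def power_spectrum(image):
    """Squared magnitude of the 2-D discrete Fourier transform."""
    return np.abs(np.fft.fft2(image)) ** 2


def shift_invariant_signature(log_polar_map):
    """Magnitude of a further 2-D Fourier transform of a log-polar map.

    A circular shift of the map only changes the phase of its transform,
    so the magnitude is unchanged.
    """
    return np.abs(np.fft.fft2(log_polar_map))


rng = np.random.default_rng(0)

# Stage 2 analogue: the power spectrum ignores (circular) translations.
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -9), axis=(0, 1))
pixels_differ = not np.allclose(image, shifted)
same_spectrum = np.allclose(power_spectrum(image), power_spectrum(shifted))

# Stage 3 analogue: in a log-polar map a rotation becomes a circular shift
# along the orientation axis (assumed here to be axis 1), which a second
# Fourier magnitude removes.
log_polar = rng.random((32, 48))  # hypothetical stand-in, not real data
rotated = np.roll(log_polar, shift=7, axis=1)
same_signature = np.allclose(
    shift_invariant_signature(log_polar),
    shift_invariant_signature(rotated),
)

print(pixels_differ, same_spectrum, same_signature)
```

A zoom corresponds to a shift along the log-radius axis, which on real images is not circular, so the invariance would hold only approximately there; this matches the summary's wording of "insensitivity" rather than exact invariance.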