Block-Row Sparse Multiview Multilabel Learning for Image Classification

Bibliographic Details
Published in: IEEE Transactions on Cybernetics, Vol. 46, no. 2, pp. 450-461
Main Authors: Zhu, Xiaofeng; Li, Xuelong; Zhang, Shichao
Format: Journal Article
Language: English
Published: United States, IEEE, 01.02.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: In image analysis, images are often represented by multiple visual features (also known as multiview features), which aim to describe them more fully and thereby improve learning performance. Because features are extracted separately for each view, the multiple visual features of an image may contain overlap, noise, and redundancy. Thus, learning with all the derived views of the data could decrease effectiveness. To address this, this paper simultaneously conducts hierarchical feature selection and multiview multilabel (MVML) learning for multiview image classification by embedding a newly proposed block-row regularizer into the MVML framework. The block-row regularizer, which concatenates a Frobenius norm (F-norm) regularizer and an ℓ2,1-norm regularizer, is designed to conduct hierarchical feature selection: the F-norm regularizer performs high-level feature selection, selecting the informative views (i.e., discarding the uninformative views), and the ℓ2,1-norm regularizer then performs low-level feature selection on the informative views. The rationale for using the block-row regularizer is to avoid over-fitting (via the block-row regularizer as a whole), to remove redundant views and preserve the natural group structure of the data (via the F-norm regularizer), and to remove noisy features (via the ℓ2,1-norm regularizer). We further devise a computationally efficient algorithm to optimize the derived objective function and theoretically prove the convergence of the proposed optimization method. Finally, results on real image datasets show that the proposed method outperforms two baseline algorithms and three state-of-the-art algorithms in terms of classification performance.
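
As a rough illustration of the two-level penalty described in the summary, the sketch below computes a block-row style regularizer for a small weight matrix. The summary does not give the exact formula, so the form used here is an assumption: a weighted sum of per-view Frobenius norms (the view-level term) plus the ℓ2,1-norm of the whole matrix (the feature-level term). The names `block_row_penalty`, `view_sizes`, `lam1`, and `lam2` are hypothetical and do not come from the paper.

```python
import math


def frobenius_norm(block):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(x * x for row in block for x in row))


def l21_norm(matrix):
    """l2,1-norm: the sum of the Euclidean norms of the rows."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def block_row_penalty(W, view_sizes, lam1, lam2):
    """Assumed penalty: lam1 * sum_v ||W_v||_F + lam2 * ||W||_{2,1}.

    W is a (features x labels) weight matrix stored as a list of rows.
    view_sizes lists how many consecutive rows belong to each view.
    """
    if sum(view_sizes) != len(W):
        raise ValueError("view_sizes must account for every row of W")
    view_term = 0.0
    start = 0
    for size in view_sizes:
        # A view whose whole block is zero contributes nothing here,
        # which is how the view-level term can discard entire views.
        view_term += frobenius_norm(W[start:start + size])
        start += size
    # The row-wise term penalizes individual features (rows) instead.
    return lam1 * view_term + lam2 * l21_norm(W)


# Toy weights: two views with two features each, two labels.
W = [
    [3.0, 4.0],  # view 1, feature 1 (the only active feature)
    [0.0, 0.0],  # view 1, feature 2
    [0.0, 0.0],  # view 2, feature 1
    [0.0, 0.0],  # view 2, feature 2
]
penalty = block_row_penalty(W, view_sizes=[2, 2], lam1=1.0, lam2=1.0)
```

In this toy matrix the second view is entirely zero, so it adds nothing to the view-level term, while the single nonzero row in the first view contributes to both terms.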
ISSN: 2168-2267, 2168-2275
DOI: 10.1109/TCYB.2015.2403356