Unsupervised feature selection using feature similarity

Bibliographic Details
Published in	IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, no. 3, pp. 301-312
Main Authors	Mitra, P., Murthy, C.A., Pal, S.K.
Format	Journal Article
Language	English
Published	New York: IEEE, 01.03.2002; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; Compressing; Entropy; Loss measurement; Pattern analysis; Redundancy; Representations; Similarity

Summary:	This article describes an unsupervised feature selection algorithm suitable for data sets that are large in both dimension and size. The method measures the similarity between features and removes the redundancy among them. It requires no search and is therefore fast. A new feature similarity measure, called the maximum information compression index, is introduced. The algorithm is generic in nature and is capable of multiscale representation of data sets. Its superiority in terms of speed and performance is established extensively over various real-life data sets of different sizes and dimensions. The article also demonstrates how redundancy and information loss in feature selection can be quantified with an entropy measure.
ISSN:	0162-8828, 1939-3539
DOI:	10.1109/34.990133
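
Note on the summary: this record names the maximum information compression index but does not define it. The sketch below assumes the definition usually associated with the measure, namely the smallest eigenvalue of the 2x2 covariance matrix of a feature pair, which is zero when the two features are linearly dependent and grows as they become less related. The function names `mici` and `drop_redundant`, the threshold value, and the greedy filtering loop are illustrative assumptions only; the loop is a simplified stand-in for similarity-based redundancy removal, not the authors' published procedure.

```python
import math
from statistics import fmean


def mici(x, y):
    """Smallest eigenvalue of the 2x2 covariance matrix of x and y.

    Assumed definition of the index; it is not taken from this record.
    """
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]] via trace and determinant.
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    disc = max(trace * trace - 4.0 * det, 0.0)  # guard against rounding error
    return 0.5 * (trace - math.sqrt(disc))


def drop_redundant(features, threshold):
    """Hypothetical greedy filter: keep a feature only if its index against
    every feature kept so far exceeds the threshold. No search is involved,
    just one pass over the features in insertion order."""
    kept = []
    for name, column in features.items():
        if all(mici(column, features[k]) > threshold for k in kept):
            kept.append(name)
    return kept


features = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],  # exactly 2 * a, so fully redundant with a
    "c": [5, 1, 4, 2, 3],
}
selected = drop_redundant(features, threshold=0.1)  # ['a', 'c']
```

Because the index in this form depends on the variances of the features, the threshold is scale-sensitive; any value used in practice would have to be chosen for the data at hand.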