Enhanced sparse representation classifier for text classification

Bibliographic Details
Published in Expert Systems with Applications, Vol. 129; pp. 260-272
Main Authors P., Unnikrishnan; Govindan, V.K.; Madhu Kumar, S.D.
Format Journal Article
Language English
Published New York Elsevier Ltd 01.09.2019
Elsevier BV
Summary:
• Empirical demonstration of sparse classification methods on text data.
• Minimum residual error and class-wise dictionary based methods are the most efficient.
• A dynamic dictionary refinement algorithm is proposed.
• The proposed work improves computational speed and classification performance.
• The dictionary refinement procedure addresses the problem of over-reliance on training data.

Classifying text by its content is an essential step in organizing very large text collections and mining the salient information they contain. The task is gaining attention with the surge in the volume of online data. Classical algorithms such as k-NN (k-nearest neighbor) and SVM (Support Vector Machine), along with their variations, have been observed to yield only reasonable results on this problem, leaving room for further improvement. A class of algorithms commonly referred to as sparse methods has recently emerged from compressive sensing and has found effective applications in many areas of data analysis and image processing. The use of sparse methods for text analysis, however, has not yet been rigorously explored. This paper explores sparse representation based methods for text classification. Given the success of such methods in other areas of data analysis, we hypothesized that they should also work well on text classification problems. The paper supports this hypothesis empirically: results on the Reuters and WebKB benchmark data sets show that sparse representation based classification can outperform classical algorithms such as SVM and k-NN. Obtaining the basis of representation and the sparse codes was observed to be computationally costly, which affects the performance of the system. We therefore also propose a class-wise dictionary refinement algorithm and a dynamic dictionary selection algorithm to make sparse coding faster. Adding dictionary refinement to the classification system not only reduces the time taken for sparse coding but also improves classification accuracy. The outcomes of the study are an empirical verification of the sparse representation classifier as a text classification tool and a computationally efficient solution for the bottleneck operation of sparse coding.
ISSN:0957-4174
1873-6793
DOI:10.1016/j.eswa.2019.04.003
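
The summary names minimum residual error over class-wise dictionaries as the classification rule. As an illustration only, the Python sketch below shows the generic sparse representation classifier idea under stated assumptions: training documents are the columns of a dictionary `D`, a simple orthogonal matching pursuit (swapped in here for whatever sparse solver the paper actually uses) computes the sparse code, and the predicted class is the one whose atoms reconstruct the query with the smallest residual. The names `omp` and `src_classify`, the toy term-count data, and the sparsity level are hypothetical and not taken from the paper. The paper's dictionary refinement and dynamic dictionary selection steps are not reproduced.

```python
import numpy as np


def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: code y using at most n_nonzero columns of D."""
    residual = y.astype(float).copy()
    support = []          # indices of the atoms selected so far
    sol = np.zeros(0)     # least-squares coefficients on the support
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect an atom
        j = int(np.argmax(correlations))
        if correlations[j] < 1e-12:      # residual already explained
            break
        support.append(j)
        # Refit all selected atoms jointly, then update the residual.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    x = np.zeros(D.shape[1])
    if support:
        x[support] = sol
    return x


def src_classify(D, labels, y, n_nonzero=2):
    """Assign y to the class whose atoms give the smallest reconstruction residual.

    Assumes (hypothetically) that no column of D and no query y is all zeros.
    """
    D = D / np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms
    y = y / np.linalg.norm(y)
    x = omp(D, y, n_nonzero)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct y from the coefficients of class c only.
        residuals[c.item()] = float(np.linalg.norm(y - D[:, mask] @ x[mask]))
    return min(residuals, key=residuals.get), residuals


# Toy data: rows are terms, columns are training documents (made-up counts).
D = np.array([[3, 2, 0, 0],
              [1, 2, 0, 1],
              [0, 0, 1, 2],
              [0, 0, 3, 2]], dtype=float)
labels = np.array([0, 0, 1, 1])          # class of each training column
query = np.array([4, 2, 0, 0], dtype=float)

label, residuals = src_classify(D, labels, query)
print(label, residuals)
```

In this sketch the whole training set acts as one dictionary and the class-wise split is applied only when residuals are computed. Coding the query against each class dictionary separately is another common variant, and the record above does not state which form the paper uses.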