Text Classification Using Sentential Frequent Itemsets

Bibliographic Details
Published in	Journal of Computer Science and Technology, Vol. 22, no. 2, pp. 334-337
Main Authors	Liu, Shi-Zhu, Hu, He-Ping
Format	Journal Article
Language	English
Published	Beijing: Springer Nature B.V., 01.03.2007
Author affiliation	College of Computer Science, Huazhong University of Science and Technology, Wuhan 430071, China
Subjects	Classification; Classifiers; Conveying; Data mining; Documents; Mining; Sentences; Studies; Text categorization; Texts; sentential frequent itemsets; text classification; variable precision rough set model

Summary:	Text classification techniques mostly rely on single-term analysis of the document data set, while many concepts, especially the more specific ones, are usually conveyed by sets of terms. To achieve a more accurate text classifier, more informative features, including frequently co-occurring words in the same sentence and their weights, are particularly important in such scenarios. In this paper, we propose a novel approach to text classification using sentential frequent itemsets, a concept that comes from association rule mining. The approach views a sentence rather than a document as a transaction and uses a variable precision rough set based method to evaluate each sentential frequent itemset’s contribution to the classification. Experiments on the Reuters and newsgroup corpora validate the practicability of the proposed system.
ISSN:	1000-9000, 1860-4749
DOI:	10.1007/s11390-007-9041-7
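
The idea in the summary of treating each sentence, rather than each document, as a transaction can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the regex-based sentence splitter, the brute-force enumeration of word sets, and the names `split_sentences`, `sentence_transactions`, `sentential_frequent_itemsets`, `min_support` and `max_size` are all assumptions made for illustration, and the variable precision rough set weighting step is not covered.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def sentence_transactions(documents):
    """Turn every sentence of every document into a set of lowercase words."""
    transactions = []
    for doc in documents:
        for sentence in split_sentences(doc):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            if words:
                transactions.append(words)
    return transactions


def sentential_frequent_itemsets(documents, min_support=2, max_size=2):
    """Count word sets that co-occur within a sentence and keep those whose
    support (number of sentences containing them) is at least min_support.
    Brute-force enumeration, suitable for toy data only."""
    counts = Counter()
    for words in sentence_transactions(documents):
        ordered = sorted(words)  # canonical order so each itemset has one key
        for size in range(1, max_size + 1):
            for itemset in combinations(ordered, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy input, invented for this sketch (not taken from Reuters or newsgroups).
docs = [
    "Oil prices rose sharply. Crude oil exports fell.",
    "Oil prices fell again. The match ended late.",
]
itemsets = sentential_frequent_itemsets(docs, min_support=2, max_size=2)
for itemset, support in sorted(itemsets.items()):
    print(itemset, support)
```

On this toy input, the pair ("oil", "prices") is kept because both words appear together in two sentences, whereas ("fell", "prices") is dropped because the two words share only one sentence. A realistic implementation would likely replace the brute-force enumeration with an Apriori-style or FP-growth miner and add stop-word filtering.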