Document Clustering Using Semantic Features and Fuzzy Relations
Published in | Journal of Information and Communication Convergence Engineering, Vol. 11, no. 3, pp. 179-184 |
---|---|
Main Authors | , |
Format | Journal Article |
Language | English |
Published | 한국정보통신학회 (Korea Institute of Information and Communication Engineering), 30.09.2013 |
Subjects | |
ISSN | 2234-8255 2234-8883 |
DOI | 10.6109/jicce.2013.11.3.179 |
Summary: | Traditional clustering methods are usually based on the bag-of-words (BOW) model, which ignores the semantic relationships among terms in the data set. Ontology-based or matrix factorization approaches are commonly used to address this problem. However, a major drawback of the ontology approach is that it is usually difficult to find a comprehensive ontology that covers all the concepts mentioned in a collection. This paper proposes a new document clustering method that uses semantic features and fuzzy relations to overcome the problems of the ontology and matrix factorization approaches. The proposed method can improve the quality of document clustering because it uses fuzzy relation values between semantic features and terms to distinguish clearly among dissimilar documents in clusters. The selected cluster label terms can better represent the inherent structure of a document set because they are chosen using semantic features derived from non-negative matrix factorization (NMF), a technique used in document clustering. The experimental results demonstrate that the proposed method achieves better performance than other document clustering methods. KCI Citation Count: 0 |
---|---|
Bibliography: | G704-SER000003196.2013.11.3.001 |
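
The summary names two ingredients, semantic features obtained by non-negative matrix factorization and fuzzy relation values between those features and terms, but this record does not give the paper's formulas. The following is therefore only a minimal sketch of the general idea, not the authors' method. It assumes Lee-Seung multiplicative updates for the factorization, max-normalised factor weights as fuzzy membership degrees, and a max-min composition to relate documents to features. The toy term-document matrix and all function names (`nmf`, `max_normalise_columns`, `max_min_compose`) are hypothetical, and only the standard library is used to keep the example self-contained.

```python
import random

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def nmf(A, r, iters=500, seed=0, eps=1e-9):
    """Factor A (terms x docs) into non-negative W (terms x r) and H (r x docs).

    Assumption: Lee-Seung multiplicative updates for the squared-error objective.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n)]
             for k in range(r)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(r)]
             for i in range(m)]
    return W, H

def max_normalise_columns(M):
    """Scale each column into [0, 1] so entries can be read as membership degrees."""
    col_max = [max(col) or 1.0 for col in zip(*M)]
    return [[v / c for v, c in zip(row, col_max)] for row in M]

def max_min_compose(D, mu):
    """Max-min composition: degree(doc j, feature k) = max_i min(D[i][j], mu[i][k])."""
    n_terms, n_docs, r = len(D), len(D[0]), len(mu[0])
    return [[max(min(D[i][j], mu[i][k]) for i in range(n_terms)) for k in range(r)]
            for j in range(n_docs)]

# Hypothetical toy data: rows are terms, columns are four documents.
terms = ["stock", "market", "trade", "goal", "match", "team"]
A = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
]

W, H = nmf(A, r=2)                  # columns of W act as semantic features
mu = max_normalise_columns(W)       # assumed fuzzy relation: term -> feature
D = max_normalise_columns(A)        # assumed fuzzy relation: term -> document
degrees = max_min_compose(D, mu)    # document -> feature membership degrees

# Assign each document to the feature with the highest degree, and label each
# feature with its strongest term.
clusters = [max(range(len(row)), key=row.__getitem__) for row in degrees]
labels = [terms[max(range(len(terms)), key=lambda i, k=k: mu[i][k])]
          for k in range(len(mu[0]))]

print("clusters:", clusters)
print("labels:", labels)
```

Because the cluster indices depend on the random initialisation, only the grouping of documents is meaningful, not the index values themselves. For real collections a numerical library would replace the list-based matrix helpers.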