A New Cluster Merging Algorithm of Suffix Tree Clustering

Bibliographic Details
Published in Intelligent Information Processing III, pp. 197-203
Main Authors Wang, Jianhua; Li, Ruixu
Format Book Chapter
Language English
Published Boston, MA Springer US
Series IFIP International Federation for Information Processing
Summary: Document clustering methods can be used to structure large sets of text or hypertext documents. Suffix Tree Clustering has proved to be a good approach to document clustering. However, its cluster merging algorithm is based on the overlap of the clusters' document sets, which entirely ignores the similarity between the non-overlapping parts of different clusters. In this paper, we introduce a novel cluster merging approach that combines cosine similarity with overlap percentage. Using this method, we obtain a better clustering result and a comparatively small number of clusters.
ISBN: 9780387446394, 0387446397
ISSN: 1571-5736
DOI: 10.1007/978-0-387-44641-7_21
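
The summary describes a merging criterion that looks at both the overlap of two clusters' document sets and the cosine similarity of the parts that do not overlap. The record does not give the chapter's actual formula, so the following Python sketch is only an illustration under stated assumptions: the function names, the bag-of-words term vectors, the scoring rule `overlap + (1 - overlap) * similarity`, and the 0.5 threshold are all hypothetical choices, not the authors' method.

```python
import math
from collections import Counter


def term_vector(doc_ids, corpus):
    """Bag-of-words term counts over the given documents (assumed representation)."""
    counts = Counter()
    for doc_id in doc_ids:
        counts.update(corpus[doc_id].lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse term-count vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def merge_score(docs_a, docs_b, corpus):
    """Hypothetical combined score: overlap share, topped up by the cosine
    similarity of the non-overlapping parts (an assumption, not the paper's rule)."""
    if not docs_a or not docs_b:
        return 0.0
    overlap = len(docs_a & docs_b) / max(len(docs_a), len(docs_b))
    similarity = cosine(
        term_vector(docs_a - docs_b, corpus),
        term_vector(docs_b - docs_a, corpus),
    )
    return overlap + (1.0 - overlap) * similarity


def should_merge(docs_a, docs_b, corpus, threshold=0.5):
    """Merge two clusters when the combined score exceeds an assumed threshold."""
    return merge_score(docs_a, docs_b, corpus) > threshold


# Toy corpus, invented purely for illustration.
corpus = {
    1: "suffix tree clustering documents",
    2: "suffix tree clustering web search",
    3: "suffix tree phrase clustering",
    4: "cooking pasta recipes",
    5: "suffix tree clustering results",
}
cluster_a = {1, 2, 3}
cluster_b = {3, 5}   # shares one document with cluster_a, similar remainder
cluster_c = {3, 4}   # shares one document with cluster_a, unrelated remainder

print(should_merge(cluster_a, cluster_b, corpus))
print(should_merge(cluster_a, cluster_c, corpus))
```

In this toy data, `cluster_a` and `cluster_b` share only one of three documents, so a rule based on document-set overlap alone would keep them apart, while the combined score also takes the textual similarity of their remaining documents into account.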