A New Cluster Merging Algorithm of Suffix Tree Clustering
Published in | Intelligent Information Processing III pp. 197 - 203 |
---|---|
Main Authors | , |
Format | Book Chapter |
Language | English |
Published | Boston, MA: Springer US |
Series | IFIP International Federation for Information Processing |
Summary: | Document clustering methods can be used to structure large sets of text or hypertext documents. Suffix Tree Clustering has proven to be a good approach to document clustering. However, the cluster merging algorithm of Suffix Tree Clustering is based only on the overlap of the clusters' document sets, which entirely ignores the similarity between the non-overlapping parts of different clusters. In this paper, we introduce a novel cluster merging approach that combines cosine similarity with overlap percentage. Using this method, we obtain a better clustering result and a comparatively small number of clusters. |
---|---|
ISBN: | 9780387446394 0387446397 |
ISSN: | 1571-5736 |
DOI: | 10.1007/978-0-387-44641-7_21 |
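
The summary describes a merging criterion that combines cosine similarity with overlap percentage, but it does not give the formula. The following is a minimal sketch of how such a criterion might look, not the chapter's actual method. The weighted sum, the `alpha` weight, the `threshold` value, the use of the smaller cluster as the overlap denominator, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def overlap_percentage(docs_a, docs_b):
    """Shared documents as a fraction of the smaller cluster (assumed definition)."""
    if not docs_a or not docs_b:
        return 0.0
    return len(docs_a & docs_b) / min(len(docs_a), len(docs_b))


def summed_vector(doc_ids, doc_vectors):
    """Sum the term vectors of the given documents into one vector."""
    total = Counter()
    for doc_id in doc_ids:
        total.update(doc_vectors[doc_id])
    return total


def merge_score(docs_a, docs_b, doc_vectors, alpha=0.5):
    """Hypothetical score: alpha * overlap + (1 - alpha) * cosine similarity
    of the non-overlapping parts of the two clusters."""
    overlap = overlap_percentage(docs_a, docs_b)
    rest_a = docs_a - docs_b
    rest_b = docs_b - docs_a
    if not rest_a or not rest_b:
        # Assumption: if one cluster contains the other, use overlap alone.
        return overlap
    cosine = cosine_similarity(
        summed_vector(rest_a, doc_vectors),
        summed_vector(rest_b, doc_vectors),
    )
    return alpha * overlap + (1 - alpha) * cosine


def should_merge(docs_a, docs_b, doc_vectors, alpha=0.5, threshold=0.4):
    """Merge two base clusters when the combined score exceeds the threshold."""
    return merge_score(docs_a, docs_b, doc_vectors, alpha) > threshold
```

Here each cluster is a set of document ids and `doc_vectors` maps a document id to its term weights. Two clusters that share half of their documents are merged only if their remaining documents are also similar in content, which is the gap in overlap-only merging that the summary points out.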