Web Information Organization Using Keyword Distillation Based Clustering
Published in | 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology Vol. 1; pp. 325 - 330 |
---|---|
Main Authors | , , , |
Format | Conference Proceeding |
Language | English |
Published | Washington, DC, USA; IEEE Computer Society; 2009; IEEE |
Series | ACM Conferences |
Subjects | |
Summary: | This paper describes a system that clusters search results covering several thousand Web pages and refines the cluster labels through keyword distillation. Keyword distillation is a method that handles spelling variations, transliterations, synonyms, inclusion relations, and word ambiguity, using linguistic resources and the context of a user's query. The system produces a clustering result for 1,000 pages in less than one minute by taking advantage of a search engine infrastructure and a grid computing environment. Experimental results show that the system correctly merges synonymous keywords and is useful for finding topics hidden in the lower-ranked pages of a search result. |
---|---|
ISBN: | 0769538010 9780769538013 |
DOI: | 10.1109/WI-IAT.2009.57 |
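
The summary says that keyword distillation merges spelling variations and synonyms before cluster labels are produced. The sketch below illustrates only that general idea and is not the paper's method: the `VARIANTS` table, the `distill` and `cluster_by_keyword` functions, and the sample pages are all hypothetical, and a plain dictionary lookup stands in for the linguistic resources and query context that the record mentions but does not detail.

```python
from collections import defaultdict

# Hypothetical variant/synonym table (an assumption for illustration only).
# It maps a surface form to the canonical keyword used as a cluster label.
VARIANTS = {
    "colour": "color",     # spelling variation
    "automobile": "car",   # synonym
    "cars": "car",         # inflected form
}


def distill(keyword):
    """Return a canonical form: trim, lowercase, then look up known variants."""
    key = keyword.strip().lower()
    return VARIANTS.get(key, key)


def cluster_by_keyword(pages):
    """Group page ids under distilled keyword labels.

    pages maps a page id to the list of raw keywords extracted from that page.
    A page may appear in more than one cluster.
    """
    clusters = defaultdict(set)
    for page_id, keywords in pages.items():
        for kw in keywords:
            clusters[distill(kw)].add(page_id)
    # Sort the page ids so the result is deterministic.
    return {label: sorted(ids) for label, ids in clusters.items()}


# Made-up sample input, not data from the paper.
pages = {
    "p1": ["Car", "colour"],
    "p2": ["automobile"],
    "p3": ["color", "cars"],
}
clusters = cluster_by_keyword(pages)
```

Without the distillation step, these three pages would be spread over five labels ("car", "colour", "automobile", "color", "cars"); with it, they fall under two.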