Essential Pages

Bibliographic Details
Published in	Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 01; pp. 173-182
Main Authors	Swaminathan, Ashwin; Mathew, Cherian V.; Kirovski, Darko
Format	Conference Proceeding
Language	English
Published	Washington, DC, USA: IEEE Computer Society / IEEE, 15.09.2009
Series	ACM Conferences
Subjects	Computing methodologies > Machine learning; coverage; Information systems > Information retrieval; Information systems > Information retrieval > Document representation; Information systems > Information retrieval > Evaluation of retrieval results; Internet; Knowledge engineering; learning queries; Measurement; Optimization; Prototypes; Redundancy; redundancy elimination; Search engines; Semantics; Time-frequency analysis; Web page ranking; Web search
ISBN	0769538010 9780769538013
DOI	10.1109/WI-IAT.2009.33
Summary:	Results for Web search queries are ranked using heuristics that typically analyze the global link topology, user behavior, and content relevance. We point to a particular inefficiency of such methods: information redundancy. In queries where learning about a subject is an objective, modern search engines return relatively unsatisfactory results because they consider the query coverage of each page individually, not that of a set of pages as a whole. We address this problem using essential pages. If we denote by $\mathbb{S}_Q$ the total knowledge that exists on the Web about a given query $Q$, we want to build a search engine that returns a set of essential pages $E_Q$ that maximizes the information covered over $\mathbb{S}_Q$. We present a preliminary prototype that optimizes the selection of essential pages; we draw some informal comparisons with respect to existing search engines; and finally, we evaluate our prototype using a blind-test user study.
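
The objective stated in the summary, choosing a set $E_Q$ whose pages jointly cover as much of $\mathbb{S}_Q$ as possible, resembles a maximum-coverage problem. This record does not describe the paper's actual selection method, so the following is only a minimal sketch using the standard greedy maximum-coverage heuristic as a stand-in. It assumes $\mathbb{S}_Q$ can be modeled as a finite set of topic labels; the function name `select_essential_pages`, the parameter `k`, and the sample data are all hypothetical.

```python
from typing import Dict, List, Set


def select_essential_pages(page_topics: Dict[str, Set[str]], k: int) -> List[str]:
    """Greedily pick up to k pages that together cover the most topics.

    Assumption: each page is represented by the set of topics it covers.
    This is a generic greedy heuristic, not the paper's algorithm.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(page_topics)  # copy so the caller's dict is untouched
    while remaining and len(chosen) < k:
        # Pick the page that contributes the most not-yet-covered topics.
        best = max(remaining, key=lambda p: len(remaining[p] - covered))
        if not remaining[best] - covered:
            break  # every remaining page is fully redundant
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical example: page "a" is redundant once "b" has been selected.
pages = {
    "a": {"history", "rules"},
    "b": {"history", "rules", "players"},
    "c": {"equipment"},
    "d": {"players"},
}
selected = select_essential_pages(pages, 2)
print(selected)
```

A ranking that scores each page on its own would likely place "a" directly after "b", even though "a" adds nothing new. The set-level objective prefers "c", which covers a topic that no selected page has covered yet.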