Optimised phrase querying and browsing of large text databases

Bibliographic Details
Published in Proceedings 24th Australian Computer Science Conference. ACSC 2001, pp. 11-19
Main Authors Bahle, D., Williams, H.E., Zobel, J.
Format Conference Proceeding
Language English
Published IEEE 2001
Summary: Most search systems for querying large document collections, e.g., Web search engines, are based on well-understood information retrieval principles. These systems are both efficient and effective in finding answers to many user information needs, expressed through informal ranked or structured Boolean queries. Phrase querying and browsing are additional techniques that can augment or replace conventional querying tools. We propose optimisations for phrase querying with a nextword index, an efficient structure for phrase-based searching. We show that careful consideration of which search terms are evaluated in a query plan and optimisation of the order of evaluation of the plan can reduce query evaluation costs by more than a factor of five. We conclude that, for phrase querying and browsing with nextword indexes, an ordered query plan should be used for all browsing and querying. Moreover, we show that optimised phrase querying is practical on large text collections.
ISBN: 0769509630, 9780769509631
ISSN: 1530-0900, 2332-5720
DOI: 10.1109/ACSC.2001.906618
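
Illustration: the summary's idea of an ordered query plan over a nextword index can be sketched roughly as follows. This is a minimal toy, not the authors' implementation. It assumes a nextword index can be modelled as a map from adjacent word pairs to (document, position) postings, and that "ordering" means evaluating the rarest pair first. The names build_nextword_index and phrase_query are hypothetical, and the sketch evaluates every pair rather than choosing a subset of terms as the paper discusses.

```python
from collections import defaultdict


def build_nextword_index(docs):
    """Map each adjacent word pair to the set of (doc_id, position of first word)."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for pos in range(len(words) - 1):
            index[(words[pos], words[pos + 1])].add((doc_id, pos))
    return index


def phrase_query(index, phrase):
    """Return the set of (doc_id, start position) where the phrase occurs.

    Assumption: the plan is ordered by postings-list length, shortest first,
    so the candidate set shrinks as early as possible.
    """
    words = phrase.lower().split()
    if len(words) < 2:
        raise ValueError("a nextword lookup needs at least two words")

    # Each plan step is (offset of the pair within the phrase, the word pair).
    plan = [(i, (words[i], words[i + 1])) for i in range(len(words) - 1)]
    plan.sort(key=lambda step: len(index.get(step[1], ())))

    candidates = None
    for offset, pair in plan:
        # Shift each posting back to the position where the phrase would start.
        starts = {(doc, pos - offset) for doc, pos in index.get(pair, ())}
        candidates = starts if candidates is None else candidates & starts
        if not candidates:
            return set()  # no document can match; stop early
    return candidates
```

With docs = ["the quick brown fox", "quick brown dogs and the quick brown fox"], calling phrase_query(build_nextword_index(docs), "quick brown fox") evaluates the rarer pair ("brown", "fox") before ("quick", "brown"), so the second step only has to confirm two candidates.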