A framework for understanding Latent Semantic Indexing (LSI) performance

Bibliographic Details
Published in: Information Processing & Management, Vol. 42, no. 1, pp. 56-73
Main Authors: Kontostathis, April; Pottenger, William M.
Format: Journal Article
Language: English
Published: Oxford, Elsevier Ltd, 2006
Elsevier Science
Elsevier Science Ltd
Summary: In this paper we present a theoretical model for understanding the performance of Latent Semantic Indexing (LSI) search and retrieval applications. Many models for understanding LSI have been proposed; ours is the first to study the values produced by LSI in the term-by-dimension vectors. The framework presented here is based on term co-occurrence data. We show a strong correlation between second-order term co-occurrence and the values produced by the Singular Value Decomposition (SVD) algorithm that forms the foundation for LSI. We also present a mathematical proof that the SVD algorithm encapsulates term co-occurrence information.
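
Illustration: the relationship the summary describes between second-order term co-occurrence and SVD values can be seen on a toy example. The sketch below is not taken from the paper; the three-term, two-document matrix and all variable names are hypothetical, and NumPy is assumed to be available. Terms A and C never appear in the same document, but both appear with B, so they co-occur only at second order.

```python
import numpy as np

# Hypothetical toy term-by-document matrix (rows: terms A, B, C; columns: docs d1, d2).
# d1 contains A and B; d2 contains B and C. A and C never share a document.
X = np.array([
    [1.0, 0.0],  # A
    [1.0, 1.0],  # B
    [0.0, 1.0],  # C
])

# First-order co-occurrence: number of documents in which two terms appear together.
first_order = X @ X.T
np.fill_diagonal(first_order, 0.0)  # ignore a term's co-occurrence with itself

# Second-order co-occurrence: number of shared first-order neighbours
# (A and C are linked through B).
second_order = first_order @ first_order

# Rank-k truncated SVD, the decomposition underlying LSI (k = 1 is an assumption here).
k = 1
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]

# Term-by-term similarity in the reduced space: U_k * diag(s_k^2) * U_k^T.
lsi_term_term = (U_k * s_k**2) @ U_k.T

print("first-order  A-C:", first_order[0, 2])
print("second-order A-C:", second_order[0, 2])
print("LSI (k=1)    A-C:", lsi_term_term[0, 2])
```

In this toy case the first-order count for the pair (A, C) is zero, while both the second-order count and the rank-1 LSI similarity are positive, which is the kind of effect the summary attributes to the SVD.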
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2004.11.007