TOPIC DETECTION OF UNRESTRICTED TEXTS: APPROACHES AND EVALUATIONS

Bibliographic Details
Published in Applied Artificial Intelligence Vol. 19; no. 2; pp. 119-135
Main Author Chali, Yllias
Format Journal Article
Language English
Published Taylor & Francis Group 26.01.2005
Summary: Topic detection and tracking refers to automatic techniques for locating topically related, cohesive paragraphs in a stream of text. Most documents are about more than one subject, but many Natural Language Processing (NLP) and Information Retrieval (IR) techniques implicitly assume that a document has just one topic. Even when a document has a single topic, it may address multiple subtopics and various aspects of that primary topic. Hence, dividing documents into topically coherent units and discovering their topics might have many uses. We describe new clues that account for the topic of groupings of contiguous portions of the text. These clues are based on general lexical resources, which makes them applicable to unrestricted texts. They can have many uses, such as helping users find answers to general questions in an information search task, in question-answering systems, or in text summarization. We devise an algorithm for identifying these clues, and we report on their performance as well as the improvements suggested by our experiments.
ISSN: 0883-9514, 1087-6545
DOI: 10.1080/08839510590887441