A Natural Language Thesaurus for the Humanities: The Need for a Database Search Aid

Bibliographic Details
Published in: The Library Quarterly (Chicago), Vol. 68, no. 4, pp. 406-430
Main Authors: Knapp, Sara D.; Cohen, Laura B.; Juedes, D. R.
Format: Journal Article
Language: English
Published: Chicago, IL: University of Chicago Press, 01.10.1998
University of Chicago, acting through its Press
ISSN: 0024-2519; 1549-652X
DOI: 10.1086/603001

Summary: Database searching presents special difficulties for humanists because many subjects may be covered, many synonyms may be used to describe a single concept, and terms may vary in precision. Databases may be searched by using controlled vocabularies, free-text (natural language) terms, or a combination of both. A significant cause of recall failure in a free-text search is the inability of the searcher to think of all the terms an author may have used. The current study was undertaken to determine the potential value to humanists of a thesaurus integrating free-text terms from the humanities and social sciences. In the first part of the study, a sample of common-noun subject headings from the "Humanities Index" was analyzed to determine how many have at least quasi-synonymous terms. The subject headings were compared to terms in "The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching" to determine the overlap of terminology between the humanities and social sciences. The results indicate a high degree of overlap, suggesting that a thesaurus integrating terms from the humanities and the social sciences would be of value to scholars in both disciplines. Results also demonstrate that a high proportion of common-noun subject headings have at least quasi-synonymous terms useful for searching. In the second part of the study, searches for humanities scholars were conducted on controlled-vocabulary databases, using both controlled vocabulary and free-text terms to determine whether the latter retrieved additional relevant records not retrieved by the controlled vocabulary. The results indicate that combining both approaches yields more relevant items and higher recall than either method alone. Searchers need tools to identify both controlled-vocabulary terms and free-text terms. The proposed free-text thesaurus will complement controlled-vocabulary thesauri.