A Natural Language Thesaurus for the Humanities: The Need for a Database Search Aid
Published in | The Library quarterly (Chicago) Vol. 68; no. 4; pp. 406 - 430 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Chicago, Il; University of Chicago Press; 01.10.1998; University of Chicago, acting through its Press |
ISSN | 0024-2519, 1549-652X |
DOI | 10.1086/603001 |
Summary: | Database searching presents special difficulties for humanists because many subjects may be covered, many synonyms may be used to describe a single concept, and terms may vary in precision. Databases may be searched by using controlled vocabularies, free-text (natural language) terms, or a combination of both. A significant cause of recall failure in a free-text search is the inability of the searcher to think of all the terms an author may have used. The current study was undertaken to determine the potential value to humanists of a thesaurus integrating free-text terms from the humanities and social sciences. In the first part of the study, a sample of common-noun subject headings from the "Humanities Index" was analyzed to determine how many have at least quasi-synonymous terms. The subject headings were compared to terms in "The Contemporary Thesaurus of Social Science Terms and Synonyms: A Guide for Natural Language Computer Searching" to determine the overlap of terminology between the humanities and social sciences. The results indicate a high degree of overlap, suggesting that a thesaurus integrating terms from the humanities and the social sciences would be of value to scholars in both disciplines. Results also demonstrate that a high proportion of common-noun subject headings have at least quasi-synonymous terms useful for searching. In the second part of the study, searches for humanities scholars were conducted on controlled-vocabulary databases, using both controlled vocabulary and free-text terms to determine whether the latter retrieved additional relevant records not retrieved by the controlled vocabulary. The results indicate that combining both approaches yields more relevant items and higher recall than either method alone. Searchers need tools to identify both controlled-vocabulary terms and free-text terms. The proposed free-text thesaurus will complement controlled-vocabulary thesauri. |
---|---|