The aboutness of words

Word aboutness is defined as the relationship between words and subjects associated with them. An aboutness coefficient is developed to estimate the strength of the aboutness relationship. Words that are randomly distributed across subjects are assumed to lack aboutness and the degree to which their...

Full description

Saved in:
Bibliographic Details
Published inJournal of the American Society for Information Science and Technology Vol. 68; no. 10; pp. 2471 - 2483
Main Authors O'Neill, Edward T., Kammerer, Kerre A., Bennett, Rick
Format Journal Article
LanguageEnglish
Published Hoboken Wiley Periodicals Inc 01.10.2017
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Word aboutness is defined as the relationship between words and subjects associated with them. An aboutness coefficient is developed to estimate the strength of the aboutness relationship. Words that are randomly distributed across subjects are assumed to lack aboutness and the degree to which their usage deviates from a random pattern indicates the strength of the aboutness. To estimate aboutness, title words and their associated subjects are extracted from the titles of non‐fiction English language books in the OCLC WorldCat database. The usage patterns of the title words are analyzed and used to compute aboutness coefficients for each of the common title words. Words with low aboutness coefficients (An and In) are commonly found in stop word lists, whereas words with high aboutness coefficients (Carbonate, Autism) are unambiguous and have a strong subject association. The aboutness coefficient potentially can enhance indexing, advance authority control, and improve retrieval.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2330-1635
2330-1643
DOI:10.1002/asi.23856