An automatic approach to weighted subject indexing: an empirical study in the biomedical domain

Bibliographic Details
Published in Journal of the Association for Information Science and Technology Vol. 66; no. 9; pp. 1776-1784
Main Authors Lu, Kun; Mao, Jin
Format Journal Article
Language English
Published Blackwell Publishing Ltd 01.09.2015
Summary: Subject indexing is an intellectually intensive process that has many inherent uncertainties. Existing manual subject indexing systems generally produce binary outcomes for whether or not to assign an indexing term. This does not sufficiently reflect the extent to which the indexing terms are associated with the documents. On the other hand, the idea of probabilistic or weighted indexing was proposed a long time ago and has seen success in capturing uncertainties in the automatic indexing process. One hurdle to overcome in implementing weighted indexing in manual subject indexing systems is the practical burden that could be added to the already intensive indexing process. This study proposes a method to infer automatically the associations between subject terms and documents through text mining. By uncovering the connections between MeSH descriptors and document text, we are able to derive the weights of MeSH descriptors manually assigned to documents. Our initial results suggest that the inference method is feasible and promising. The study has practical implications for improving subject indexing practice and providing better support for information retrieval.
Bibliography: istex:57E6F92A9AA66DE8F0A480B8FB31CCDCD8238A99
ArticleID: ASI23290
ark:/67375/WNG-QJ2DW5WP-Q
ISSN: 2330-1635, 2330-1643
DOI: 10.1002/asi.23290