Better Together: Improving the Lives of Metadata Creators with Natural Language Processing

Bibliographic Details
Published in The code4lib journal no. 51
Main Author Paul Kelley
Format Journal Article
Language English
Published Code4Lib 01.06.2021
Summary: DC Public Library has long held digital copies of the full run of the local alternative weekly Washington City Paper, but had no official status as a rights grantor to enable their use. That recently changed when a full agreement was reached with the publisher. One condition of that agreement, however, was that issues become available with usable descriptive metadata and subject access in time to celebrate the publication’s upcoming 40th anniversary, which was then six months away. Assigning description to digital objects is one of the most time-intensive tasks our metadata specialists perform. This paper details how we applied Python’s Natural Language Toolkit and OpenRefine’s reconciliation functions to the collection’s OCR text to simplify subject selection for staff with no background in programming.
ISSN: 1940-5758