Better Together: Improving the Lives of Metadata Creators with Natural Language Processing
Published in | The code4lib journal no. 51 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | Code4Lib, 01.06.2021 |
Summary: | DC Public Library has long held digital copies of the full run of the local alternative weekly Washington City Paper, but had no official status as a rights grantor to enable their use. That recently changed when the library reached a full agreement with the publisher. One condition of that agreement, however, was that issues become available with usable descriptive metadata and subject access in time to celebrate the publication’s 40th anniversary, which was then six months away. Assigning descriptions to digital objects is one of the most time-intensive tasks our metadata specialists perform. This paper details how we applied Python’s Natural Language Toolkit and OpenRefine’s reconciliation functions to the collection’s OCR text to simplify subject selection for staff with no background in programming. |
---|---|
ISSN: | 1940-5758 |