Generation of an annotated reference standard for vaccine adverse event reports

Bibliographic Details
Published in Vaccine Vol. 36; no. 29; pp. 4325-4330
Main Authors Foster, Matthew; Pandey, Abhishek; Kreimeyer, Kory; Botsis, Taxiarchis
Format Journal Article
Language English
Published Netherlands Elsevier Ltd 05.07.2018
Elsevier Limited
Summary: As part of a collaborative project between the US Food and Drug Administration (FDA) and the Centers for Disease Control and Prevention for the development of a web-based natural language processing (NLP) workbench, we created a corpus of 1000 Vaccine Adverse Event Reporting System (VAERS) reports annotated for 36,726 clinical features, 13,365 temporal features, and 22,395 clinical-temporal links. This paper describes the final corpus, as well as the methodology used to create it, so that clinical NLP researchers outside FDA can evaluate the utility of the corpus to aid their own work. The creation of this standard went through four phases: pre-training, pre-production, production-clinical feature annotation, and production-temporal annotation. The pre-production phase used double annotation followed by adjudication to refine and finalize the annotation model, while the production phases followed a single-annotation strategy to maximize the number of reports in the corpus. An analysis of 30 reports randomly selected as part of a quality control assessment yielded accuracies of 0.97, 0.96, and 0.83 for clinical features, temporal features, and clinical-temporal associations, respectively, which speaks to the quality of the corpus.
ISSN: 0264-410X, 1873-2518
DOI: 10.1016/j.vaccine.2018.05.079