Automatic Classification of Structured Product Labels for Pregnancy Risk Drug Categories, a Machine Learning Approach

Bibliographic Details
Published in AMIA ... Annual Symposium proceedings Vol. 2015; pp. 1093-1102
Main Authors Rodriguez, Laritza M; Fushman, Dina Demner
Format Journal Article
Language English
Published United States American Medical Informatics Association 2015
Summary: Using regular expressions and manual review, we processed 18,342 FDA-approved drug product labels to determine whether any of the five standard pregnancy drug risk categories was mentioned in the label. After excluding 81 drugs with multiple risk categories, 83% of the labels had a risk category within the text and 17% did not. We trained a Sequential Minimal Optimization algorithm on the labels containing pregnancy risk information, segmented into standard document sections. To evaluate the classifier on the testing set, we used the Micromedex drug risk categories. The precautions section gave the best performance for assigning drug risk categories, achieving accuracy 0.79, precision 0.66, recall 0.64, and F1 measure 0.65. Missing pregnancy risk categories could be suggested using machine learning algorithms trained on the existing publicly available pregnancy risk information.
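The regular-expression screening step and the reported F1 measure can be illustrated with a minimal Python sketch. The record does not give the authors' actual patterns, so `CATEGORY_RE`, `find_risk_categories`, `triage`, and `f1_score` are hypothetical names, and the pattern assumes labels use phrasing such as "Pregnancy Category C" with the five letters A, B, C, D, and X. The classifier itself is not shown.

```python
import re

# Hypothetical pattern (an assumption, not the authors' expression):
# matches phrasing such as "Pregnancy Category C" or "Pregnancy Category: D".
CATEGORY_RE = re.compile(
    r"pregnancy\s+categor(?:y|ies)\s*:?\s*([ABCDX])\b",
    re.IGNORECASE,
)


def find_risk_categories(label_text):
    """Return the set of distinct category letters mentioned in the text."""
    return {m.group(1).upper() for m in CATEGORY_RE.finditer(label_text)}


def triage(label_text):
    """Sort a label into 'missing', 'multiple', or its single category letter."""
    categories = find_risk_categories(label_text)
    if not categories:
        return "missing"    # no category stated in the label text
    if len(categories) > 1:
        return "multiple"   # more than one category; the summary excludes such drugs
    return next(iter(categories))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(triage("Pregnancy Category C. Animal studies have shown..."))
    # Consistency check on the figures reported in the summary.
    print(round(f1_score(0.66, 0.64), 2))
```

With the reported precision of 0.66 and recall of 0.64, the harmonic mean is 2(0.66)(0.64)/(0.66 + 0.64), which is approximately 0.65 and agrees with the F1 measure in the summary.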
ISSN: 1559-4076