Automatic Classification of Structured Product Labels for Pregnancy Risk Drug Categories, a Machine Learning Approach

Bibliographic Details
Published in	AMIA ... Annual Symposium proceedings Vol. 2015; pp. 1093 - 1102
Main Authors	Rodriguez, Laritza M; Fushman, Dina Demner
Format	Journal Article
Language	English
Published	United States: American Medical Informatics Association, 2015
Subjects	Algorithms; Drug Labeling; Female; Humans; Machine Learning; Pregnancy; Pregnancy Complications; Risk; data-mining; drug risk; document classification; machine learning; knowledge extraction; pregnancy
Summary:	Using regular expressions and manual review, we processed 18,342 FDA-approved drug product labels to determine whether any of the five standard pregnancy drug risk categories was mentioned in the label. After excluding 81 drugs with multiple risk categories, 83% of the labels contained a risk category in the text and 17% did not. We trained a classifier with the Sequential Minimal Optimization (SMO) algorithm on the labels containing pregnancy risk information, segmented into standard document sections. To evaluate the classifier on the test set, we used the Micromedex drug risk categories. The precautions section gave the best performance for assigning drug risk categories, with accuracy 0.79, precision 0.66, recall 0.64, and F1 measure 0.65. Missing pregnancy risk categories could be suggested by machine learning algorithms trained on the existing, publicly available pregnancy risk information.
ISSN:	1559-4076
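
The summary mentions a regular-expression pass that checks whether a label names one of the five standard pregnancy risk categories (A, B, C, D, X). The sketch below illustrates what such a pass might look like. The pattern, the function names, and the sample label strings are all hypothetical; the record does not give the authors' actual expressions. The small `f1_score` helper assumes the reported F1 is the harmonic mean of the reported precision and recall.

```python
import re

# Hypothetical pattern, not the authors' expression. It matches phrasings
# such as "Pregnancy Category C" or "Pregnancy: Category D". The keyword
# part is case-insensitive; the category letter must be upper case.
CATEGORY_RE = re.compile(
    r"(?i:pregnancy\s*:?\s*(?:risk\s+)?category)\s*:?\s*([ABCDX])\b"
)


def find_categories(label_text):
    """Return the set of pregnancy risk category letters found in the text."""
    return {match.group(1) for match in CATEGORY_RE.finditer(label_text)}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Invented sample strings, for illustration only.
single = "Teratogenic Effects: Pregnancy Category C. Animal studies are lacking."
multiple = "Pregnancy Category D (oral). Pregnancy Category X (injection)."
missing = "Use in pregnancy has not been studied."

print(find_categories(single))
print(sorted(find_categories(multiple)))
print(find_categories(missing))
print(round(f1_score(0.66, 0.64), 2))
```

A label that yields more than one letter corresponds to the multiple-category case the summary says was excluded, and an empty set corresponds to a label with no stated category.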