Accuracy of an automated knowledge base for identifying drug adverse reactions

Bibliographic Details
Published in Journal of biomedical informatics Vol. 66; pp. 72-81
Main Authors Voss, E.A., Boyce, R.D., Ryan, P.B., van der Lei, J., Rijnbeek, P.R., Schuemie, M.J.
Format Journal Article
Language English
Published United States Elsevier Inc 01.02.2017
Summary:
• Classifier to discern between drugs causing and not causing a condition.
• Evidence in classifier was highly predictive of reference sets.
• Method to integrate multiple sources of evidence about drugs and conditions.
• Two manually-created reference sets of drug-condition pairs trained an automated classifier.

Drug safety researchers seek to know the degree of certainty with which a particular drug is associated with an adverse drug reaction. Pharmacovigilance draws on several sources of information to identify, evaluate, and disseminate medical product safety evidence, including spontaneous reports, published peer-reviewed literature, and product labels. Automated data processing and classification using these evidence sources can greatly reduce the manual curation currently required to develop reference sets of positive and negative controls (i.e., drugs that cause adverse drug events and those that do not) to be used in drug safety research. In this paper we explore a method for automatically aggregating disparate sources of information into a single repository, developing a predictive model to classify drug-adverse event relationships, and applying those predictions to a real-world problem of identifying negative controls for statistical method calibration. Our results showed high predictive accuracy for the models combining all available evidence, with an area under the receiver operating characteristic curve of ≥0.92 when tested on three manually generated lists of drugs and conditions that are known to either have or not have an association with an adverse drug event. Results from a pilot implementation of the method suggest that it is feasible to develop a scalable alternative to the time- and resource-intensive manual curation exercise previously applied to develop reference sets of positive and negative controls to be used in drug safety research.
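The summary reports classifier performance as an area under the ROC curve. As a rough illustration of what that metric measures, the following minimal sketch computes it for a handful of scored drug-condition pairs. The function name `roc_auc`, the labels, and the scores are hypothetical and are not taken from the paper; the paper's own models and data are not reproduced here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a positive control, 0 for a negative control.
    scores: classifier output; higher means a more likely association.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative control")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (illustrative only, not from the paper).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every positive control is scored above every negative control, while 0.5 corresponds to random ranking.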
ISSN: 1532-0464, 1532-0480
DOI:10.1016/j.jbi.2016.12.005