Analyzing adverse drug reaction using statistical and machine learning methods


Bibliographic Details
Published in Medicine (Baltimore) Vol. 101; no. 25; p. e29387
Main Authors Hae Reong Kim, MS, MinDong Sung, MD, Ji Ae Park, MS, Kyeongseob Jeong, BS, Ho Heon Kim, MS, Suehyun Lee, PhD, Yu Rang Park, PhD
Format Journal Article
Language English
Published Wolters Kluwer 01.06.2022
Summary: Background: Adverse drug reactions (ADRs) are unintended negative drug-induced responses. Determining the association between drugs and ADRs is crucial, and several methods have been proposed to demonstrate this association. This systematic review aimed to examine the analytical tools by considering original articles that utilized statistical and machine learning methods for detecting ADRs. Methods: A systematic literature review was conducted based on articles published between 2015 and 2020. The keywords used were statistical, machine learning, and deep learning methods for detecting ADR signals. The study was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Results: We reviewed 72 articles, of which 51 and 21 addressed statistical and machine learning methods, respectively. Electronic medical record (EMR) data were exclusively analyzed using the regression method. For FDA Adverse Event Reporting System (FAERS) data, components of the disproportionality method were preferable. DrugBank was the most used database for machine learning. Other methods accounted for the highest proportion, and supervised methods accounted for the second highest. Conclusions: Using the 72 main articles, this review provides guidelines on which databases are frequently utilized and which analysis methods can be connected. For statistical analysis, >90% of the cases were analyzed by disproportionality or regression analysis of spontaneous reporting system (SRS) data or electronic medical record (EMR) data, respectively; for machine learning research, however, there was a strong tendency to analyze various data combinations. Only half of the DrugBank database was occupied, and the k-nearest neighbor method accounted for the greatest proportion.
ISSN:0025-7974
1536-5964
DOI:10.1097/MD.0000000000029387