Prediction of clinical trial enrollment rates

Bibliographic Details
Published in	PLoS ONE, Vol. 17, no. 2, p. e0263193
Main Authors	Bieganek, Cameron; Aliferis, Constantin; Ma, Sisi
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 24.02.2022
Subjects	Algorithms; Artificial intelligence; Censuses; Clinical trials; Clinical Trials, Phase I as Topic; Clinical Trials, Phase III as Topic; Content analysis; Datasets; Decision support systems; Design; Forecasting; Forecasts and trends; Health informatics; Humans; Learning algorithms; Machine Learning; Medicine; Models, Theoretical; Natural Language Processing; Patient Selection; Performance prediction; Statistics; Translational Science, Biomedical; United States; Minneapolis Minnesota; United States > US
More Information
Summary:	Clinical trials represent a critical milestone of translational and clinical sciences. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions all over the world. One way to reduce the cost incurred by insufficient enrollment is to avoid initiating trials that are most likely to fall short of their enrollment goal. Hence, the ability to predict, before a trial starts, which proposed trials will meet their enrollment goals is highly beneficial. In the current study, we leveraged a data set extracted from ClinicalTrials.gov that consists of 46,724 U.S.-based clinical trials from 1990 to 2020. We constructed 4,636 candidate predictors from data collected by ClinicalTrials.gov and external sources, and used them for enrollment rate prediction with various state-of-the-art machine learning methods. Using a nested time series cross-validation design, our models achieved good predictive performance that is generalizable to future data and stable over time. Moreover, information content analysis revealed study-design-related features to be the most informative feature type regarding enrollment. Models built with study-design-related features alone perform only marginally worse than models built with all features (AUC = 0.76 ± 0.02 vs. AUC = 0.78 ± 0.03). The results presented can form the basis for data-driven decision support systems to assess whether proposed clinical trials would likely meet their enrollment goal.
Bibliography:	Competing Interests: The authors have declared that no competing interests exist.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0263193
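
Note on the evaluation design named in the summary: this record does not describe how the authors implemented their nested time series cross-validation, so the following is only a generic sketch of the idea, assuming trials are grouped by a start period such as a calendar year. Each outer split trains on earlier periods and tests on the next one, and the inner splits used for model selection are drawn only from the outer training periods. The function names (forward_splits, nested_forward_splits, auc) and the min_train parameter are hypothetical, not taken from the paper.

```python
from typing import Iterator, List, Sequence, Tuple


def forward_splits(
    periods: Sequence[int], min_train: int = 1
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training periods, held-out period) pairs in chronological order."""
    ordered = sorted(set(periods))
    for i in range(min_train, len(ordered)):
        # Train on everything strictly before the held-out period.
        yield ordered[:i], ordered[i]


def nested_forward_splits(
    periods: Sequence[int], min_train: int = 1
) -> Iterator[Tuple[List[int], int, List[Tuple[List[int], int]]]]:
    """For each outer split, also return inner splits built only from its training periods."""
    # The outer loop needs one extra training period so that at least one
    # inner split exists for model selection.
    for outer_train, outer_test in forward_splits(periods, min_train + 1):
        inner = list(forward_splits(outer_train, min_train))
        yield outer_train, outer_test, inner


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In such a design, hyperparameters would be chosen on the inner splits and the chosen model scored once on each outer held-out period, so no information from a later period reaches the model that is evaluated on it.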