Subnational analysis and modelling of the Ebola epidemic in West Africa, 2013-2016 : application of machine learning algorithms for case fatality imputation
Main Author | |
---|---|
Format | Dissertation |
Language | English |
Published | Imperial College London, 2020 |
Summary: | Background: The 2013-2016 West African Ebola epidemic has been the largest to date, with more than 11,000 deaths in the affected countries. The data collected have provided more insight into the case fatality ratio (CFR) and how it varies with age and other characteristics. However, the accuracy and precision of the naïve CFR remain limited because 44% of survival outcomes were unreported. Methods: Using a machine learning (MaLe) model, Boosted Regression Tree (BRT), I imputed unreported survival outcomes (i.e. survival or death) and corrected for model imperfection, giving three CFR estimates: without imputation, with imputation, and with imputation plus adjustment. I used semivariogram analysis and kriging to investigate subnational heterogeneities in CFR estimates. I used simulations to evaluate the performance of various MaLe inference methods for the estimation of CFR under different outbreak data scenarios. Results: The adjusted CFR estimates were 82.8% (95% CI 45.6%-85.6%) overall and 89.1% (95% CI 40.8%-91.6%), 65.6% (95% CI 61.3%-69.6%) and 79.2% (95% CI 45.4%-84.1%) for Sierra Leone, Guinea and Liberia, respectively. BRT modelling accounted for most of the spatiotemporal variation and interactions in CFR, but moderate spatial autocorrelation remained. Combining district-level CFR estimates and kriged district-level residuals provided the best linear unbiased map of CFR. Temporal autocorrelation was not observed in the district-level residuals from the BRT estimates. Finally, I observed that the performance of MaLe inference methods for CFR imputation varies under different outbreak data scenarios. Conclusions: Adjusted CFR estimates improved on the naïve CFR estimates obtained without imputation and were more representative. Used in conjunction with other resources, adjusted CFR estimates and the unbiased CFR maps will inform future public health response to Ebola outbreaks. 
I confirm that, across the board, data imputation with adjustment for the sensitivity and specificity of MaLe inference methods reduces the bias in CFR estimates. |
---|---|
Bibliography: | 0000000507359718 Commonwealth Scholarship Commission |
DOI: | 10.25560/93113 |
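
The summary describes adjusting imputed CFR estimates for the sensitivity and specificity of the inference method, but it does not give the formula. The sketch below illustrates one standard way to make that kind of correction, the Rogan-Gladen estimator, applied only to the cases whose outcomes were imputed. Treat it as an assumption about the general idea, not as the dissertation's actual procedure; the function names and all counts in the usage lines are hypothetical.

```python
def rogan_gladen(apparent: float, sensitivity: float, specificity: float) -> float:
    """Correct an apparent proportion of predicted deaths for classifier error.

    Assumption: this is the textbook Rogan-Gladen correction, used here only
    as an illustration of a sensitivity/specificity adjustment.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    corrected = (apparent + specificity - 1.0) / denom
    # The corrected proportion can fall outside [0, 1]; clip it.
    return min(1.0, max(0.0, corrected))


def adjusted_cfr(known_deaths: int, known_survivors: int,
                 imputed_deaths: int, imputed_survivors: int,
                 sensitivity: float, specificity: float) -> float:
    """Combine reported outcomes with bias-corrected imputed outcomes."""
    n_known = known_deaths + known_survivors
    n_imputed = imputed_deaths + imputed_survivors
    if n_known + n_imputed == 0:
        raise ValueError("no cases supplied")
    if n_imputed == 0:
        # Nothing was imputed, so this is just the naive CFR.
        return known_deaths / n_known
    apparent = imputed_deaths / n_imputed
    corrected = rogan_gladen(apparent, sensitivity, specificity)
    # Expected deaths among imputed cases after the correction.
    expected_imputed_deaths = corrected * n_imputed
    return (known_deaths + expected_imputed_deaths) / (n_known + n_imputed)


# Hypothetical counts, not taken from the dissertation.
example = adjusted_cfr(known_deaths=60, known_survivors=40,
                       imputed_deaths=50, imputed_survivors=50,
                       sensitivity=0.9, specificity=0.8)
```

With a perfect classifier (sensitivity and specificity both 1) the correction leaves the imputed proportion unchanged, which is a quick sanity check on the sketch.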