Achieving High Accuracy in Predicting the Probability of Periprosthetic Joint Infection From Synovial Fluid in Patients Undergoing Hip or Knee Arthroplasty: The Development and Validation of a Multivariable Machine Learning Algorithm
Published in | Curēus (Palo Alto, CA) Vol. 15; no. 12; p. e51036 |
---|---|
Main Authors | , , , , , , , |
Format | Journal Article |
Language | English |
Published | United States; Cureus Inc; 24.12.2023; Cureus |
Summary: | Background and objective The current periprosthetic joint infection (PJI) diagnostic guidelines require clinicians to interpret and integrate multiple criteria into a complex scoring system. In addition, PJI classifications are often inconclusive and fail to provide a clinical diagnosis. Machine learning (ML) models could be leveraged to reduce reliance on these complex systems and thereby reduce diagnostic uncertainty. This study aimed to develop an ML algorithm using synovial fluid (SF) test results to establish a PJI probability score.
Methods We used a large clinical laboratory's dataset of SF samples, aspirated from patients with hip or knee arthroplasty as part of a PJI evaluation. Patient age and SF biomarkers [white blood cell count, neutrophil percentage (%PMN), red blood cell count, absorbance at 280 nm wavelength, C-reactive protein (CRP), alpha-defensin (AD), neutrophil elastase, and microbial antigen (MID) tests] were used for model development. Data preprocessing, principal component analysis, and unsupervised clustering (K-means) revealed four clusters of samples that naturally aggregated based on biomarker results. Analysis of the characteristics of these four clusters revealed three clusters (n=13,133) whose samples had biomarker results typical of a PJI-negative classification and one cluster (n=4,032) whose samples had biomarker results typical of a PJI-positive classification. A decision tree model, trained and tested independently of external diagnostic rules, was then developed to match the classification determined by the unsupervised clustering. The performance of the model was assessed against modified 2018 International Consensus Meeting (ICM) criteria, in both the test cohort and an independent unlabeled validation set of 5,601 samples. The SHAP (SHapley Additive exPlanations) method was used to explore feature importance.
Results The ML model showed an area under the curve of 0.993, with a sensitivity of 98.8%, specificity of 97.3%, positive predictive value (PPV) of 92.9%, and negative predictive value (NPV) of 99.8% in predicting the modified 2018 ICM diagnosis among test set samples. The model maintained its diagnostic accuracy in the validation cohort, yielding 99.1% sensitivity, 97.1% specificity, 91.9% PPV, and 99.9% NPV. The model's inconclusive rate (diagnostic probability between 20% and 80%) in the validation cohort was only 1.3%, lower than that observed with the modified 2018 ICM PJI classification (7.4%; p<0.001). The SHAP analysis found that AD was the most important feature in the model, exhibiting dominance among >95% of "infected" and "not infected" diagnoses. Other important features were the sum of the MID test panel, %PMN, and SF-CRP.
Conclusions Although defined methods and tools for diagnosing PJI using multiple biomarker criteria are available, they are not consistently applied or widely implemented. Algorithmic interpretation of these biomarkers is needed so that results are interpreted consistently when driving treatment decisions. The new model, using clinical parameters measured from a patient's SF sample, renders a preoperative probability score for PJI that performs well compared to a modified 2018 ICM definition. Taken together with other clinical signs, this model has the potential to increase the accuracy of clinical evaluations and reduce the rate of inconclusive classification, thereby enabling more appropriate and expedited downstream treatment decisions. |
---|---|
ISSN: | 2168-8184 |
DOI: | 10.7759/cureus.51036 |
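
The summary reports sensitivity, specificity, PPV, and NPV against a reference classification, and treats a diagnostic probability between 20% and 80% as inconclusive. The sketch below shows how such figures could be computed from a confusion matrix and a list of probability scores. It is an illustration only, not the authors' code: the names `Confusion`, `diagnostic_metrics`, `classify`, and `inconclusive_rate` are hypothetical, the counts and probabilities are toy values that do not come from the study, and the handling of scores exactly at 20% or 80% (treated here as conclusive) is an assumption, since the summary does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of model calls against a reference classification."""
    tp: int  # reference positive, model positive
    fp: int  # reference negative, model positive
    tn: int  # reference negative, model negative
    fn: int  # reference positive, model negative


def diagnostic_metrics(c: Confusion) -> dict:
    """Return the four standard diagnostic accuracy measures as fractions."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


def classify(probability: float, low: float = 0.20, high: float = 0.80) -> str:
    """Map a probability score to a three-way call.

    Assumption: scores strictly between `low` and `high` are inconclusive;
    scores at the boundaries count as conclusive.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= high:
        return "infected"
    if probability <= low:
        return "not infected"
    return "inconclusive"


def inconclusive_rate(probabilities: list) -> float:
    """Fraction of samples whose score falls in the inconclusive band."""
    calls = [classify(p) for p in probabilities]
    return calls.count("inconclusive") / len(calls)


if __name__ == "__main__":
    # Toy counts chosen for easy arithmetic; NOT data from the study.
    toy = Confusion(tp=90, fp=10, tn=190, fn=10)
    for name, value in diagnostic_metrics(toy).items():
        print(f"{name}: {value:.1%}")

    # Toy probability scores; NOT data from the study.
    scores = [0.05, 0.50, 0.95, 0.20, 0.80]
    print([classify(p) for p in scores])
    print(f"inconclusive rate: {inconclusive_rate(scores):.1%}")
```

PPV and NPV depend on how common infection is in the cohort being scored, so values computed this way describe that cohort and do not transfer directly to populations with a different prevalence.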