Development and Validation of Interpretable Machine Learning Models for Clinically Significant Prostate Cancer Diagnosis in Patients With Lesions of PI‐RADS v2.1 Score ≥3
Published in | Journal of Magnetic Resonance Imaging Vol. 60; no. 5; pp. 2130-2141 |
---|---|
Main Authors | , , , , , , |
Format | Journal Article |
Language | English |
Published | Hoboken, USA; John Wiley & Sons, Inc; 01.11.2024; Wiley Subscription Services, Inc |
Summary: | Background
For patients with PI‐RADS v2.1 ≥ 3, prostate biopsy is strongly recommended. Due to the unsatisfactory positive rate of biopsy, improvements in clinically significant prostate cancer (csPCa) risk assessments are required.
Purpose
To develop and validate machine learning (ML) models based on clinical and imaging parameters for csPCa detection in patients with PI‐RADS v2.1 ≥ 3.
Study Type
Retrospective.
Subjects
One thousand eighty‐three patients with PI‐RADS v2.1 ≥ 3, randomly split into training (70%, N = 759) and validation (30%, N = 324) datasets, and 147 patients enrolled prospectively for testing.
Field Strength/Sequence
3.0 T scanners/T2‐weighted fast spin echo sequence and DWI with diffusion‐weighted single‐shot gradient echo planar imaging sequence.
Assessment
The factors evaluated for csPCa detection were age, prostate‐specific antigen, prostate volume, the diameter and location of the index lesion, and PI‐RADS v2.1 score. Five ML models for csPCa detection were developed: logistic regression (LR), extreme gradient boosting, random forest (RF), decision tree, and support vector machine. csPCa was defined as Gleason grade ≥2.
Statistical Tests
Univariable and multivariable LR analyses were used to identify parameters associated with csPCa. The area under the receiver operating characteristic curve (AUC), Brier score, and DeLong test were used to assess csPCa diagnostic performance and to compare each model with the LR model. The significance level was defined as 0.05.
Results
The RF model exhibited the highest AUC (0.880–0.904) and lowest Brier score (0.125–0.133) among the ML models in the validation and testing cohorts; however, the difference from the LR model was not statistically significant (P = 0.453 and 0.548). The sensitivity and negative predictive values in the validation and testing cohorts were 93.8%–97.6% and 82.7%–95.1%, respectively, at a threshold of 0.450 (99% sensitivity of the RF model).
Data Conclusion
The RF model might help assess the risk of csPCa and prevent overdiagnosis and unnecessary biopsy in men with PI‐RADS v2.1 ≥ 3.
Evidence Level
3
Technical Efficacy
Stage 2 |
---|---|
Bibliography: | Mingjian Ruan and Yi Liu contributed equally to this work. |
ISSN: | 1053-1807 1522-2586 |
DOI: | 10.1002/jmri.29275 |
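
The summary reports AUC, Brier score, and sensitivity and negative predictive value at a probability threshold of 0.450. As a rough illustration of how such metrics are computed from predicted probabilities, the following is a minimal sketch and not the authors' code. The function names and the toy labels and probabilities (`Y_TRUE`, `Y_PROB`) are hypothetical, and treating a probability at or above the threshold as a positive prediction is an assumption. The DeLong test is not implemented here.

```python
from typing import Sequence, Tuple


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher probability; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_npv(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and negative predictive value at a given threshold.

    Assumption: a case is predicted positive when its probability is
    greater than or equal to the threshold.
    """
    tp = fn = tn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif not predicted_positive:
            tn += 1
    sensitivity = tp / (tp + fn)  # detected positives / all positives
    npv = tn / (tn + fn)          # true negatives / all predicted negatives
    return sensitivity, npv


# Hypothetical toy data (1 = csPCa, 0 = no csPCa); not taken from the study.
Y_TRUE = [1, 1, 1, 1, 0, 0, 0, 0]
Y_PROB = [0.90, 0.80, 0.60, 0.40, 0.50, 0.30, 0.20, 0.10]

if __name__ == "__main__":
    print("AUC:", auc(Y_TRUE, Y_PROB))
    print("Brier score:", brier_score(Y_TRUE, Y_PROB))
    print("Sensitivity, NPV at 0.450:", sensitivity_npv(Y_TRUE, Y_PROB, 0.450))
```

A lower Brier score indicates better-calibrated probabilities, while AUC depends only on how cases are ranked, which is why the two are reported together.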