Comparing deep learning-based auto-segmentation of organs at risk and clinical target volumes to expert inter-observer variability in radiotherapy planning
Published in | Radiotherapy and oncology Vol. 144; pp. 152 - 158 |
---|---|
Main Authors | , , , , , , , , , |
Format | Journal Article |
Language | English |
Published | Ireland, Elsevier B.V, 01.03.2020 |
Summary: | • Deep learning-based auto-segmented contours (DC) can provide significant time savings.
• DCs for organs at risk accurately reproduce expert contours.
• DCs for target volumes are less accurate but may serve as a template for manual edits.
Deep learning-based auto-segmented contours (DC) aim to alleviate labour-intensive contouring of organs at risk (OAR) and clinical target volumes (CTV). Most previous DC validation studies have a limited number of expert observers for comparison and/or use a validation dataset related to the training dataset. We determine whether DC models are comparable to Radiation Oncologist (RO) inter-observer variability on an independent dataset.
Expert contours (EC) were created by multiple ROs for central nervous system (CNS), head and neck (H&N), and prostate radiotherapy (RT) OARs and CTVs. DCs were generated using deep learning-based auto-segmentation software trained by a single RO on publicly available data. Contours were compared using Dice Similarity Coefficient (DSC) and 95% Hausdorff distance (HD).
Sixty planning CT scans had 2–4 ECs, for a total of 60 CNS, 53 H&N, and 50 prostate RT contour sets. The mean DC and EC contouring times were 0.4 vs 7.7 min for CNS, 0.6 vs 26.6 min for H&N, and 0.4 vs 21.3 min for prostate RT contours. There were minimal differences in DSC and 95% HD involving DCs for OAR comparisons, but more noticeable differences for CTV comparisons.
The accuracy of DCs trained by a single RO is comparable to expert inter-observer variability for the RT planning contours in this study. Use of deep learning-based auto-segmentation in clinical practice will likely lead to significant benefits to RT planning workflow and resources. |
---|---|
ISSN: | 0167-8140 1879-0887 |
DOI: | 10.1016/j.radonc.2019.10.019 |
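
The summary names two contour-agreement metrics, the Dice Similarity Coefficient (DSC) and the 95% Hausdorff distance (HD), without defining them. The sketch below illustrates one common way to compute both; it is not the study's implementation. It assumes each contour is given as a collection of point coordinates (for HD, typically the boundary points, in physical units), uses a nearest-rank percentile, and takes the larger of the two directed 95th-percentile distances. Other 95% HD conventions exist. The function names and the toy squares are hypothetical.

```python
import math


def dice(a, b):
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|) for two point sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty contours count as perfect agreement
    return 2.0 * len(a & b) / (len(a) + len(b))


def _directed_distances(src, dst):
    """Distance from each point in src to its nearest point in dst (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def _percentile(values, pct):
    """Nearest-rank percentile (an assumed convention; others interpolate)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def hd95(a, b, pct=95.0):
    """Symmetric percentile Hausdorff distance between two non-empty point sets."""
    a, b = list(a), list(b)
    if not a or not b:
        raise ValueError("both contours must contain at least one point")
    return max(
        _percentile(_directed_distances(a, b), pct),
        _percentile(_directed_distances(b, a), pct),
    )


# Toy example: two 4x4 squares on a unit grid, offset by one unit along x.
square_a = {(x, y) for x in range(0, 4) for y in range(0, 4)}
square_b = {(x, y) for x in range(1, 5) for y in range(0, 4)}

print(dice(square_a, square_b))  # 12 shared points out of 16 + 16 -> 0.75
print(hd95(square_a, square_b))  # the unmatched column lies 1 unit away -> 1.0
```

A higher DSC means more overlap (1.0 is identical), while a lower 95% HD means the contour boundaries lie closer together. The brute-force nearest-point search is quadratic in the number of points, so real contours would normally use a distance transform or a spatial index instead.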