Deep learning for automatic head and neck lymph node level delineation provides expert-level accuracy
Published in | Frontiers in oncology Vol. 13; p. 1115258 |
---|---|
Format | Journal Article |
Language | English |
Published | Switzerland: Frontiers Media S.A, 16.02.2023 |
Summary: | Deep learning-based head and neck lymph node level (HN_LNL) autodelineation is of high relevance to radiotherapy research and clinical treatment planning but still underinvestigated in academic literature. In particular, there is no publicly available open-source solution for large-scale autosegmentation of HN_LNL in the research setting.
An expert-delineated cohort of 35 planning CTs was used for training of an nnU-net 3D-fullres/2D-ensemble model for autosegmentation of 20 different HN_LNL. A second cohort acquired at the same institution later in time served as the test set (n = 20). In a completely blinded evaluation, 3 clinical experts rated the quality of deep learning autosegmentations in a head-to-head comparison with expert-created contours. For a subgroup of 10 cases, intraobserver variability was compared to the average deep learning autosegmentation accuracy on the original and recontoured set of expert segmentations. A postprocessing step to adjust craniocaudal boundaries of level autosegmentations to the CT slice plane was introduced and the effect of autocontour consistency with CT slice plane orientation on geometric accuracy and expert rating was investigated.
Blinded expert ratings for deep learning segmentations and expert-created contours were not significantly different. Deep learning segmentations with slice plane adjustment were rated numerically higher (mean, 81.0 vs. 79.6, p = 0.185) and deep learning segmentations without slice plane adjustment were rated numerically lower (77.2 vs. 79.6, p = 0.167) than manually drawn contours. In a head-to-head comparison, deep learning segmentations with CT slice plane adjustment were rated significantly better than deep learning contours without slice plane adjustment (81.0 vs. 77.2, p = 0.004). Geometric accuracy of deep learning segmentations was not different from intraobserver variability (mean Dice per level, 0.76 vs. 0.77, p = 0.307). Clinical significance of contour consistency with CT slice plane orientation was not represented by geometric accuracy metrics (volumetric Dice, 0.78 vs. 0.78, p = 0.703).
We show that an nnU-net 3D-fullres/2D-ensemble model can be used for highly accurate autodelineation of HN_LNL using only a limited training dataset, and that such a model is ideally suited for large-scale standardized autodelineation of HN_LNL in the research setting. Geometric accuracy metrics are only an imperfect surrogate for blinded expert rating. |
---|---|
Bibliography: | ORCID: Florian Putz, orcid.org/0000-0003-3966-2872. This article was submitted to Radiation Oncology, a section of the journal Frontiers in Oncology. Reviewed by: Jesper Grau Eriksen, Aarhus University Hospital, Denmark; Ruta Zukauskaite, Odense University Hospital, Denmark; Lois Holloway, South Western Sydney Local Health District, Australia; Mathis Rasmussen, Aarhus University, Denmark; Anne Andresen, Aarhus University Hospital, Denmark. Edited by: Christian Rønn Hansen, Odense University Hospital, Denmark. |
ISSN: | 2234-943X |
DOI: | 10.3389/fonc.2023.1115258 |
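
The summary reports geometric accuracy as a mean volumetric Dice per lymph node level. As a minimal sketch of how such a per-level score can be computed, the Python example below compares two integer label maps. It is not the authors' evaluation code: the function name `dice_per_level`, the label encoding (one integer per level, 0 for background), and the toy arrays are all assumptions made for illustration.

```python
import numpy as np


def dice_per_level(pred, ref, labels):
    """Volumetric Dice for each label in two integer label maps of equal shape.

    Hypothetical helper: assumes each lymph node level is encoded as one
    integer label and that 0 is background.
    """
    if pred.shape != ref.shape:
        raise ValueError("pred and ref must have the same shape")
    scores = {}
    for label in labels:
        p = pred == label
        r = ref == label
        denom = int(p.sum()) + int(r.sum())
        if denom == 0:
            # Dice is undefined when the level is absent from both volumes.
            scores[label] = float("nan")
        else:
            overlap = int(np.logical_and(p, r).sum())
            scores[label] = 2.0 * overlap / denom
    return scores


# Toy volumes (assumed data): label 1 covers 32 voxels in each map,
# with 16 voxels in common, so Dice = 2 * 16 / (32 + 32) = 0.5.
pred = np.zeros((4, 4, 4), dtype=np.int16)
ref = np.zeros_like(pred)
pred[0:2, :, :] = 1
ref[1:3, :, :] = 1

scores = dice_per_level(pred, ref, labels=[1, 2])
# Average over the levels that are present in at least one volume.
mean_dice = float(np.nanmean(list(scores.values())))
```

Levels missing from both volumes are returned as NaN and skipped by `np.nanmean`, so they do not inflate or deflate the per-level mean. As the summary notes, a score of this kind is only an imperfect surrogate for blinded expert rating.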