Head and neck tumor segmentation in PET/CT: The HECKTOR challenge
Published in | Medical image analysis Vol. 77; p. 102336 |
---|---|
Format | Journal Article |
Language | English |
Published | Netherlands, Elsevier B.V, 01.04.2022 |
Summary: | • The paper describes the first challenge on head and neck tumor segmentation in PET/CT. • Training (n=201, 4 centers) and test (n=53, 1 unseen center) sets amount to 254 cases. • All ground truth segmentations underwent cleaning to ensure quality and homogeneity. • The winning team obtained a DSC of 0.759, showing a large improvement over the baseline. • Additional post-challenge analyses (e.g. false-positive analysis, ranking stability) are presented.
This paper reports the post-challenge analysis of the first edition of the HEad and neCK TumOR (HECKTOR) challenge. The challenge was held as a satellite event of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020, and was the first of its kind to focus on lesion segmentation in combined FDG-PET and CT image modalities. The task is the automatic segmentation of the Gross Tumor Volume (GTV) of Head and Neck (H&N) oropharyngeal primary tumors in FDG-PET/CT images. To this end, the participants were given a training set of 201 cases from four different centers, and their methods were tested on a held-out set of 53 cases from a fifth center. The methods were ranked according to the Dice Similarity Coefficient (DSC) averaged across all test cases. An additional inter-observer agreement study was organized to assess the difficulty of the task from a human perspective. Sixty-four teams registered for the challenge, of which 10 submitted a paper detailing their approach. The best method obtained an average DSC of 0.7591, a large improvement over both our proposed baseline method (DSC of 0.6610) and the inter-observer agreement (DSC of 0.61). The automatic methods successfully leveraged the wealth of metabolic and structural information in the combined PET and CT modalities, significantly outperforming the human inter-observer agreement level, semi-automatic thresholding of PET images, and other single-modality methods. This promising performance is a step towards large-scale radiomics studies in H&N cancer that do not require error-prone and time-consuming manual delineation of GTVs. |
---|---|
ISSN: | 1361-8415 1361-8423 |
DOI: | 10.1016/j.media.2021.102336 |
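
The summary states that methods were ranked by the DSC averaged across all test cases. As a rough illustration of that metric only, the sketch below computes DSC = 2|A ∩ B| / (|A| + |B|) for binary masks and averages it over cases. It is not the challenge's evaluation code. The function names, the flat-list mask representation, and the handling of two empty masks are all assumptions made for this example.

```python
from statistics import mean


def dice(pred, truth):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as tumor in both masks.
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total tumor voxels in the prediction plus the ground truth.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * overlap / total


def mean_dice(cases):
    """Average the per-case DSC over (prediction, ground truth) pairs."""
    return mean(dice(pred, truth) for pred, truth in cases)


# Toy masks for illustration only; these are not challenge data.
toy_cases = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),  # overlap 1, sizes 2 and 1
    ([0, 1, 1, 0], [0, 1, 1, 0]),  # identical masks
]
print(mean_dice(toy_cases))
```

Averaging per case, rather than pooling voxels across cases, gives every patient equal weight regardless of tumor size. A real pipeline would typically operate on 3D arrays rather than flat lists.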