Practical Training Approaches for Discordant Atopic Dermatitis Severity Datasets: Merging Methods With Soft-Label and Train-Set Pruning

Bibliographic Details
Published in IEEE journal of biomedical and health informatics Vol. 27; no. 1; pp. 166-175
Main Authors Cho, Soo Ick; Lee, Dongheon; Han, Byeol; Lee, Ji Su; Hong, Ji Yeon; Chung, Jin Ho; Lee, Dong Hun; Na, Jung-Im
Format Journal Article
Language English
Published United States IEEE 01.01.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Objective assessment of atopic dermatitis (AD) is essential for choosing proper management strategies. This study investigated the performance of convolutional neural network (CNN) models in grading the severity of AD. Five board-certified dermatologists independently evaluated the severity of 9,192 AD images. Severity was evaluated based on the Investigator's Global Assessment (IGA) and six signs of AD. For CNN training, we applied three distinct approaches: 1) ensemble vs. integration, 2) hard-label vs. soft-label, and 3) train-set pruning. For IGA prediction, the two best models were chosen based on the macro-averaged AUROC and F1 score. The ensemble-soft-label-pruning model was chosen based on its AUROC of 0.943 and 0.927 for the internal and external validation sets, respectively, and the integration-soft-label-whole dataset model was chosen based on its F1 score of 0.750 and 0.721 for the internal and external validation sets, respectively. CNN models trained on the multi-evaluator dataset outperformed models trained on an individual evaluator's dataset, and they performed better on the dataset in which the dermatologists' assessments were concordant. In conclusion, CNN models for AD could be improved by a labeled dataset from multiple evaluators, merging methods with soft-label, and train-set pruning.
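The summary does not spell out how the soft labels or the pruning rule were computed, so the following is only a minimal sketch of one common way to do it. It assumes that a soft label is the fraction of evaluators who chose each grade, that a hard label is the majority vote, and that pruning drops training images whose evaluator agreement falls below a threshold. The four-level grade scale, the image names, the ratings, and the 0.6 threshold are all hypothetical and are not taken from the study.

```python
from collections import Counter

# Assumption: a 4-level severity scale (grades 0..3). The record does not
# state how many IGA levels the study used.
NUM_GRADES = 4


def soft_label(grades, num_grades=NUM_GRADES):
    """Return the fraction of evaluators who chose each grade."""
    counts = Counter(grades)
    return [counts.get(g, 0) / len(grades) for g in range(num_grades)]


def hard_label(grades):
    """Return the majority-vote grade; ties go to the lower grade."""
    counts = Counter(grades)
    top = max(counts.values())
    return min(g for g, c in counts.items() if c == top)


def agreement(grades):
    """Return the share of evaluators who chose the most common grade."""
    return max(Counter(grades).values()) / len(grades)


def prune(ratings, min_agreement=0.6):
    """Keep only images whose evaluator agreement meets the threshold."""
    return {
        image: grades
        for image, grades in ratings.items()
        if agreement(grades) >= min_agreement
    }


# Hypothetical ratings from five evaluators per image.
ratings = {
    "img_001": [2, 2, 2, 3, 2],  # mostly concordant
    "img_002": [0, 1, 2, 3, 1],  # discordant
    "img_003": [1, 1, 1, 1, 1],  # fully concordant
}

pruned = prune(ratings)
soft_targets = {image: soft_label(g) for image, g in pruned.items()}
hard_targets = {image: hard_label(g) for image, g in pruned.items()}
```

In this sketch the discordant image is removed from the training set only; the soft targets could then be used with a loss that accepts class-probability targets, while the hard targets correspond to conventional one-class labels.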
ISSN: 2168-2194, 2168-2208
DOI: 10.1109/JBHI.2022.3218166