Training high-performance deep learning classifier for diagnosis in oral cytology using diverse annotations

Bibliographic Details
Published in Scientific reports Vol. 14; no. 1; pp. 17591 - 8
Main Authors Sukegawa, Shintaro; Tanaka, Futa; Nakano, Keisuke; Hara, Takeshi; Ochiai, Takanaga; Shimada, Katsumitsu; Inoue, Yuta; Taki, Yoshihiro; Nakai, Fumi; Nakai, Yasuhiro; Ishihama, Takanori; Miyazaki, Ryo; Murakami, Satoshi; Nagatsuka, Hitoshi; Miyake, Minoru
Format Journal Article
Language English
Published London Nature Publishing Group UK 30.07.2024
Nature Publishing Group
Nature Portfolio
Summary: The uncertainty of true labels in medical images hinders diagnosis owing to the variability across professionals when applying deep learning models. We used deep learning to obtain an optimal convolutional neural network (CNN) by adequately annotating data for oral exfoliative cytology considering labels from multiple oral pathologists. Six whole-slide images were processed using QuPath for segmenting them into tiles. The images were labeled by three oral pathologists, resulting in 14,535 images with the corresponding pathologists’ annotations. Data from three pathologists who provided the same diagnosis were labeled as ground truth (GT) and used for testing. We investigated six models trained using the annotations of (1) pathologist A, (2) pathologist B, (3) pathologist C, (4) GT, (5) majority voting, and (6) a probabilistic model. We divided the test by cross-validation per slide dataset and examined the classification performance of the CNN with a ResNet50 baseline. Statistical evaluation was performed repeatedly and independently using every slide 10 times as test data. For the area under the curve, three cases showed the highest values (0.861, 0.955, and 0.991) for the probabilistic model. Regarding accuracy, two cases showed the highest values (0.988 and 0.967). For the models using the pathologists and GT annotations, many slides showed very low accuracy and large variations across tests. Hence, the classifier trained with probabilistic labels provided the optimal CNN for oral exfoliative cytology considering diagnoses from multiple pathologists. These results may lead to trusted medical artificial intelligence solutions that reflect diverse diagnoses of various professionals.
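
The labeling schemes named in the summary (unanimous GT, majority voting, probabilistic labels) and the per-slide cross-validation can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that a probabilistic label is simply the fraction of pathologists choosing each class, which may differ from the formulation used in the article. The class names and the function names `aggregate` and `leave_one_slide_out` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical class names; the summary does not list the diagnostic classes.
CLASSES = ("negative", "positive")


def aggregate(labels, classes=CLASSES):
    """Derive GT, majority-vote, and soft labels from one tile's annotations."""
    counts = Counter(labels)
    n = len(labels)
    # Ground truth (GT): defined only when every annotator gave the same label.
    ground_truth = labels[0] if len(counts) == 1 else None
    # Majority vote: the most frequent label, or None when the top two tie.
    top = counts.most_common(2)
    majority = top[0][0] if len(top) == 1 or top[0][1] > top[1][1] else None
    # Soft label (assumed formulation): share of annotators choosing each class.
    soft = [counts[c] / n for c in classes]
    return ground_truth, majority, soft


def leave_one_slide_out(slide_ids):
    """Yield (held_out_slide, train_indices, test_indices), one fold per slide."""
    for held_out in sorted(set(slide_ids)):
        train = [i for i, s in enumerate(slide_ids) if s != held_out]
        test = [i for i, s in enumerate(slide_ids) if s == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Pathologists A, B, and C disagree on this tile, so it has no GT label.
    print(aggregate(["positive", "positive", "negative"]))
    # Tiles are grouped by slide so that no slide appears in both train and test.
    for fold in leave_one_slide_out(["s1", "s1", "s2", "s3"]):
        print(fold)
```

In this sketch, a tile with two "positive" votes and one "negative" vote has no GT label, a majority label of "positive", and a soft label of one third and two thirds. A soft label of this kind could serve as the target of a cross-entropy loss, whereas the GT scheme would discard the tile from training altogether.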
ISSN: 2045-2322
DOI: 10.1038/s41598-024-67879-w