DiagSet: a dataset for prostate cancer histopathological image classification

Bibliographic Details
Published in arXiv.org
Main Authors Koziarski, Michał, Cyganek, Bogusław, Niedziela, Przemysław, Olborski, Bogusław, Antosz, Zbigniew, Żydak, Marcin, Kwolek, Bogdan, Wąsowicz, Paweł, Bukała, Andrzej, Swadźba, Jakub, Sitkowski, Piotr
Format Paper
Language English
Published Ithaca Cornell University Library, arXiv.org 02.06.2024
Summary: Cancer diseases constitute one of the most significant societal challenges. In this paper, we introduce a novel histopathological dataset for prostate cancer detection. The proposed dataset consists of over 2.6 million tissue patches extracted from 430 fully annotated scans, 4675 scans with assigned binary diagnoses, and 46 scans with diagnoses independently provided by a group of histopathologists. It can be found at https://github.com/michalkoziarski/DiagSet. Furthermore, we propose a machine learning framework for detecting cancerous tissue regions and predicting scan-level diagnoses, which uses thresholding to abstain from a decision in uncertain cases. The proposed approach, composed of ensembles of deep neural networks operating on the histopathological scans at different scales, achieves 94.6% accuracy in patch-level recognition. In scan-level diagnosis, it is compared with 9 human histopathologists and shows high statistical agreement.
ISSN: 2331-8422
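
The summary mentions ensembles of networks and thresholding to abstain in uncertain cases, but gives no implementation details. The following is a minimal sketch of that general idea only, not the authors' method: the function names, the averaging rule, the use of the fraction of positive patches as a scan-level score, and all threshold values are hypothetical assumptions chosen for illustration.

```python
from statistics import mean


def ensemble_patch_probability(per_model_probs):
    """Average the cancer probabilities that several models assign to one patch.

    Simple averaging is an assumption; the paper may combine models differently.
    """
    return mean(per_model_probs)


def scan_level_decision(patch_probs, patch_cutoff=0.5, lower=0.2, upper=0.8):
    """Return 'cancer', 'benign', or 'abstain' for one scan.

    patch_probs  -- ensemble probabilities, one per tissue patch
    patch_cutoff -- hypothetical cutoff for calling a patch cancerous
    lower, upper -- hypothetical bounds of the uncertain band
    """
    if not patch_probs:
        # No evidence at all: decline to decide.
        return "abstain"
    positives = sum(p >= patch_cutoff for p in patch_probs)
    positive_fraction = positives / len(patch_probs)
    if positive_fraction >= upper:
        return "cancer"
    if positive_fraction <= lower:
        return "benign"
    # The score falls inside the uncertain band, so defer to a human expert.
    return "abstain"
```

For example, a scan whose patches split evenly between confident positive and confident negative predictions falls inside the band and is left undecided, which is the behaviour that abstention is meant to provide.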