A Dataset for Breast Cancer Histopathological Image Classification


Bibliographic Details
Published in IEEE Transactions on Biomedical Engineering Vol. 63; no. 7; pp. 1455-1462
Main Authors Spanhol, Fabio A., Oliveira, Luiz S., Petitjean, Caroline, Heutte, Laurent
Format Journal Article
Language English
Published United States IEEE 01.07.2016
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
Abstract Today, medical image analysis papers require solid experiments to prove the usefulness of proposed methods. However, experiments are often performed on data selected by the researchers, which may come from different institutions, scanners, and populations. Different evaluation measures may also be used, making it difficult to compare methods. In this paper, we introduce a dataset of 7909 breast cancer histopathology images acquired from 82 patients, which is now publicly available from http://web.inf.ufpr.br/vri/breast-cancer-database. The dataset includes both benign and malignant images. The task associated with this dataset is the automated classification of these images into two classes, which would be a valuable computer-aided diagnosis tool for the clinician. To assess the difficulty of this task, we show some preliminary results obtained with state-of-the-art image classification systems. The accuracy ranges from 80% to 85%, showing that room for improvement remains. By providing this dataset and a standardized evaluation protocol to the scientific community, we hope to bring together researchers in both the medical and the machine learning fields to advance toward this clinical application.
Author Heutte, Laurent
Oliveira, Luiz S.
Petitjean, Caroline
Spanhol, Fabio A.
Author_xml – sequence: 1
  givenname: Fabio A.
  orcidid: 0000-0002-9603-8067
  surname: Spanhol
  fullname: Spanhol, Fabio A.
  email: faspanhol@inf.ufpr.br
  organization: Federal University of Parana, Curitiba-PR, Brazil
– sequence: 2
  givenname: Luiz S.
  surname: Oliveira
  fullname: Oliveira, Luiz S.
  organization: Federal University of Parana
– sequence: 3
  givenname: Caroline
  surname: Petitjean
  fullname: Petitjean, Caroline
  organization: LITIS EA 4108, Université de Rouen
– sequence: 4
  givenname: Laurent
  surname: Heutte
  fullname: Heutte, Laurent
  organization: LITIS EA 4108, Université de Rouen
BackLink https://www.ncbi.nlm.nih.gov/pubmed/26540668$$D View this record in MEDLINE/PubMed
https://hal.science/hal-02113843$$DView record in HAL
CODEN IEBEAX
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2016
Distributed under a Creative Commons Attribution 4.0 International License
DOI 10.1109/TBME.2015.2496264
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
Medline
MEDLINE
MEDLINE (Ovid)
MEDLINE
MEDLINE
PubMed
Aluminium Industry Abstracts
Biotechnology Research Abstracts
Ceramic Abstracts
Computer and Information Systems Abstracts
Corrosion Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
Materials Business File
Mechanical & Transportation Engineering Abstracts
Solid State and Superconductivity Abstracts
METADEX
Technology Research Database
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
Aerospace Database
Materials Research Database
ProQuest Computer Science Collection
Civil Engineering Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Biotechnology and BioEngineering Abstracts
MEDLINE - Academic
Hyper Article en Ligne (HAL)
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
Engineering
Computer Science
EISSN 1558-2531
EndPage 1462
ExternalDocumentID oai_HAL_hal_02113843v1
4096830451
26540668
7312934
Genre orig-research
Research Support, Non-U.S. Gov't
Journal Article
GrantInformation_xml – fundername: Coordination for the Improvement of Higher Level Personnel
  grantid: #263/2014
  funderid: 10.13039/501100002322
– fundername: National Council for Scientific and Technological Development
  grantid: #301653/2011-9
  funderid: 10.13039/501100003593
ID FETCH-LOGICAL-h425t-5660fd72805b281c1d2632b596410d05e75b73ee1781b9ac4e686cdff5e7f1573
IEDL.DBID RIE
ISSN 0018-9294
1558-2531
IngestDate Fri May 09 12:17:59 EDT 2025
Fri Jul 11 08:33:55 EDT 2025
Fri Jul 11 11:35:54 EDT 2025
Fri Jul 11 13:10:10 EDT 2025
Mon Jun 30 08:34:42 EDT 2025
Wed Feb 19 02:43:08 EST 2025
Wed Aug 27 02:53:47 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 7
Keywords histopathology
Breast cancer
image classification
medical imaging
Language English
License Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-h425t-5660fd72805b281c1d2632b596410d05e75b73ee1781b9ac4e686cdff5e7f1573
ORCID 0000-0002-9603-8067
0000-0003-4740-9770
0000-0003-0013-5370
PMID 26540668
PQID 1798894469
PQPubID 85474
PageCount 8
ParticipantIDs ieee_primary_7312934
proquest_miscellaneous_1808661918
proquest_journals_1798894469
hal_primary_oai_HAL_hal_02113843v1
pubmed_primary_26540668
proquest_miscellaneous_1799562957
proquest_miscellaneous_1825529304
PublicationCentury 2000
PublicationDate 2016-July
2016-07-00
20160701
2016-07
PublicationDateYYYYMMDD 2016-07-01
PublicationDate_xml – month: 07
  year: 2016
  text: 2016-July
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: New York
PublicationTitle IEEE transactions on biomedical engineering
PublicationTitleAbbrev TBME
PublicationTitleAlternate IEEE Trans Biomed Eng
PublicationYear 2016
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
– name: Institute of Electrical and Electronics Engineers
SSID ssj0014846
Score 2.666308
SourceID hal
proquest
pubmed
ieee
SourceType Open Access Repository
Aggregation Database
Index Database
Publisher
StartPage 1455
SubjectTerms Artificial Intelligence
Automation
Biopsy
Breast
Breast - diagnostic imaging
Breast cancer
Breast Neoplasms - diagnostic imaging
Cancer
Classification
Computer Science
Databases, Factual
Datasets
Feature extraction
Female
Histocytochemistry
Histograms
Histopathology
Humans
Image classification
Image Interpretation, Computer-Assisted - methods
Malignant tumors
Mammography
Medical
Medical imaging
Medical research
Microscopy
Patients
Researchers
Tasks
Title A Dataset for Breast Cancer Histopathological Image Classification
URI https://ieeexplore.ieee.org/document/7312934
https://www.ncbi.nlm.nih.gov/pubmed/26540668
https://www.proquest.com/docview/1798894469
https://www.proquest.com/docview/1799562957
https://www.proquest.com/docview/1808661918
https://www.proquest.com/docview/1825529304
https://hal.science/hal-02113843
Volume 63
hasFullText 1
inHoldings 1
isFullTextHit
isPrint