Deep Over-sampling Framework for Classifying Imbalanced Data

Bibliographic Details
Published in Machine Learning and Knowledge Discovery in Databases Vol. 10534; pp. 770-785
Main Authors Ando, Shin; Huang, Chun Yuan
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2017
Springer International Publishing
Series Lecture Notes in Computer Science
Abstract Class imbalance is a challenging issue in practical classification problems for deep learning models as well as traditional models. Traditionally successful countermeasures such as synthetic over-sampling have had limited success with complex, structured data handled by deep learning models. In this paper, we propose Deep Over-sampling (DOS), a framework for extending the synthetic over-sampling method to the deep feature space acquired by a convolutional neural network (CNN). Its key feature is an explicit, supervised representation learning, for which the training data presents each raw input sample with a synthetic embedding target in the deep feature space, which is sampled from the linear subspace of in-class neighbors. We implement an iterative process of training the CNN and updating the targets, which induces smaller in-class variance among the embeddings, to increase the discriminative power of the deep representation. We present an empirical study using public benchmarks, which shows that the DOS framework not only counteracts class imbalance better than the existing method, but also improves the performance of the CNN in the standard, balanced settings.
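The target-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that each target is "sampled from the linear subspace of in-class neighbors", so the choices below (Euclidean k-nearest neighbours, a random convex combination as the sampling rule, and the hypothetical function names in_class_neighbors and sample_target) are assumptions made for illustration. The CNN training loop and the iterative target updates are not shown.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def in_class_neighbors(embeddings: Sequence[Vector], labels: Sequence[int],
                       index: int, k: int) -> List[int]:
    """Indices of the (at most) k same-class embeddings nearest to embeddings[index]."""
    same_class = [j for j, y in enumerate(labels)
                  if y == labels[index] and j != index]
    # Assumption: Euclidean distance in the deep feature space.
    same_class.sort(key=lambda j: math.dist(embeddings[index], embeddings[j]))
    return same_class[:k]


def sample_target(embeddings: Sequence[Vector], neighbor_ids: Sequence[int],
                  rng: random.Random) -> List[float]:
    """Synthetic target drawn as a random convex combination of the neighbours.

    Assumption: a convex combination is one simple way to sample from the
    subspace spanned by the neighbours; the paper's exact rule may differ.
    """
    # 1.0 - rng.random() lies in (0, 1], so the weights never sum to zero.
    raw = [1.0 - rng.random() for _ in neighbor_ids]
    total = sum(raw)
    weights = [w / total for w in raw]
    dim = len(embeddings[neighbor_ids[0]])
    return [sum(w * embeddings[j][d] for w, j in zip(weights, neighbor_ids))
            for d in range(dim)]


if __name__ == "__main__":
    # Toy 2-D "deep features": three samples of class 0, two of class 1.
    feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
    classes = [0, 0, 0, 1, 1]
    ids = in_class_neighbors(feats, classes, index=0, k=2)
    print(ids, sample_target(feats, ids, random.Random(0)))
```

In a full pipeline along the lines the abstract describes, such targets would presumably be recomputed after each round of CNN training, since the embeddings themselves change as the network is updated.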
Author Ando, Shin
Huang, Chun Yuan
Author_xml – sequence: 1
  givenname: Shin
  surname: Ando
  fullname: Ando, Shin
  email: ando@rs.tus.ac.jp
– sequence: 2
  givenname: Chun Yuan
  surname: Huang
  fullname: Huang, Chun Yuan
ContentType Book Chapter
Copyright Springer International Publishing AG 2017
DEWEY 006.31
DOI 10.1007/978-3-319-71249-9_46
Discipline Computer Science
EISBN 9783319712499
3319712497
EISSN 1611-3349
Editor Vens, Celine
Hollmén, Jaakko
Dzeroski, Saso
Todorovski, Ljupčo
Ceci, Michelangelo
EndPage 785
ISBN 3319712489
9783319712482
ISSN 0302-9743
IsPeerReviewed true
IsScholarly true
LCCallNum QA76.9.D343Q334-342T
Language English
OCLC 1018472168
PageCount 16
PublicationCentury 2000
PublicationDate 2017
PublicationPlace Switzerland
PublicationPlace_xml – name: Switzerland
– name: Cham
PublicationSeriesSubtitle Lecture Notes in Artificial Intelligence
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSeriesTitleAlternate Lect.Notes Computer
PublicationSubtitle European Conference, ECML PKDD 2017, Skopje, Macedonia, September 18-22, 2017, Proceedings, Part I
PublicationTitle Machine Learning and Knowledge Discovery in Databases
PublicationYear 2017
Publisher Springer International Publishing AG
Springer International Publishing
Publisher_xml – name: Springer International Publishing AG
– name: Springer International Publishing
RelatedPersons Kleinberg, Jon M.
Mattern, Friedemann
Naor, Moni
Mitchell, John C.
Terzopoulos, Demetri
Steffen, Bernhard
Pandu Rangan, C.
Kanade, Takeo
Kittler, Josef
Weikum, Gerhard
Hutchison, David
Tygar, Doug
StartPage 770
SubjectTerms Class imbalance
Convolutional neural network
Deep learning
Representation learning
Synthetic over-sampling
Title Deep Over-sampling Framework for Classifying Imbalanced Data
URI http://link.springer.com/10.1007/978-3-319-71249-9_46
Volume 10534