A practical approach to novel class discovery in tabular data

The problem of novel class discovery (NCD) consists in extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes. While NCD has recently received a lot of attention from the community, it is often solved on computer vision problems and under u...

Bibliographic Details
Published in Data Mining and Knowledge Discovery, Vol. 38, no. 4, pp. 2087-2116
Main Authors Troisemaine, Colin; Reiffers-Masson, Alexandre; Gosselin, Stéphane; Lemaire, Vincent; Vaton, Sandrine
Format Journal Article
Language English
Published New York Springer US 01.07.2024
Springer Nature B.V
Springer
Series ECML PKDD 2024
Abstract The problem of novel class discovery (NCD) consists in extracting knowledge from a labeled set of known classes to accurately partition an unlabeled set of novel classes. While NCD has recently received a lot of attention from the community, it is often solved on computer vision problems and under unrealistic conditions. In particular, the number of novel classes is usually assumed to be known in advance, and their labels are sometimes used to tune hyperparameters. Methods that rely on these assumptions are not applicable in real-world scenarios. In this work, we focus on solving NCD in tabular data when no prior knowledge of the novel classes is available. To this end, we propose to tune the hyperparameters of NCD methods by adapting the k-fold cross-validation process and hiding some of the known classes in each fold. Since we have found that methods with too many hyperparameters are likely to overfit these hidden classes, we define a simple deep NCD model. This method is composed of only the essential elements necessary for the NCD problem and shows robust performance under realistic conditions. Furthermore, we find that the latent space of this method can be used to reliably estimate the number of novel classes. Additionally, we adapt two unsupervised clustering algorithms (k-means and Spectral Clustering) to leverage the knowledge of the known classes. Extensive experiments are conducted on 7 tabular datasets and demonstrate the effectiveness of the proposed method and hyperparameter tuning process, and show that the NCD problem can be solved without relying on knowledge from the novel classes.
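The hidden-class cross-validation idea described in the abstract can be pictured with a small sketch. The Python below is an illustration only, not the authors' implementation: the function names `hidden_class_folds` and `split_samples`, the round-robin assignment of classes to folds, and the `seed` parameter are all assumptions made for this example. It shows only how known classes might be rotated into a "pretend novel" role so that hyperparameters can be scored without touching the real novel classes.

```python
import random
from typing import Hashable, Iterator, List, Sequence, Tuple


def hidden_class_folds(
    known_classes: Sequence[Hashable],
    n_folds: int,
    seed: int = 0,
) -> Iterator[Tuple[List[Hashable], List[Hashable]]]:
    """Yield one (visible, hidden) pair of class lists per fold.

    Hypothetical helper: each known class is hidden in exactly one fold.
    """
    classes = list(dict.fromkeys(known_classes))  # drop duplicates, keep order
    if not 2 <= n_folds <= len(classes):
        raise ValueError("n_folds must be between 2 and the number of known classes")
    rng = random.Random(seed)
    rng.shuffle(classes)
    for fold in range(n_folds):
        # Round-robin slice: classes at positions fold, fold + n_folds, ...
        hidden = classes[fold::n_folds]
        hidden_set = set(hidden)
        visible = [c for c in classes if c not in hidden_set]
        yield visible, hidden


def split_samples(
    labels: Sequence[Hashable],
    hidden: Sequence[Hashable],
) -> Tuple[List[int], List[int]]:
    """Return (labeled_idx, pseudo_novel_idx) for one fold.

    Samples whose class is hidden are treated as unlabeled during training;
    their true labels are used only to score the resulting clustering.
    """
    hidden_set = set(hidden)
    labeled_idx = [i for i, y in enumerate(labels) if y not in hidden_set]
    pseudo_novel_idx = [i for i, y in enumerate(labels) if y in hidden_set]
    return labeled_idx, pseudo_novel_idx
```

Under these assumptions, a hyperparameter setting would be evaluated by training an NCD model on the samples at `labeled_idx`, clustering the samples at `pseudo_novel_idx`, and averaging a clustering score over the folds. The scoring metric and the training step are left out because the record does not specify them.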
Author Vaton, Sandrine
Gosselin, Stéphane
Lemaire, Vincent
Reiffers-Masson, Alexandre
Troisemaine, Colin
Author_xml – sequence: 1
  givenname: Colin
  surname: Troisemaine
  fullname: Troisemaine, Colin
  email: colin.troisemaine@gmail.com
  organization: Department of Computer Science, IMT Atlantique, Orange Labs
– sequence: 2
  givenname: Alexandre
  surname: Reiffers-Masson
  fullname: Reiffers-Masson, Alexandre
  organization: Department of Computer Science, IMT Atlantique
– sequence: 3
  givenname: Stéphane
  surname: Gosselin
  fullname: Gosselin, Stéphane
  organization: Orange Labs
– sequence: 4
  givenname: Vincent
  surname: Lemaire
  fullname: Lemaire, Vincent
  organization: Orange Labs
– sequence: 5
  givenname: Sandrine
  surname: Vaton
  fullname: Vaton, Sandrine
  organization: Department of Computer Science, IMT Atlantique
BackLink https://hal.science/hal-04283853 (View record in HAL)
CitedBy_id crossref_primary_10_1093_nargab_lqae166
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Distributed under a Creative Commons Attribution 4.0 International License
DOI 10.1007/s10618-024-01025-y
Discipline Physics
Computer Science
EISSN 1573-756X
EndPage 2116
ExternalDocumentID oai_HAL_hal_04283853v2
10_1007_s10618_024_01025_y
GrantInformation_xml – fundername: Orange SA
ISSN 1384-5810
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Keywords Open world learning
Clustering
Transfer learning
Novel class discovery
Tabular data
Language English
License Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
ORCID 0000-0001-8940-6004
0000-0002-4084-1977
0000-0003-2211-1767
OpenAccessLink https://hal.science/hal-04283853
PageCount 30
PublicationDate 2024-07-01
PublicationPlace New York
PublicationSeriesTitle ECML PKDD 2024
PublicationTitle Data mining and knowledge discovery
PublicationTitleAbbrev Data Min Knowl Disc
PublicationYear 2024
Publisher Springer US
Springer Nature B.V
Springer
StartPage 2087
SubjectTerms Algorithms
Artificial Intelligence
Chemistry and Earth Sciences
Clustering
Computer Science
Computer vision
Data Mining and Knowledge Discovery
Information Storage and Retrieval
Knowledge
Methods
Physics
Statistics for Engineering
Tables (data)
SummonAdditionalLinks – databaseName: SpringerLink Journals (ICM)
  dbid: U2A
  link: http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3dS8MwED_cRPDFj6k4nRLENw0sy8fSBx-GOIaoTw72VpI0RWF04upg_72Xrt2mqOBbm15auLv07sjvdwG45N5wrRJDWSc1VBhuqJVOUWNk2ycYoFxBCnt8UoOhuB_JUUkKm1Zo92pLsvhTr5HdFNMUYwoNfdAknddgU4baHb142OmtATv4ghusBZWatUuqzM_v-BKOai8BDLmWaX7bHC1iTn8PdspkkfQW1t2HDZ81YLc6iIGU67IBWwWO000P4KZHSt4Tzqv6hZN8QrLJzI-JC7kyCUzcgNyck9eM5MYGJCoJUNFDGPbvnm8HtDwhgToueU5dlFitpHA6MhyvuHeeM8tTg4FcsEQ5vE9FIrzUbZN2fVdbbyKllcN0VTp-BPVskvljIJZh4SBxGCO6cGlkOS5mbp320smEiSawSlGxK9uHh1MsxvGq8XFQbozKjQvlxvMmXC3nvC2aZ_wpfYH6XwqGvteD3kMcxkJhxzGxmHWa0KrME5erbRpzrMuw0mNKN-G6Mtnq8e-fPPmf-ClsdwrfCWjdFtTz9w9_hjlJbs8LF_wElMzV6g
  priority: 102
  providerName: Springer Nature
Title A practical approach to novel class discovery in tabular data
URI https://link.springer.com/article/10.1007/s10618-024-01025-y
https://www.proquest.com/docview/3086149168
https://hal.science/hal-04283853
Volume 38