Deep learning-based clustering approaches for bioinformatics

Bibliographic Details
Published in Briefings in Bioinformatics Vol. 22; no. 1; pp. 393-415
Main Authors Karim, Md Rezaul, Beyan, Oya, Zappa, Achille, Costa, Ivan G, Rebholz-Schuhmann, Dietrich, Cochez, Michael, Decker, Stefan
Format Journal Article
Language English
Published England Oxford University Press 18.01.2021
Oxford Publishing Limited (England)
Abstract Clustering is central to much data-driven bioinformatics research and serves as a powerful computational method. In particular, clustering helps in analyzing unstructured and high-dimensional data in the form of sequences, expressions, texts and images. Further, clustering is used to gain insights into biological processes at the genomic level; e.g. clustering of gene expression data provides insights into the natural structure inherent in the data and aids in understanding gene functions, cellular processes, cell subtypes and gene regulation. Clustering approaches, including hierarchical, centroid-based, distribution-based, density-based and self-organizing maps, have long been studied and used in classical machine learning settings. In contrast, deep learning (DL)-based representation and feature learning for clustering have not been reviewed and employed extensively. Since the quality of clustering depends not only on the distribution of data points but also on the learned representation, deep neural networks can be an effective means of mapping a high-dimensional data space into a lower-dimensional feature space, leading to improved clustering results. In this paper, we review state-of-the-art DL-based approaches for cluster analysis that are based on representation learning, which we hope will be useful, particularly for bioinformatics research. Further, we explore in detail the training procedures of DL-based clustering algorithms, point out different clustering quality metrics and evaluate several DL-based approaches on three bioinformatics use cases: bioimaging, cancer genomics and biomedical text mining. We believe this review and the evaluation results will provide valuable insights and serve as a starting point for researchers wanting to apply DL-based unsupervised methods to solve emerging bioinformatics research problems.
AuthorAffiliation 1 Fraunhofer Institute for Applied Information Technology FIT, Schloss Birlinghoven, Sankt Augustin, Germany
2 Information Systems and Databases, RWTH Aachen University, Aachen, Germany
3 Insight Centre for Data Analytics, National University of Ireland Galway, Ireland
4 Institute for Computational Genomics, RWTH Aachen University Medical School, Aachen, Germany
5 German National Library of Medicine, University of Cologne, Germany
6 Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands
Author_xml – sequence: 1
  givenname: Md Rezaul
  surname: Karim
  fullname: Karim, Md Rezaul
  email: rezaul.karim@fit.fraunhofer.de
  organization: Fraunhofer Institute for Applied Information Technology FIT, Schloss Birlinghoven, Sankt Augustin, Germany
– sequence: 2
  givenname: Oya
  surname: Beyan
  fullname: Beyan, Oya
  organization: Fraunhofer Institute for Applied Information Technology FIT, Schloss Birlinghoven, Sankt Augustin, Germany
– sequence: 3
  givenname: Achille
  surname: Zappa
  fullname: Zappa, Achille
  organization: Insight Centre for Data Analytics, National University of Ireland Galway, Ireland
– sequence: 4
  givenname: Ivan G
  surname: Costa
  fullname: Costa, Ivan G
  organization: Institute for Computational Genomics, RWTH Aachen University Medical School, Aachen, Germany
– sequence: 5
  givenname: Dietrich
  surname: Rebholz-Schuhmann
  fullname: Rebholz-Schuhmann, Dietrich
  organization: German National Library of Medicine, University of Cologne, Germany
– sequence: 6
  givenname: Michael
  surname: Cochez
  fullname: Cochez, Michael
  organization: Fraunhofer Institute for Applied Information Technology FIT, Schloss Birlinghoven, Sankt Augustin, Germany
– sequence: 7
  givenname: Stefan
  surname: Decker
  fullname: Decker, Stefan
  organization: Fraunhofer Institute for Applied Information Technology FIT, Schloss Birlinghoven, Sankt Augustin, Germany
BackLink https://www.ncbi.nlm.nih.gov/pubmed/32008043 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright The Author(s) 2020. Published by Oxford University Press. 2021
The Author(s) 2020. Published by Oxford University Press.
DOI 10.1093/bib/bbz170
Discipline Biology
EISSN 1477-4054
EndPage 415
ExternalDocumentID PMC7820885
32008043
10_1093_bib_bbz170
10.1093/bib/bbz170
Genre Journal Article
ISSN 1467-5463
1477-4054
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
http://creativecommons.org/licenses/by/4.0
OpenAccessLink https://dx.doi.org/10.1093/bib/bbz170
PMID 32008043
PQID 2529968924
PQPubID 26846
PageCount 23
PublicationCentury 2000
PublicationDate 2021-Jan-18
PublicationDateYYYYMMDD 2021-01-18
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: Oxford
PublicationTitle Briefings in bioinformatics
PublicationTitleAlternate Brief Bioinform
PublicationYear 2021
Publisher Oxford University Press
Oxford Publishing Limited (England)
References Huang (2021012203310032000_ref43) 2014
Devlin (2021012203310032000_ref95) 2018
Davidson (2021012203310032000_ref19) 2005
Yang (2021012203310032000_ref41) 2016
Aresta (2021012203310032000_ref84) 2019; 56
Kilinc (2021012203310032000_ref50) 2018
Makhzani (2021012203310032000_ref39) 2015
McDaid (2021012203310032000_ref70) 2011
Bertucci (2021012203310032000_ref86) 2012; 12
Goodfellow (2021012203310032000_ref38) 2016
Mostavi (2021012203310032000_ref56) 2019
Chen (2021012203310032000_ref69) 2016
Srivastava (2021012203310032000_ref72) 1929–1958; 15
Chiu (2021012203310032000_ref59) 2017; 34
Zhao (2021012203310032000_ref8) 2005
Jaitly (2021012203310032000_ref60) 2011
Tibshirani (2021012203310032000_ref74) 2001; 63
Kohonen (2021012203310032000_ref17) 1998; 21
Hofmann (2021012203310032000_ref25) 2008
Zheng (2021012203310032000_ref53) 2016
Min (2021012203310032000_ref28) 2017; 18
Park (2021012203310032000_ref63) 2018; 3
Jiang (2021012203310032000_ref6) 2004
Rosenberg (2021012203310032000_ref81) 2007
Gan (2021012203310032000_ref3) 2007
Hubert (2021012203310032000_ref76) 1985; 2
Ronneberger (2021012203310032000_ref91) 2015
Karmakar (2021012203310032000_ref37) 2019; 9
Vincent (2021012203310032000_ref92) 2008
Jaskowiak (2021012203310032000_ref11) 2018; 132
Lukic (2021012203310032000_ref45) 13–16 . 2016
Srivastava (2021012203310032000_ref67) 2015
Hsu (2021012203310032000_ref42) 2015
Rhee (2021012203310032000_ref85) 2017
Gräßber (2021012203310032000_ref88) 2018
Vincent (2021012203310032000_ref71) 2008
Karim (2021012203310032000_ref65) 2019; 2
Joyce (2021012203310032000_ref54) 2011
van der Maaten (2021012203310032000_ref73) 2008; 9
Costa (2021012203310032000_ref5) 2004; 27
Rajanna (2021012203310032000_ref83) 2016
Shah (2021012203310032000_ref49) 2018
Kuhn (2021012203310032000_ref80) 1955; 2
Søorlie (2021012203310032000_ref15) 2001; 98
MacQueen (2021012203310032000_ref16) 1967
Zhao (2021012203310032000_ref57) 2007; 5
Min (2021012203310032000_ref2) 2018; 6
Weinstein (2021012203310032000_ref89) 2013; 45
Edla (2021012203310032000_ref35) 2012; 6
Guo (2021012203310032000_ref29) 2017
Kaufman (2021012203310032000_ref20) 1990
Santos (2021012203310032000_ref79) 2009
Renganathan (2021012203310032000_ref87) 2017; 23
Masood (2021012203310032000_ref7) 2015; 1
Karim (2021012203310032000_ref66) 2019
Hsu (2021012203310032000_ref51) 2018; 20
Rand (2021012203310032000_ref78) 1971; 66
(2021012203310032000_ref23) 2018
Dizaji (2021012203310032000_ref46) 2017
Thalamuthu (2021012203310032000_ref13) 2006; 22
Chen (2021012203310032000_ref32) 2011
Shah (2021012203310032000_ref52) 2017; 114
Xie (2021012203310032000_ref93) 2019
Jaskowiak (2021012203310032000_ref9) 2013; 10
De Souto (2021012203310032000_ref10) 2008; 9
Vinh (2021012203310032000_ref77) 2010
Zhu (2021012203310032000_ref68) 2018
Ka (2021012203310032000_ref27) 2001; 17
Md (2021012203310032000_ref30) 2018
Estivill-Castro (2021012203310032000_ref18) 2002; 4
Kaminski (2021012203310032000_ref96) 2019; 34
Oyelade (2021012203310032000_ref1) 2016
Alirezaie (2021012203310032000_ref62) 2019
Zivkovic (2021012203310032000_ref22) 2004
Choi (2021012203310032000_ref97) 2016
Chang (2021012203310032000_ref48) 2017
Brulé (2021012203310032000_ref33) 2015; 1
Chowdary (2021012203310032000_ref14) 2014; 3
Strehl (2021012203310032000_ref75) 2002; 3
Bandyopadhyay (2021012203310032000_ref34) 2013; 1
Yeung (2021012203310032000_ref36) 2001; 17
An (2021012203310032000_ref64) 2015
Eraslan (2021012203310032000_ref12) 2019; 10
Li (2021012203310032000_ref58) 2018
Li (2021012203310032000_ref47) 2018; 83
Eisen (2021012203310032000_ref4) 1998; 95
Ng (2021012203310032000_ref26) 2002
Chen (2021012203310032000_ref44) 2015
Wold (2021012203310032000_ref24) 1987; 2
Shahapurkar (2021012203310032000_ref21) 2004
Cruz-Roa (2021012203310032000_ref82) 2017; 7
Jaques (2021012203310032000_ref31) 2017
Lintas (2021012203310032000_ref61) 11–14, 2017
Huang (2021012203310032000_ref94) 2017
Karim (2021012203310032000_ref55) 2019; 7
Xie (2021012203310032000_ref40) 2016
Karim (2021012203310032000_ref90) 2019
References_xml – start-page: 175
  volume-title: International Conference on Artificial Neural Networks
  year: 2009
  ident: 2021012203310032000_ref79
  article-title: On the use of the adjusted rand index as a metric for evaluating supervised classification
– volume-title: Data clustering: theory, algorithms, and applications
  year: 2007
  ident: 2021012203310032000_ref3
  doi: 10.1137/1.9780898718348
– year: 2011
  ident: 2021012203310032000_ref70
– volume: 34
  start-page: 57
  issue: 1
  year: 2017
  ident: 2021012203310032000_ref59
  article-title: Dental health status of community-dwelling older singaporeans: findings from a nationally representative survey
  publication-title: Gerodontology
  doi: 10.1111/ger.12218
– volume: 7
  start-page: 133850
  year: 2019
  ident: 2021012203310032000_ref55
  article-title: Prognostically relevant subtypes and survival prediction for breast cancer based on multimodal genomics data
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2019.2941796
– volume: 1
  start-page: 48
  issue: 1
  year: 2015
  ident: 2021012203310032000_ref33
  article-title: PSCAN: parallel, density based clustering of protein sequences
  publication-title: Intell Data Anal
– start-page: 1221
  volume-title: 2004 IEEE International Joint Conference on Neural Networks
  year: 2004
  ident: 2021012203310032000_ref21
  article-title: Comparison of self-organizing map with k-means hierarchical clustering for bioinformatics applications
– start-page: 5879
  year: 2017
  ident: 2021012203310032000_ref48
  article-title: Deep adaptive image clustering
  publication-title: Proceedings of the IEEE International Conference on Computer Vision
– start-page: 79
  volume-title: International Encyclopedia of Statistical Science. Annals of Mathematical Statistics
  year: 2011
  ident: 2021012203310032000_ref54
  article-title: Kullback-Leibler divergence
– year: 2017
  ident: 2021012203310032000_ref61
  article-title: Artificial Neural Networks and Machine Learning–ICANN 2017: 26th International Conference on Artificial Neural Networks
– start-page: 1096
  volume-title: Proceedings of the 25th International Conference on Machine Learning
  year: 2008
  ident: 2021012203310032000_ref71
  article-title: Extracting and composing robust features with denoising autoencoders
  doi: 10.1145/1390156.1390294
– start-page: 843
  year: 2015
  ident: 2021012203310032000_ref67
  article-title: Unsupervised learning of video representations using LSTMs
  publication-title: International Conference on Machine Learning
– year: 2018
  ident: 2021012203310032000_ref49
  article-title: Deep continuous clustering
  publication-title: arXiv preprint
– volume: 6
  start-page: 485
  year: 2012
  ident: 2021012203310032000_ref35
  article-title: A prototype-based modified DBSCAN for gene clustering
  publication-title: Procedia Technology
  doi: 10.1016/j.protcy.2012.10.058
– start-page: 373
  volume-title: International Conference on Neural Information Processing
  year: 2017
  ident: 2021012203310032000_ref29
  article-title: Deep clustering with convolutional autoencoders
  doi: 10.1007/978-3-319-70096-0_39
– volume: 56
  start-page: 122
  year: 2019
  ident: 2021012203310032000_ref84
  article-title: BACH: grand challenge on breast cancer histology images
  publication-title: Med Image Anal
  doi: 10.1016/j.media.2019.05.010
– volume-title: 26th IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
  year: 2016
  ident: 2021012203310032000_ref45
  article-title: Speaker identification and clustering using convolutional neural networks
– start-page: 281
  volume-title: Proceedings of the 5th Berkeley Symposium on Mathematical Statistics & Probability
  year: 1967
  ident: 2021012203310032000_ref16
  article-title: Some methods for classification and analysis of multivariate observations
– volume: 4
  start-page: 65
  issue: 1
  year: 2002
  ident: 2021012203310032000_ref18
  article-title: Why so many clustering algorithms: a position paper
  publication-title: SIGKDD Explorations
  doi: 10.1145/568574.568575
– year: 2015
  ident: 2021012203310032000_ref91
– volume: 18
  start-page: 851
  issue: 5
  year: 2017
  ident: 2021012203310032000_ref28
  article-title: Deep learning in bioinformatics
  publication-title: Brief Bioinform
– volume: 15
  start-page: 1929
  issue: 1
  year: 2014
  ident: 2021012203310032000_ref72
  article-title: Dropout: a simple way to prevent neural networks from overfitting
  publication-title: J Mach Learn Res
– year: 2018
  ident: 2021012203310032000_ref30
  article-title: Recurrent deep embedding networks for genotype clustering and ethnicity prediction
– start-page: 121
  year: 2018
  ident: 2021012203310032000_ref88
  article-title: Aspect-based sentiment analysis of drug reviews applying cross-domain and cross-data learning
  publication-title: Proceedings of the 2018 International Conference on Digital Health.
  doi: 10.1145/3194658.3194677
– volume: 95
  start-page: 14863
  issue: 25
  year: 1998
  ident: 2021012203310032000_ref4
  article-title: Cluster analysis and display of genome-wide expression patterns
  publication-title: Proc Natl Acad Sci
  doi: 10.1073/pnas.95.25.14863
– start-page: 5747
  volume-title: 2017 IEEE International Conference on Computer Vision (ICCV)
  year: 2017
  ident: 2021012203310032000_ref46
  article-title: Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization
  doi: 10.1109/ICCV.2017.612
– start-page: 3504
  volume-title: Advances in Neural Information Processing Systems
  year: 2016
  ident: 2021012203310032000_ref97
  article-title: RETAIN: an interpretable predictive model for healthcare using reverse time attention mechanism
– volume: 3
  start-page: 1544
  issue: 3
  year: 2018
  ident: 2021012203310032000_ref63
  article-title: A multimodal anomaly detector for robot-assisted feeding using an LSTM-based variational autoencoder
  publication-title: IEEE Robot Autom Lett
  doi: 10.1109/LRA.2018.2801475
– start-page: 1096
  volume-title: Proceedings of the 25th International Conference on Machine Learning
  year: 2008
  ident: 2021012203310032000_ref92
  article-title: Extracting and composing robust features with denoising autoencoders
  doi: 10.1145/1390156.1390294
– year: 2019
  ident: 2021012203310032000_ref62
  article-title: Sioutis M, and Loutfi A
– volume: 34
  start-page: 189
  year: 2019
  ident: 2021012203310032000_ref96
  article-title: The right to explanation, explained
  publication-title: Berkeley Technol Law J
– volume: 23
  start-page: 141
  issue: 3
  year: 2017
  ident: 2021012203310032000_ref87
  article-title: Text mining in biomedical domain with emphasis on document clustering
  publication-title: Healthcare Inform Res
  doi: 10.4258/hir.2017.23.3.141
– volume: 20
  start-page: 421
  issue: 2
  year: 2018
  ident: 2021012203310032000_ref51
  article-title: CNN-based joint clustering and representation learning with feature drift compensation for large-scale image data
  publication-title: IEEE Trans Multimed
  doi: 10.1109/TMM.2017.2745702
– volume: 2
  start-page: 193
  issue: 1
  year: 1985
  ident: 2021012203310032000_ref76
  article-title: Comparing partitions
  publication-title: J Classification
  doi: 10.1007/BF01908075
– volume: 2
  start-page: 21
  year: 2019
  ident: 2021012203310032000_ref65
  article-title: A snapshot neural ensemble method for cancer type prediction based on copy number variations
  publication-title: Neural Comput Appl
– start-page: 5147
  volume-title: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
  year: 2016
  ident: 2021012203310032000_ref41
  article-title: Joint unsupervised learning of deep representations and image clusters
– volume: 5
  start-page: 187
  issue: 2
  year: 2007
  ident: 2021012203310032000_ref57
  article-title: Medical X-ray image enhancement based on Kramer's PDE model
  publication-title: J Electron Sci Technol
– volume: 63
  start-page: 411
  issue: 2
  year: 2001
  ident: 2021012203310032000_ref74
  article-title: Estimating the number of clusters in a data set via the gap statistic
  publication-title: J R Stat Soc Ser B (Statistical Methodology)
  doi: 10.1111/1467-9868.00293
– year: 2019
  ident: 2021012203310032000_ref93
  article-title: Unsupervised data augmentation for consistency training
– volume: 2
  start-page: 37
  issue: 1–3
  year: 1987
  ident: 2021012203310032000_ref24
  article-title: Principal component analysis
  publication-title: Chemom Intell Lab Syst
  doi: 10.1016/0169-7439(87)80084-9
– volume: 10
  start-page: 845
  issue: 4
  year: 2013
  ident: 2021012203310032000_ref9
  article-title: Proximity measures for clustering gene expression microarray data: a validation methodology and a comparative analysis
  publication-title: IEEE/ACM Trans Comput Biol Bioinform
  doi: 10.1109/TCBB.2013.9
– year: 2017
  ident: 2021012203310032000_ref94
– volume: 10
  start-page: 390
  issue: 1
  year: 2019
  ident: 2021012203310032000_ref12
  article-title: Single-cell RNA-seq denoising using a deep count autoencoder
  publication-title: Nat Commun
  doi: 10.1038/s41467-018-07931-2
– volume: 1
  start-page: 48
  issue: 1
  year: 2013
  ident: 2021012203310032000_ref34
  article-title: Segmentation of brain tumour from MRI image analysis of k-means and DBSCAN clustering
  publication-title: Int J Res Eng Sci
– start-page: 478
  volume-title: International Conference on Machine Learning
  year: 2016
  ident: 2021012203310032000_ref40
  article-title: Unsupervised deep embedding for clustering analysis
– year: 2016
  ident: 2021012203310032000_ref38
– start-page: 2016
  issue: 15
  year: 2016
  ident: 2021012203310032000_ref83
  article-title: Prostate cancer detection using photoacoustic imaging and deep learning
  publication-title: Electron Imaging
– year: 2018
  ident: 2021012203310032000_ref23
  article-title: Clustering with deep learning: taxonomy and new methods
– start-page: 694
  volume-title: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data
  year: 2005
  ident: 2021012203310032000_ref8
  article-title: Tricluster: an effective algorithm for mining coherent clusters in 3D microarray data
  doi: 10.1145/1066157.1066236
– volume-title: Special Lecture on IE
  year: 2015
  ident: 2021012203310032000_ref64
  article-title: Variational autoencoder based anomaly detection using reconstruction probability
– start-page: 657
  volume-title: Proceedings of the European Conference on Computer Vision (ECCV)
  year: 2018
  ident: 2021012203310032000_ref68
  article-title: HiDDeN: hiding data with deep networks
– volume: 6
  start-page: 39501
  year: 2018
  ident: 2021012203310032000_ref2
  article-title: A survey of clustering with deep learning: from the perspective of network architecture
  publication-title: IEEE Access
  doi: 10.1109/ACCESS.2018.2855437
– volume: 132
  start-page: 42
  year: 2018
  ident: 2021012203310032000_ref11
  article-title: Clustering of RNA-seq samples: comparison study on cancer data
  publication-title: Methods
  doi: 10.1016/j.ymeth.2017.07.023
– year: 2017
  ident: 2021012203310032000_ref85
– volume: 66
  start-page: 846
  issue: 336
  year: 1971
  ident: 2021012203310032000_ref78
  article-title: Objective criteria for the evaluation of clustering methods
  publication-title: J Amer Statist Assoc
  doi: 10.1080/01621459.1971.10482356
– year: 2018
  ident: 2021012203310032000_ref50
  article-title: Learning latent representations in neural networks for clustering through pseudo supervision and graph-based activity regularization
  publication-title: arXiv preprint
– volume: 2
  start-page: 83
  issue: 1-2
  year: 1955
  ident: 2021012203310032000_ref80
  article-title: The Hungarian method for the assignment problem
  publication-title: Naval Res Logist Quart
  doi: 10.1002/nav.3800020109
– start-page: 202
  volume-title: 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII)
  year: 2017
  ident: 2021012203310032000_ref31
  article-title: Multimodal autoencoder: a deep learning approach to filling in missing sensor data and enabling better mood prediction
  doi: 10.1109/ACII.2017.8273601
– volume: 3
  start-page: 583
  issue: Dec
  year: 2002
  ident: 2021012203310032000_ref75
  article-title: Cluster ensembles—a knowledge reuse framework for combining multiple partitions
  publication-title: J Mach Learn Res
– year: 2018
  ident: 2021012203310032000_ref58
  article-title: Learning mixtures of linear regressions with nearly optimal complexity
  publication-title: arXiv preprint
– volume: 9
  start-page: 3053
  issue: 1
  year: 2019
  ident: 2021012203310032000_ref37
  article-title: Tight clustering for large datasets with an application to gene expression data
  publication-title: Sci Rep
  doi: 10.1038/s41598-019-39459-w
– start-page: 1532
  volume-title: 22nd International Conference on Pattern Recognition
  year: 2014
  ident: 2021012203310032000_ref43
  article-title: Deep embedding network for clustering
– start-page: 410
  volume-title: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
  year: 2007
  ident: 2021012203310032000_ref81
  article-title: V-measure: a conditional entropy-based external cluster evaluation measure
– year: 2019
  ident: 2021012203310032000_ref56
  article-title: Convolutional neural network models for cancer type prediction based on gene expression
– start-page: 2172
  volume-title: Advances in Neural Information Processing Systems
  year: 2016
  ident: 2021012203310032000_ref69
  article-title: InfoGAN: interpretable representation learning by information maximizing generative adversarial nets
– volume: 1
  start-page: 38
  year: 2015
  ident: 2021012203310032000_ref7
  article-title: Clustering techniques in bioinformatics
  publication-title: Int J Modern Educ Comput Sci
  doi: 10.5815/ijmecs.2015.01.06
– start-page: 113
  volume-title: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
  year: 2019
  ident: 2021012203310032000_ref66
  article-title: Drug–drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network
  doi: 10.1145/3307339.3342161
– volume: 21
  start-page: 1
  issue: 1–3
  year: 1998
  ident: 2021012203310032000_ref17
  article-title: The self-organizing map
  publication-title: Neurocomputing
  doi: 10.1016/S0925-2312(98)00030-7
– year: 2015
  ident: 2021012203310032000_ref44
– volume: 9
  start-page: 497
  issue: 1
  year: 2008
  ident: 2021012203310032000_ref10
  article-title: Clustering cancer gene expression data: a comparative study
  publication-title: BMC Bioinform
  doi: 10.1186/1471-2105-9-497
– volume: 9
  start-page: 2579
  issue: Nov
  year: 2008
  ident: 2021012203310032000_ref73
  article-title: Visualizing data using t-SNE
  publication-title: J Mach Learn Res
– start-page: 849
  volume-title: Advances in Neural Information Processing Systems
  year: 2002
  ident: 2021012203310032000_ref26
  article-title: On spectral clustering: analysis and an algorithm
– start-page: 1
  year: 2011
  ident: 2021012203310032000_ref32
  article-title: Constructing super rule tree (SRT) for protein motif clusters using DBSCAN
  publication-title: Proceedings of the International Conference on Bioinformatics & Computational Biology (BIOCOMP)
– volume: 12
  start-page: 96
  issue: 1
  year: 2012
  ident: 2021012203310032000_ref86
  article-title: Basal breast cancer: a complex and deadly molecular subtype
  publication-title: Curr Mol Med
  doi: 10.2174/156652412798376134
– volume: 7
  start-page: 46450
  year: 2017
  ident: 2021012203310032000_ref82
  article-title: Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent
  publication-title: Sci Rep
  doi: 10.1038/srep46450
– start-page: 5884
  volume-title: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  year: 2011
  ident: 2021012203310032000_ref60
  article-title: Learning a better representation of speech soundwaves using restricted boltzmann machines
  doi: 10.1109/ICASSP.2011.5947700
– volume: 98
  start-page: 10869
  issue: 19
  year: 2001
  ident: 2021012203310032000_ref15
  article-title: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications
  publication-title: Proc Natl Acad Sci
  doi: 10.1073/pnas.191367098
– volume: 3
  start-page: 86
  year: 2014
  ident: 2021012203310032000_ref14
  article-title: Evaluating and analyzing clusters in data mining using different algorithms
  publication-title: Int J Comput Sci Mob Comput
– volume: 114
  start-page: 9814
  issue: 37
  year: 2017
  ident: 2021012203310032000_ref52
  article-title: Robust continuous clustering
  publication-title: Proc Natl Acad Sci
  doi: 10.1073/pnas.1700770114
– volume: 27
  start-page: 623
  issue: 4
  year: 2004
  ident: 2021012203310032000_ref5
  article-title: Comparative analysis of clustering methods for gene expression time course data
  publication-title: Genet Mol Biol
  doi: 10.1590/S1415-47572004000400025
– start-page: 59
  volume-title: European Conference on Principles of Data Mining and Knowledge Discovery
  year: 2005
  ident: 2021012203310032000_ref19
  article-title: Agglomerative hierarchical clustering with constraints: theoretical and empirical results
– volume: 17
  start-page: 977
  issue: 10
  year: 2001
  ident: 2021012203310032000_ref36
  article-title: Model-based clustering and data transformations for gene expression data
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/17.10.977
– year: 2019
  ident: 2021012203310032000_ref90
  article-title: OncoNetExplainer: explainable predictions of cancer types based on gene expression data
  doi: 10.1109/BIBE.2019.00081
– year: 2016
  ident: 2021012203310032000_ref1
  article-title: Clustering algorithms: their application to gene expression data
  publication-title: Bioinform Biol Insights
  doi: 10.4137/BBI.S38316
– year: 2015
  ident: 2021012203310032000_ref39
  article-title: Adversarial autoencoders
– start-page: 2837
  year: 2010
  ident: 2021012203310032000_ref77
  article-title: Information theoretic measures for clusterings comparison: variants, properties, normalization and correction for chance
  publication-title: J Mach Learn Res
– volume: 22
  start-page: 2405
  issue: 19
  year: 2006
  ident: 2021012203310032000_ref13
  article-title: Evaluation and comparison of gene clustering methods in microarray analysis
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btl406
– start-page: 68
  year: 1990
  ident: 2021012203310032000_ref20
  article-title: Partitioning around medoids (program pam)
  publication-title: Finding Groups in Data: An Introduction to Cluster Analysis.
  doi: 10.1002/9780470316801.ch2
– year: 2018
  ident: 2021012203310032000_ref95
  article-title: BERT: pre-training of deep bidirectional transformers for language understanding
– start-page: 1370
  issue: 11
  year: 2004
  ident: 2021012203310032000_ref6
  article-title: Cluster analysis for gene expression data: a survey
  publication-title: IEEE Trans Knowl Data Eng
  doi: 10.1109/TKDE.2004.68
– volume: 83
  start-page: 161
  year: 2018
  ident: 2021012203310032000_ref47
  article-title: Discriminatively boosted image clustering with fully convolutional auto-encoders
  publication-title: Pattern Recogn
  doi: 10.1016/j.patcog.2018.05.019
– start-page: 06321
  year: 2015
  ident: 2021012203310032000_ref42
  article-title: Neural network-based clustering using pairwise constraints
– volume: 45
  start-page: 1113
  issue: 10
  year: 2013
  ident: 2021012203310032000_ref89
  article-title: The cancer genome atlas pan-cancer analysis project
  publication-title: Nat Genet
  doi: 10.1038/ng.2764
– start-page: 5
  year: 2016
  ident: 2021012203310032000_ref53
  article-title: Variational deep embedding: a generative approach to clustering
– start-page: 28
  volume-title: Proceedings of the 17th International Conference on Pattern Recognition
  year: 2004
  ident: 2021012203310032000_ref22
  article-title: Improved adaptive Gaussian mixture model for background subtraction
– start-page: 1171
  year: 2008
  ident: 2021012203310032000_ref25
  article-title: Kernel methods in machine learning
  publication-title: Annals of Stat
  doi: 10.1214/009053607000000677
– volume: 17
  start-page: 763
  issue: 9
  year: 2001
  ident: 2021012203310032000_ref27
  article-title: An empirical study on principal component analysis for clustering gene expression data
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/17.9.763
SecondaryResourceType review_article
Snippet Clustering is central to many data-driven bioinformatics research and serves a powerful computational method. In particular, clustering helps at analyzing...
SourceID pubmedcentral
proquest
pubmed
crossref
oup
SourceType Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 393
SubjectTerms Algorithms
Artificial neural networks
Bioinformatics
Biological activity
Cellular structure
Centroids
Cluster analysis
Clustering
Computer applications
Data mining
Data points
Deep learning
Gene mapping
Genomics
Learning algorithms
Machine learning
Medical imaging
Neural networks
Representations
Self organizing maps
State-of-the-art reviews
Unstructured data
Title Deep learning-based clustering approaches for bioinformatics
URI https://www.ncbi.nlm.nih.gov/pubmed/32008043
https://www.proquest.com/docview/2529968924
https://www.proquest.com/docview/2350347765
https://pubmed.ncbi.nlm.nih.gov/PMC7820885
Volume 22
linkProvider Oxford University Press