Graph neural network-based long method and blob code smell detection


Bibliographic Details
Published in Science of Computer Programming, Vol. 243; p. 103284
Main Authors Zhang, Minnan; Jia, Jingdong; Capretz, Luiz Fernando; Hou, Xin; Tan, Huobin
Format Journal Article
Language English
Published Elsevier B.V. 01.07.2025
ISSN 0167-6423
DOI 10.1016/j.scico.2025.103284

Abstract
• We propose a graph neural network-based model for Long Method and Blob code smell detection.
• The best strategies for handling class imbalance in graph data and for graph pooling are determined experimentally.
• The model is designed for the abstract syntax tree of code and combines Euclidean and non-Euclidean space.
• Experiments show that the proposed method outperforms machine learning and deep learning methods.

The concept of code smell was first proposed in the late 1990s to refer to signs that code may need refactoring. Although a code smell does not necessarily affect functionality, it can hinder the understandability and future scalability of a program. As a result, the precise detection of code smells has become an important research topic. However, current detection methods are limited by datasets that are imbalanced and not industry-relevant, by insufficient structural and logical information about the code, and by simple model architectures. Given these limitations, this paper uses an industry-relevant and sufficient dataset and develops a graph neural network to better detect code smells. First, we selected Long Method and Blob as our research subjects because of their frequent occurrence and their impact on software maintainability. We then designed modified fuzzy sampling with focal loss to address data imbalance. Second, to deal with the large volume of data, we proposed a global and local attention scoring mechanism to extract the key information from the code. Third, to design a graph neural network specifically for the abstract syntax tree of code, we combined Euclidean space and non-Euclidean space. Finally, we compared our method with other machine learning and deep learning methods. The results show that our method outperforms the other methods on both Long Method and Blob, which indicates its effectiveness.
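The abstract names focal loss as part of the class-imbalance strategy but gives no formula. As a rough illustration only, the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), can be sketched as below. This is the generic textbook form, not the authors' "modified fuzzy sampling with focal loss"; the function names and the default values of `gamma` and `alpha` are assumptions chosen for the example.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Standard binary focal loss for a single prediction (illustrative sketch).

    p     -- predicted probability of the positive ("smelly") class
    y     -- true label, 0 or 1
    gamma -- focusing parameter; larger values down-weight easy examples more
    alpha -- weight of the positive class (the negative class gets 1 - alpha)
    """
    # Probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of predictions."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` the focusing term disappears and the expression reduces to class-weighted cross-entropy, which is one way to sanity-check an implementation. How the paper combines this loss with its sampling scheme is not stated in this record.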
Author Affiliations
Zhang, Minnan: School of Software, Beihang University, Beijing 100191, China
Jia, Jingdong (jiajingdong@buaa.edu.cn): School of Software, Beihang University, Beijing 100191, China
Capretz, Luiz Fernando: Department of Electrical & Computer Engineering, Western University, London N6A5B9, Ontario, Canada
Hou, Xin: School of Software, Beihang University, Beijing 100191, China
Tan, Huobin: School of Software, Beihang University, Beijing 100191, China
Copyright 2025
Discipline Computer Science
Open Access Yes
Peer Reviewed Yes
Keywords Code smell; Class imbalance; Graph pooling; Graph neural network; Hyperbolic space
License This is an open access article under the CC BY-NC license.
Open Access Link https://www.sciencedirect.com/science/article/pii/S0167642325000231
  publication-title: 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS)
– start-page: 25
  year: 2005
  end-page: 30
  ident: bib0096
  article-title: iPlasma: an integrated platform for quality assessment of object-oriented design
  publication-title: 21st IEEE International Conference on Software Maintenance (ICSM)
– volume: 204
  year: 2022
  ident: bib0048
  article-title: Automatic detection of long method and god class code smells through neural source code embeddings
  publication-title: Expert. Syst. Appl.
– start-page: 1
  year: 2016
  end-page: 10
  ident: bib0044
  article-title: A textual-based technique for smell detection
  publication-title: 2016 IEEE 24th International Conference on Program Comprehension (ICPC)
– start-page: 93
  year: 2019
  end-page: 104
  ident: bib0020
  article-title: Comparing heuristic and machine learning approaches for metric-based code smell detection
  publication-title: 2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC)
– start-page: 297
  year: 2016
  end-page: 308
  ident: bib0065
  article-title: Automatically learning semantic features for defect prediction
  publication-title: 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)
– volume: 42
  start-page: 544
  year: 2016
  end-page: 558
  ident: bib0046
  article-title: Dynamic and automatic feedback-based threshold adaptation for code smell detection
  publication-title: IEEE Trans. Softw. Eng.
– volume: 30
  start-page: 1359
  year: 2019
  end-page: 1374
  ident: bib0066
  article-title: God Class detection approach based on deep learning
  publication-title: J. Softw.
– start-page: 42
  year: 2022
  end-page: 47
  ident: bib0070
  article-title: Multi-label code smell detection with hybrid model based on deep learning
  publication-title: 2022 International Conference on Software Engineering and Knowledge Engineering
– start-page: 257
  year: 2022
  end-page: 266
  ident: bib0019
  article-title: Code smell detection using classification approaches, in: intelligent Systems
  publication-title: Lecture Notes in Networks and Systems
– volume: 14
  start-page: 6149
  year: 2024
  ident: bib0061
  article-title: Machine learning-based methods for code smell detection: a survey
  publication-title: Appl. Sci.
– start-page: 1
  year: 2022
  end-page: 4
  ident: bib0050
  article-title: Dimensionally reduction based machine learning approaches for code smells detection
  publication-title: 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP)
– volume: 43
  start-page: 2667
  year: 2022
  end-page: 2674
  ident: bib0006
  article-title: Research on developer perceived code smell intensity prediction model based on LightGBM and CFS
  publication-title: J. Chin. Comput. Syst.
– volume: 8
  start-page: 23
  year: 2015
  end-page: 28
  ident: bib0013
  article-title: Sequential ordering of code smells and usage of heuristic algorithm
  publication-title: Indian J. Sci. Technol.
– start-page: 311
  year: 2019
  end-page: 317
  ident: bib0023
  article-title: Exercise difficulty prediction in online education systems
  publication-title: 2019 International Conference on Data Mining Workshops (ICDMW)
– volume: 35
  start-page: 6651
  year: 2023
  end-page: 6672
  ident: bib0080
  article-title: An investigation of SMOTE based methods for imbalanced datasets with data complexity analysis
  publication-title: IEEe Trans. Knowl. Data Eng.
– volume: 139
  year: 2025
  ident: bib0062
  article-title: Ensemble methods with feature selection and data balancing for improved code smells classification performance
  publication-title: Eng. Appl. Artif. Intell.
– start-page: 318
  year: 2017
  end-page: 328
  ident: bib0063
  article-title: Software defect prediction via convolutional neural network
  publication-title: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS)
– volume: 176
  year: 2021
  ident: bib0031
  article-title: Code smell detection by deep direct-learning and transfer-learning
  publication-title: J. Syst. Softw.
– volume: 138
  year: 2021
  ident: bib0059
  article-title: Code smell detection using feature selection and stacking ensemble: an empirical investigation
  publication-title: Inf. Softw. Technol.
– start-page: 305
  year: 2009
  end-page: 314
  ident: bib0018
  article-title: A bayesian approach for the detection of code and design smells
  publication-title: 2009 Ninth International Conference on Quality Software
– volume: 725
  start-page: 77
  year: 2023
  end-page: 86
  ident: bib0056
  article-title: Method-level code smells detection using machine learning models
  publication-title: Lect. Notes Netw. Syst.
– volume: 141
  start-page: 117
  year: 2005
  end-page: 136
  ident: bib0017
  article-title: Adaptive detection of design flaws
  publication-title: Electr. Notes Theor. Comput. Sci.
– start-page: 682
  year: 2013
  end-page: 691
  ident: bib0004
  article-title: Exploring the impact of inter-smell relations on software maintainability: an empirical study
  publication-title: 2013 International Conference on Software Engineering
– start-page: 219
  year: 2019
  end-page: 232
  ident: bib0024
  article-title: Deep learning-based vulnerable function detection: a benchmark
  publication-title: 21st International Conference on Information and Communications Security (ICICS)
– start-page: 482
  year: 2015
  end-page: 485
  ident: bib0090
  article-title: Landfill: an open dataset of code smells with public evaluation
  publication-title: 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories
– volume: 23
  start-page: 1188
  year: 2018
  end-page: 1221
  ident: bib0078
  article-title: On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation
  publication-title: Empir. Softw. Eng.
– start-page: 439
  year: 2015
  end-page: 444
  ident: bib0093
  article-title: Combining feature subset selection and data sampling for coping with highly imbalanced software data
  publication-title: International Conference on Software Engineering and Knowledge Engineering
StartPage 103284
SubjectTerms Class imbalance
Code smell
Graph neural network
Graph pooling
Hyperbolic space
Title Graph neural network-based long method and blob code smell detection
URI https://dx.doi.org/10.1016/j.scico.2025.103284
Volume 243