Scalable Graph Learning with Graph Convolutional Networks and Graph Attention Networks: Addressing Class Imbalance Through Augmentation and Optimized Hyperparameter Tuning

Bibliographic Details
Published in International journal of advanced computer science & applications Vol. 16; no. 7
Main Authors Touate, Chaima Ahle; Ayachi, Rachid El; Biniz, Mohamed
Format Journal Article
Language English
Published West Yorkshire Science and Information (SAI) Organization Limited 2025
ISSN 2158-107X
EISSN 2156-5570
DOI 10.14569/IJACSA.2025.0160740

Abstract In this study, we propose a graph-based node classification approach that addresses data scarcity, class imbalance, limited access to original textual content in benchmark datasets, semantic preservation, and model generalization. Going beyond simple data replication, we enhanced the Cora dataset by extracting content from its original PostScript files and augmenting it with a three-part framework that combines, in a single pipeline, NLP-based techniques: PEGASUS paraphrasing, synthetic model generation, and controlled subject-aware synonym replacement. This expanded the dataset to 17,780 nodes, roughly a 6.57x increase, while maintaining semantic fidelity (WMD scores: 0.27-0.34). Bayesian hyperparameter tuning was conducted with Optuna, together with k-fold cross-validation, to provide a rigorous model validation protocol. Our Graph Convolutional Network (GCN) model achieves 95.42% accuracy and our Graph Attention Network (GAT) reaches 93.46%, even on a dataset significantly larger than the base. Our empirical analysis shows that semantic-preserving augmentation improved performance while maintaining model stability across scaled datasets, offering a cost-effective alternative to architectural complexity and making graph learning accessible in resource-constrained environments.
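For readers unfamiliar with the GCN model named in the abstract, the following is a minimal sketch of a single graph convolution step, using the widely cited propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an illustration only: this record does not describe the authors' architecture, layer sizes, or training code, and the names `gcn_layer` and `matmul` are hypothetical helpers written in plain Python for clarity rather than speed.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One GCN step (illustrative, not the authors' code).

    adj:      n x n adjacency matrix (0/1 entries, symmetric)
    features: n x f node feature matrix X
    weights:  f x h weight matrix W
    """
    n = len(adj)
    # Add self-loops so every node keeps a share of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Node degrees in the self-looped graph.
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric normalization: D^-1/2 (A + I) D^-1/2.
    norm = [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, apply the linear map, then ReLU.
    out = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


# Two nodes joined by one edge, each with a single scalar feature.
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
print(gcn_layer(adj, features, [[1.0]]))
```

In this toy graph both nodes have degree 2 after self-loops are added, so each output is the average of the two input features. A node classifier would stack such layers and end with a softmax over the class labels.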
Copyright 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Discipline Computer Science; Architecture
Subjects Approximation; Architecture; Artificial neural networks; Attention; Classification; Computer science; Data replication; Datasets; Empirical analysis; Innovations; Machine learning; Natural language processing; Neural networks; Semantics; Text categorization; Tuning
URI https://www.proquest.com/docview/3240918307