Investigating the Role of LASSO in Feature Selection for Educational Data Mining (EDM) Applications

Bibliographic Details
Published in VFAST Transactions on Software Engineering Vol. 13; no. 2; pp. 56-67
Main Authors Khan, Mustafa Ahmed; Mahboob, Khalid; Yousuf, Urooj; Ramzan, Muhammad; Shaikh, Muhammad Taha; Salman Akber
Format Journal Article
Language English
Published 04.05.2025
ISSN 2411-6246
EISSN 2309-3978
DOI 10.21015/vtse.v13i2.2111

Abstract With the advent of digitalization, education-related activities generate massive amounts of data from sources such as student interactions, assessments, and learning management systems. These data are well suited to Educational Data Mining (EDM), which seeks insights that lead to actionable improvements in academic outcomes and personalized learning experiences. However, the high dimensionality and redundancy of educational data threaten the accuracy, interpretability, and computational efficiency of models. The Least Absolute Shrinkage and Selection Operator (LASSO) is a powerful technique for simultaneous regression and feature selection. LASSO penalizes the sum of the absolute values of the regression coefficients, which induces sparsity by shrinking the coefficients of insignificant features to exactly zero. This property is useful in EDM, where relevant indicators such as attendance, quiz scores, or study patterns must be distinguished from noisy or redundant variables. This paper systematically investigates the application of LASSO in EDM, presenting its mathematical background and geometric interpretation along with practical usage recommendations. LASSO's performance is evaluated on synthetic and real datasets, including the well-known UCI Student Performance dataset. The findings show that LASSO significantly improves model interpretability and predictive accuracy while reducing model complexity. Finally, limitations, practical considerations, and future directions for LASSO applications in next-generation educational analytics are discussed.
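
The sparsity mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common LASSO objective (1/(2n)) * sum of squared residuals + lam * sum of |coefficients|, solved by coordinate descent with soft-thresholding, and it uses a hypothetical toy dataset. The feature names, the +1/-1 standardized values, the true effects (2.0, 1.0, 0.05, 0.0), and the penalty lam = 0.1 are all illustrative assumptions, not values from the paper.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; return exactly 0.0 when |rho| <= lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100, tol=1e-10):
    """Minimize (1/(2n)) * sum((y - X b)^2) + lam * sum(|b|) without an intercept.

    Assumes centered data and no all-zero column in X.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Mean squared value of each column (the per-coordinate curvature).
    z = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            # Correlation of feature j with the partial residual, i.e. the
            # residual with feature j's own contribution added back in.
            rho = 0.0
            for row, target in zip(X, y):
                pred = sum(b * v for b, v in zip(beta, row))
                rho += row[j] * (target - pred + beta[j] * row[j])
            rho /= n
            new_bj = soft_threshold(rho, lam) / z[j]
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break
    return beta


# Hypothetical standardized indicators: +1 = above average, -1 = below average.
# The columns are mutually orthogonal, which keeps the toy example easy to check.
features = ["attendance", "quiz_score", "forum_clicks", "login_hour"]
X = [
    [1, 1, 1, 1],
    [-1, 1, 1, -1],
    [1, -1, 1, -1],
    [-1, -1, 1, 1],
    [1, 1, -1, 1],
    [-1, 1, -1, -1],
    [1, -1, -1, -1],
    [-1, -1, -1, 1],
]
# Assumed outcome: strong attendance effect, moderate quiz effect,
# a very weak forum effect, and no login-hour effect.
y = [2.0 * r[0] + 1.0 * r[1] + 0.05 * r[2] for r in X]

coef = lasso_coordinate_descent(X, y, lam=0.1)
selected = [name for name, b in zip(features, coef) if b != 0.0]
print(dict(zip(features, coef)))
print("selected:", selected)
```

Because the toy columns are orthogonal with unit mean square, each coefficient is simply the soft-thresholded version of its true effect: the two strong indicators are shrunk by about 0.1 each, while the weak and irrelevant ones are set exactly to zero and drop out of the model. Real educational data are rarely orthogonal, so in practice the penalty would typically be chosen by cross-validation rather than fixed in advance.
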
Author ORCID
Khan, Mustafa Ahmed 0009-0008-5656-9163
Mahboob, Khalid 0000-0001-7431-4430
Yousuf, Urooj 0000-0002-9514-0078
Open Access Link https://vfast.org/journals/index.php/VTSE/article/download/2111/1686