Investigating the Role of LASSO in Feature Selection for Educational Data Mining (EDM) Applications
Published in | VFAST Transactions on Software Engineering Vol. 13; no. 2; pp. 56 - 67 |
---|---|
Main Authors | Khan, Mustafa Ahmed; Mahboob, Khalid; Yousuf, Urooj; Ramzan, Muhammad; Shaikh, Muhammad Taha; Salman Akber |
Format | Journal Article |
Language | English |
Published | 04.05.2025 |
ISSN | 2411-6246 2309-3978 |
DOI | 10.21015/vtse.v13i2.2111 |
Abstract | With the advent of digitalization, education-related activities generate massive amounts of data from sources such as student interactions, assessments, and learning management systems. Educational Data Mining (EDM) uses this data to reveal insights that support actionable improvements in academic outcomes and personalized learning experiences. However, the high dimensionality and redundancy of educational data threaten the accuracy, interpretability, and computational efficiency of models. The Least Absolute Shrinkage and Selection Operator (LASSO) is a powerful technique for simultaneous regression and feature selection. By penalizing the sum of the absolute values of the regression coefficients, LASSO induces sparsity and shrinks the coefficients of insignificant features to exactly zero. This property is useful in EDM, where relevant indicators such as attendance, quiz scores, or study patterns must be distinguished from noisy or redundant variables. This paper systematically investigates the application of LASSO in EDM, presenting its mathematical background and geometric interpretation along with practical usage recommendations. LASSO's performance is also evaluated on synthetic and real datasets, including the well-known UCI Student Performance dataset. The findings indicate that LASSO significantly improves model interpretability and predictive accuracy while reducing model complexity. The paper concludes with a discussion of limitations, practical considerations, and future directions for LASSO applications in next-generation educational analytics. |
---|---|
Author | Yousuf, Urooj; Ramzan, Muhammad; Shaikh, Muhammad Taha; Khan, Mustafa Ahmed; Mahboob, Khalid; Salman Akber |
Author_xml | – sequence: 1 givenname: Mustafa Ahmed orcidid: 0009-0008-5656-9163 surname: Khan fullname: Khan, Mustafa Ahmed – sequence: 2 givenname: Khalid orcidid: 0000-0001-7431-4430 surname: Mahboob fullname: Mahboob, Khalid – sequence: 3 givenname: Urooj orcidid: 0000-0002-9514-0078 surname: Yousuf fullname: Yousuf, Urooj – sequence: 4 givenname: Muhammad surname: Ramzan fullname: Ramzan, Muhammad – sequence: 5 givenname: Muhammad Taha surname: Shaikh fullname: Shaikh, Muhammad Taha – sequence: 6 surname: Salman Akber fullname: Salman Akber |
ContentType | Journal Article |
DBID | AAYXX CITATION |
DOI | 10.21015/vtse.v13i2.2111 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
EISSN | 2309-3978 |
EndPage | 67 |
ExternalDocumentID | 10_21015_vtse_v13i2_2111 |
GroupedDBID | AAYXX CITATION M~E |
ISSN | 2411-6246 |
IngestDate | Thu Jul 03 08:45:19 EDT 2025 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Issue | 2 |
Language | English |
LinkModel | OpenURL |
ORCID | 0000-0001-7431-4430 0009-0008-5656-9163 0000-0002-9514-0078 |
OpenAccessLink | https://vfast.org/journals/index.php/VTSE/article/download/2111/1686 |
PageCount | 12 |
ParticipantIDs | crossref_primary_10_21015_vtse_v13i2_2111 |
PublicationCentury | 2000 |
PublicationDate | 2025-05-04 |
PublicationDateYYYYMMDD | 2025-05-04 |
PublicationDate_xml | – month: 05 year: 2025 text: 2025-05-04 day: 04 |
PublicationDecade | 2020 |
PublicationTitle | VFAST Transactions on Software Engineering |
PublicationYear | 2025 |
SSID | ssib044764537 |
Score | 1.9093097 |
SourceID | crossref |
SourceType | Index Database |
StartPage | 56 |
Title | Investigating the Role of LASSO in Feature Selection for Educational Data Mining (EDM) Applications |
Volume | 13 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | ISSN International Centre |
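
The abstract describes LASSO as shrinking the coefficients of uninformative features to exactly zero. The following is a minimal sketch of that behaviour, not the authors' implementation: it assumes the common objective (1/(2n))·||y − Xw||² + α·||w||₁ without an intercept, solves it by plain coordinate descent with soft-thresholding, and runs on synthetic data. The feature names (`attendance`, `quiz_score`, `noise_a`, `noise_b`), the true coefficients, and the value `alpha=0.1` are illustrative assumptions and do not come from the paper or from the UCI Student Performance dataset.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; any value inside [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of feature j with the residual that excludes j's own effect.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic, hypothetical data: two informative features and two pure-noise features.
random.seed(0)
names = ["attendance", "quiz_score", "noise_a", "noise_b"]
n_samples = 200
X = [[random.gauss(0.0, 1.0) for _ in names] for _ in range(n_samples)]
y = [2.0 * row[0] + 1.0 * row[1] + random.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [name for name, w in zip(names, weights) if w != 0.0]

for name, w in zip(names, weights):
    print(f"{name}: {w:.3f}")
print("selected features:", selected)
```

In this sketch the penalty strength `alpha` controls how aggressively features are dropped: larger values zero out more coefficients, and the surviving coefficients are biased toward zero. In practice `alpha` would typically be chosen by cross-validation, and features would be standardized before fitting.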