Malicious URL Detection Using Decision Tree-based Lexical Features Selection and Multilayer Perceptron Model

Bibliographic Details
Published in	UHD Journal of Science and Technology Vol. 6; no. 2; pp. 105 - 116
Main Authors	Ahmed, Warmn; Noor Ghazi M. Jameel
Format	Journal Article
Language	English
Published	University of Human Development 13.11.2022
Subjects	feature selection; lexical feature; malicious URL; multilayer perceptron; synthetic minority oversampling technique
Abstract	Network information security risks are multiplying and becoming more dangerous. Attackers today generally target end-to-end technology and exploit human weaknesses; they also exploit technical weaknesses using a variety of attack methods. Malicious URLs are among the greatest dangers to the modern digital world, and stopping them is one of the biggest challenges in cyber security. Detecting harmful URLs with machine learning and deep learning algorithms has been the subject of various academic papers. However, time and accuracy are the two biggest challenges for these tools. This paper proposes a multilayer perceptron (MLP) model that relies on two design choices to make it more practical, lightweight, and fast: using only lexical features, and using a decision tree (DT) algorithm to select the most relevant subset of features. The experimental outcomes are evaluated in terms of time, accuracy, and error reduction. The results show that an MLP model using 35 features achieved an accuracy of 94.51% with URL lexical features alone. Furthermore, applying the DT for feature selection reduces the model's time, with a slight improvement in accuracy and loss.
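The two ideas named in the abstract, lexical features and tree-based feature selection, can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the record does not list the paper's 35 features, so the seven features below are hypothetical, and the ranking uses only the Gini impurity reduction of the best single threshold split (the criterion a decision tree evaluates at its root), not the authors' actual DT procedure. The MLP stage is omitted.

```python
from urllib.parse import urlparse


def lexical_features(url):
    """Return a small, hypothetical set of lexical features for one URL."""
    # Add a scheme when missing so urlparse can locate the host part.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
    }


def gini(labels):
    """Gini impurity of a list of binary labels (0 = benign, 1 = malicious)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini impurity reduction achievable with one threshold split."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try every distinct value except the largest as a "<= threshold" split.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(urls, labels):
    """Rank the lexical features by their best single-split impurity reduction."""
    rows = [lexical_features(u) for u in urls]
    scores = {
        name: best_split_gain([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In a fuller pipeline, the top-ranked features would be kept and passed to an MLP classifier; a library decision tree would also account for splits below the root, which this sketch ignores.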
DOI	10.21928/uhdjst.v6n2y2022.pp105-116
EISSN	2521-4217
EndPage	116
ISSN	2521-4209
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	2
Language	English
License	http://creativecommons.org/licenses/by-nc-nd/4.0
OpenAccessLink	https://doaj.org/article/8e737b2e566e46168f86de25e1d0c8d2
PageCount	12
PublicationDate	2022-11-13
PublicationTitle	UHD Journal of Science and Technology
PublicationYear	2022
Publisher	University of Human Development
StartPage	105
Volume	6