Malicious URL Detection Using Decision Tree-based Lexical Features Selection and Multilayer Perceptron Model

Bibliographic Details
Published in UHD Journal of Science and Technology Vol. 6; no. 2; pp. 105 - 116
Main Authors Ahmed, Warmn, Noor Ghazi M. Jameel
Format Journal Article
Language English
Published University of Human Development 13.11.2022
Abstract Network information security risks are multiplying and becoming more dangerous. Hackers today generally target end-to-end technology and take advantage of human weaknesses. They also exploit weaknesses in the technology itself, applying a variety of attack methods. Malicious URLs are now one of the greatest dangers to the modern digital world, and stopping them is one of the biggest challenges in cyber security. Detecting harmful URLs with machine learning and deep learning algorithms has been the subject of various academic papers. However, time and accuracy remain the two biggest challenges for these tools. This paper proposes a multilayer perceptron (MLP) model that relies on two design choices to make it more practical, lightweight, and fast: using only lexical features, and using a decision tree (DT) algorithm to select the most relevant subset of features. The experimental outcomes are evaluated in terms of time, accuracy, and error reduction. The results show that an MLP model using 35 features achieves an accuracy of 94.51% with URL lexical features alone. Furthermore, applying the DT for feature selection reduces the model's run time and yields a slight improvement in accuracy and loss.
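As a rough illustration of what "lexical features" means in the abstract, the sketch below derives a handful of numeric features from the URL string alone, without fetching the page or querying any external service. The feature names, the helper functions `shannon_entropy`, `looks_like_ipv4`, and `lexical_features`, and the choice of features are all assumptions made for this example; this record does not list the 35 features the paper actually uses.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_ipv4(host: str) -> bool:
    """True if the host is four dot-separated integers in the range 0..255."""
    parts = host.split(".")
    return len(parts) == 4 and all(p.isdigit() and int(p) <= 255 for p in parts)


def lexical_features(url: str) -> dict:
    """Compute a few illustrative (hypothetical) lexical features of a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parsed.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_query_params": len(parsed.query.split("&")) if parsed.query else 0,
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        "host_is_ipv4": int(looks_like_ipv4(host)),
        "url_entropy": shannon_entropy(url),
    }


if __name__ == "__main__":
    for name, value in lexical_features("http://192.168.0.1/login.php?user=a&id=7").items():
        print(f"{name}: {value}")
```

In a pipeline like the one the abstract describes, a feature table built this way would then be ranked by a decision tree so that only the most informative columns are passed to the MLP classifier; that selection and training step is not shown here.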
DOI 10.21928/uhdjst.v6n2y2022.pp105-116
EISSN 2521-4217
ISSN 2521-4209
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
License http://creativecommons.org/licenses/by-nc-nd/4.0
OpenAccessLink https://doaj.org/article/8e737b2e566e46168f86de25e1d0c8d2
PageCount 12
Subject Terms feature selection; lexical feature; malicious URL; multilayer perceptron; synthetic minority oversampling technique