Breast tumor prediction and feature importance score finding using machine learning algorithms

Bibliographic Details
Published in Radìoelektronnì ì komp'ûternì sistemi (Online) no. 4; pp. 32-42
Main Authors Kabir, Sk. Shalauddin; Ahmmed, Md. Sabbir; Siddique, Md. Moradul; Ema, Romana Rahman; Rahman, Motiur; Md. Galib, Syed
Format Journal Article
Language English
Published National Aerospace University «Kharkiv Aviation Institute» 06.12.2023
ISSN 1814-4225
EISSN 2663-2012
DOI 10.32620/reks.2023.4.03

Abstract The subject of this study is breast tumor prediction and feature importance scoring using machine learning algorithms. The goal was to develop an accurate predictive model for identifying breast tumors and to determine the importance of individual features in the prediction process. The tasks included collecting and preprocessing the Wisconsin Breast Cancer original dataset (WBCD); dividing the dataset into training and testing sets; training machine learning algorithms, namely Random Forest, Decision Tree (DT), Logistic Regression, Multi-Layer Perceptron, Gradient Boosting Classifier (GBC), and K-Nearest Neighbors; evaluating the models using performance metrics; and calculating feature importance scores. The methods used involve data collection, preprocessing, model training, and evaluation. The results showed that the Random Forest model was the most reliable predictor, with 98.56% accuracy. The dataset originally contained 699 instances, and 461 instances remained after data optimization methods were applied. In addition, the top features of the dataset were ranked by feature importance score to determine how they affect the classification models. Furthermore, a 10-fold cross-validation process was applied for performance analysis and comparison. The conclusions drawn from this study highlight the effectiveness of machine learning algorithms in breast tumor prediction, with high accuracy and robust performance metrics. In addition, the analysis of feature importance scores provides insight into the key indicators of breast cancer development. These findings contribute to breast cancer diagnosis and prediction by supporting early detection and personalized treatment strategies and by improving patient outcomes.
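The 10-fold cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, a small synthetic two-class dataset in place of the WBCD, and a hand-written K-Nearest Neighbors classifier. The helper names (make_toy_data, knn_predict, k_fold_indices, cross_val_accuracy) and the choices k=5 and nine features are assumptions made for illustration.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=50, n_features=9, seed=0):
    """Synthetic two-class data (NOT the WBCD): class 0 is centered at 2, class 1 at 7."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 2.0), (1, 7.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(X_train, y_train)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_val_accuracy(X, y, n_folds=10, k=5):
    """Hold out each fold once, fit on the remaining folds, and return per-fold accuracy."""
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds):
        held_out = set(test_idx)
        X_tr = [X[i] for i in range(len(X)) if i not in held_out]
        y_tr = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(knn_predict(X_tr, y_tr, X[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


X, y = make_toy_data()
scores = cross_val_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
print(f"mean accuracy over {len(scores)} folds: {mean_accuracy:.3f}")
```

Every instance is used for testing exactly once and for training in the other nine rounds, which is why cross-validated accuracy is usually a steadier basis for comparing classifiers than a single train/test split. The figures printed by this sketch describe the synthetic data only and say nothing about the results reported in the article.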
Author_xml – sequence: 1
  givenname: Sk. Shalauddin
  orcidid: 0000-0002-0031-8807
  surname: Kabir
  fullname: Kabir, Sk. Shalauddin
– sequence: 2
  givenname: Md. Sabbir
  orcidid: 0009-0001-3048-3440
  surname: Ahmmed
  fullname: Ahmmed, Md. Sabbir
– sequence: 3
  givenname: Md. Moradul
  orcidid: 0000-0003-3264-5383
  surname: Siddique
  fullname: Siddique, Md. Moradul
– sequence: 4
  givenname: Romana Rahman
  orcidid: 0000-0002-2384-9539
  surname: Ema
  fullname: Ema, Romana Rahman
– sequence: 5
  givenname: Motiur
  orcidid: 0009-0007-5345-9818
  surname: Rahman
  fullname: Rahman, Motiur
– sequence: 6
  givenname: Syed
  orcidid: 0000-0002-5708-727X
  surname: Md. Galib
  fullname: Md. Galib, Syed
ContentType Journal Article
DOI 10.32620/reks.2023.4.03
Discipline Engineering
EISSN 2663-2012
EndPage 42
ISSN 1814-4225
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
OpenAccessLink https://doaj.org/article/273e5bfe17444e33a379eed23dd1a602
PageCount 11
PublicationCentury 2000
PublicationDate 2023-12-06
PublicationDateYYYYMMDD 2023-12-06
PublicationDate_xml – month: 12
  year: 2023
  text: 2023-12-06
  day: 06
PublicationDecade 2020
PublicationTitle Radìoelektronnì ì komp'ûternì sistemi (Online)
PublicationYear 2023
Publisher National Aerospace University «Kharkiv Aviation Institute»
Publisher_xml – name: National Aerospace University «Kharkiv Aviation Institute»
StartPage 32
SubjectTerms benign
breast tumor
classification model
data optimization
machine learning
malignant
tumor
Title Breast tumor prediction and feature importance score finding using machine learning algorithms
URI https://doaj.org/article/273e5bfe17444e33a379eed23dd1a602