Comparative Study of Machine Learning Models on Multiple Breast Cancer Datasets

Bibliographic Details
Published in	International Journal of Advanced Science Computing and Engineering Vol. 5; no. 1; pp. 15 - 24
Main Authors	Hussain Sujon, Md. Arman, Mustafa, Hossen
Format	Journal Article
Language	English
Published	16.01.2023

Summary:	Breast carcinoma is one of the most feared and most frequently occurring cancers among women, affecting nearly 10% of women worldwide at some point in their lives. Although treatment is available, it is not effective enough if the disease is not identified at an early stage. Breast cancer is generally identified with contemporary medical tests such as roentgenograms (X-ray imaging), breast ultrasound, and biopsy. As an alternative, researchers are exploring machine learning techniques for classifying tumours, e.g., as benign or malignant. Classification and data processing strategies can be effective mechanisms for predicting cancer. In this paper, we analyze six classification models (Decision Tree, K-Nearest Neighbours, Random Forest, Logistic Regression, Extra Trees, and Support Vector Machine) on three different datasets. We applied simple principal component analysis (PCA) to reduce the dimensionality of the datasets. Experimental results show that Random Forest obtained the best accuracy, recall, and F1 score among the six classification techniques on all three datasets. We also find that data attributes and values are important for accurate classification.
ISSN:	2714-7533
DOI:	10.62527/ijasce.5.1.105
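
The comparison described in the summary (PCA for dimensionality reduction, followed by six classifiers scored on accuracy, recall, and F1) could be sketched roughly as below. This is an illustrative sketch, not the authors' code: the dataset (scikit-learn's bundled Wisconsin Diagnostic breast cancer data), the feature scaling step, the number of PCA components, the 80/20 split, the random seed, and all hyperparameters are assumptions, since the record does not specify them.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

SEED = 0          # assumption: fixed seed for repeatability
N_COMPONENTS = 5  # assumption: the record does not state how many components were kept

# Assumption: one stand-in dataset; the paper uses three datasets not named in this record.
X, y = load_breast_cancer(return_X_y=True)  # labels: 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

# The six model families named in the summary, with default-ish settings (assumed).
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=SEED),
    "K Nearest Neighbours": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=SEED),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Extra Trees": ExtraTreesClassifier(random_state=SEED),
    "Support Vector Machine": SVC(),
}

results = {}
for name, model in models.items():
    # Scale, project onto the leading principal components, then classify.
    pipeline = make_pipeline(StandardScaler(), PCA(n_components=N_COMPONENTS), model)
    pipeline.fit(X_train, y_train)
    predicted = pipeline.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        # Treat malignant (label 0) as the positive class for recall and F1.
        "recall": recall_score(y_test, predicted, pos_label=0),
        "f1": f1_score(y_test, predicted, pos_label=0),
    }

for name, scores in results.items():
    print(f"{name:24s} acc={scores['accuracy']:.3f} "
          f"recall={scores['recall']:.3f} f1={scores['f1']:.3f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier keeps both fitted on the training split only, so the test split does not leak into the projection. Which model ranks best in this sketch depends on the assumed settings and need not match the paper's reported ordering.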