Gene Expression Data Analysis using Supervised Machine Learning Algorithm in Data Mining for Breast Cancer Prediction
Published in | 2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN) pp. 550 - 553 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 03.07.2024 |
Subjects | |
Summary: | The human body is composed of cells, and each cell has a variety of components. Cells are essential to both the structure and the functioning of living organisms. Cells in the human body divide and self-destruct as necessary. This natural process can occasionally go wrong; for example, certain cells may grow uncontrollably or abnormally, giving rise to tumors. When these tumors become malignant and spread to nearby parts of the body, the condition is known as cancer. Specifically, when cells in the breast grow uncontrollably, the condition is known as breast cancer. A gene is a piece of genetic material present in every human cell. Genes code for the proteins that determine how each cell functions, and the process of producing these proteins is called gene expression. Occasionally, variations in the genetic material, or mutated genes, alter gene expression values. Breast cancer is caused by these altered gene expression levels. Diagnostic techniques are needed to detect the altered gene expression associated with breast cancer, and they are highly effective in helping medical professionals pinpoint the disease's biomarkers. The goal of this research is therefore to use gene expression data to build a model that can accurately identify breast cancer biomarkers. The proposed model, a pruned neural network, is designed to predict breast cancer while requiring less memory and execution time and offering the best accuracy. It is evaluated against existing techniques, including support vector machines and decision trees, and outperforms them with an accuracy of 98%. A drawback of gene expression data is that it has a large number of gene features and few samples. Wrapper approaches are therefore employed as a preprocessing strategy to remove undesirable genes. The gene expression data is preprocessed before being applied to both the existing and the proposed models. |
---|---|
DOI: | 10.1109/ICIPCN63822.2024.00096 |
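
The summary names wrapper approaches as the preprocessing step but does not say which one is used. As a rough illustration only, the sketch below shows one common wrapper variant, greedy forward selection, where each candidate gene is scored by the leave-one-out accuracy of a simple classifier trained on the genes chosen so far. The nearest-centroid classifier, the function names, and the synthetic data (20 samples, 30 genes, one informative gene at index 0) are all assumptions made for the example and are not taken from the paper.

```python
import random


def nearest_centroid_loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier
    restricted to the given feature (gene) indices."""
    n = len(X)
    correct = 0
    for i in range(n):
        # Build per-class centroids from every sample except sample i.
        sums, counts = {}, {}
        for j in range(n):
            if j == i:
                continue
            vec = sums.setdefault(y[j], [0.0] * len(features))
            for k, f in enumerate(features):
                vec[k] += X[j][f]
            counts[y[j]] = counts.get(y[j], 0) + 1
        # Assign sample i to the class with the closest centroid.
        best_label, best_dist = None, float("inf")
        for label in sorted(sums):
            dist = sum(
                (X[i][f] - sums[label][k] / counts[label]) ** 2
                for k, f in enumerate(features)
            )
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / n


def forward_select(X, y, max_features):
    """Greedy wrapper selection: repeatedly add the gene that most improves
    the classifier's accuracy, and stop when no gene improves it."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        # Ties on accuracy are broken in favour of the lowest gene index.
        score, neg_f = max(
            (nearest_centroid_loo_accuracy(X, y, selected + [f]), -f)
            for f in remaining
        )
        if score <= best_score:
            break
        selected.append(-neg_f)
        remaining.remove(-neg_f)
        best_score = score
    return selected, best_score


# Hypothetical data: few samples, many genes, only gene 0 carries signal.
rng = random.Random(0)
n_samples, n_genes = 20, 30
y = [i % 2 for i in range(n_samples)]
X = [[rng.uniform(-1, 1) for _ in range(n_genes)] for _ in range(n_samples)]
for row, label in zip(X, y):
    row[0] += 6.0 * label  # shift gene 0 for class 1 so the classes separate

selected, score = forward_select(X, y, max_features=5)
print("selected genes:", selected, "LOO accuracy:", score)
```

Because the classifier is retrained for every candidate gene, a wrapper of this kind is expensive on real expression data with thousands of genes, which is one reason a cheap inner classifier is used here. The reduced gene set would then be passed to whichever downstream model is being compared.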