Multi-cancer classification; An analysis of neural network models

Bibliographic Details
Published in Machine learning with applications Vol. 12; p. 100468
Main Authors Webber, James W.; Elias, Kevin
Format Journal Article
Language English
Published Elsevier Ltd 15.06.2023
Summary: Cancer identification is generally framed as binary classification, typically discriminating a control group from a single cancer group. However, such models lack any cancer-specific information, as they are trained on only one cancer type, and they fail to account for competing cancer risks. Pan-cancer evaluation requires a model trained on multiple cancer types and controls simultaneously, so that a physician can be directed to the correct area of the body for further testing. We investigate neural network models for multi-cancer classification across several data types commonly used in cancer prediction, including circulating miRNA expression, protein, and mRNA. In particular, we analyze neural network depth and type, and how these relate to classification performance. Our comparisons include several state-of-the-art neural networks from the literature. We provide details on the optimal network depth and type, the activation functions, and the layer sizes. Our analysis shows that shallow (i.e., 1- or 2-layer) feed-forward neural network architectures offer greater mean sensitivity and precision than deeper (i.e., >2-layer) feed-forward models, Convolutional Neural Network (CNN), and Graph CNN (GCNN) architectures, across a range of measurement technologies in cancer prediction (e.g., miRNA, mRNA, and protein). We also find that hyperbolic tangent activation functions offer the most consistent performance, and that the optimal feed-forward models have a descending layer-size structure.
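As a rough illustration of the kind of model the summary describes, the sketch below builds a shallow feed-forward classifier with hyperbolic tangent activations and hidden layers of decreasing size, and computes class-averaged sensitivity and precision. It is not the authors' implementation. The function names, the layer sizes, the random initialization, the reading of "descending" as strictly decreasing, and the reading of "mean sensitivity and precision" as an unweighted mean over classes are all assumptions made for this example; training is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer (illustrative)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one dense layer to input vector x, with an optional activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def build_shallow_net(n_features, hidden_sizes, n_classes, seed=0):
    """Build an untrained network; hidden sizes must decrease (our assumption)."""
    if any(a <= b for a, b in zip(hidden_sizes, hidden_sizes[1:])):
        raise ValueError("hidden sizes must strictly decrease")
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_proba(net, x):
    """Forward pass: tanh on hidden layers, softmax on the output layer."""
    for layer in net[:-1]:
        x = dense(x, layer, math.tanh)
    return softmax(dense(x, net[-1]))


def macro_sensitivity_precision(y_true, y_pred, classes):
    """Unweighted mean over classes of per-class sensitivity and precision."""
    sens, prec = [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        sens.append(tp / (tp + fn) if tp + fn else 0.0)
        prec.append(tp / (tp + fp) if tp + fp else 0.0)
    return sum(sens) / len(sens), sum(prec) / len(prec)


# Hypothetical setup: 8 input features, hidden layers of 6 then 3 units,
# and 4 output classes (for example, control plus three cancer types).
net = build_shallow_net(n_features=8, hidden_sizes=[6, 3], n_classes=4)
probs = predict_proba(net, [0.5] * 8)
```

Here `probs` holds one probability per class for a single sample. In practice the inputs would be expression measurements such as miRNA, mRNA, or protein levels, and the weights would be fitted to labeled data before any metric is reported.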
ISSN: 2666-8270
DOI: 10.1016/j.mlwa.2023.100468