Gaussian Mixture Models for Probabilistic Classification of Breast Cancer

Bibliographic Details
Published in Cancer research (Chicago, Ill.) Vol. 79; no. 13; pp. 3492-3502
Main Authors Prabakaran, Indira, Wu, Zhengdong, Lee, Changgun, Tong, Brian, Steeman, Samantha, Koo, Gabriel, Zhang, Paul J, Guvakova, Marina A
Format Journal Article
Language English
Published United States 01.07.2019
Summary: In the era of omics-driven research, it remains a common dilemma to stratify individual patients based on the molecular characteristics of their tumors. To improve molecular stratification of patients with breast cancer, we developed the Gaussian mixture model (GMM)-based classifier. This probabilistic classifier was built on mRNA expression data from more than 300 clinical samples of breast cancer and healthy tissue and was validated on datasets of ESR1, PGR, and ERBB2, which encode standard clinical markers and therapeutic targets. To demonstrate how a GMM approach could be exploited for multiclass classification using data from a candidate marker, we analyzed the insulin-like growth factor I receptor (IGF1R), a promising target, but a marker of uncertain importance in breast cancer. The GMM defined subclasses with downregulated (40%), unchanged (39%), upregulated (19%), and overexpressed (2%) levels; inter- and intrapatient analyses of transcript and protein levels supported these predictions. Overexpressed IGF1R was observed in a small percentage of tumors. Samples with unchanged and upregulated IGF1R were differentiated tumors, and downregulation of IGF1R correlated with poorly differentiated, high-risk hormone receptor-negative and HER2-positive tumors. A similar correlation was found in the independent cohort of carcinoma in situ, suggesting that loss or low expression of IGF1R is a marker of aggressiveness in subsets of preinvasive and invasive breast cancer. These results demonstrate the importance of probabilistic modeling that delves deeper into molecular data and aims to improve diagnostic classification, prognostic assessment, and treatment selection. SIGNIFICANCE: A GMM classifier demonstrates potential use for clinical validation of markers and determination of target populations, particularly when availability of specimens for marker development is low.
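
The summary describes assigning each sample to an expression subclass by probability rather than by a fixed cutoff. The following is a minimal sketch of that general idea, not the authors' implementation: it fits a one-dimensional Gaussian mixture by expectation-maximisation and reports the posterior probability of each component for a new value. The function names (`fit_gmm_1d`, `posterior`), the synthetic data, the two-component setting, and the labels "lower" and "higher" are all assumptions made for illustration; the number of components `k` is a parameter.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, n_iter=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by expectation-maximisation.

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    data = sorted(values)
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    # Initialise: means at evenly spaced quantiles, equal weights, shared variance.
    means = [data[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor avoids a collapsed component
            weights[j] = nj / n
    return sorted(zip(weights, means, variances), key=lambda c: c[1])


def posterior(x, model):
    """Posterior probability of each mixture component given the value x."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in model]
    total = sum(dens)
    if total == 0:  # x is far from every component; fall back to uniform
        return [1.0 / len(model)] * len(model)
    return [d / total for d in dens]


# Synthetic stand-in for log-scale expression values (hypothetical data).
random.seed(0)
low = [random.gauss(-2.0, 0.5) for _ in range(150)]
high = [random.gauss(2.0, 0.5) for _ in range(150)]
model = fit_gmm_1d(low + high, k=2)

labels = ["lower", "higher"]  # hypothetical names, ordered by component mean
probs = posterior(-1.5, model)
best = labels[probs.index(max(probs))]
```

Because `posterior` returns a probability for every component, a sample near a class boundary can be reported as uncertain instead of being forced into one class.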
ISSN: 0008-5472, 1538-7445
DOI: 10.1158/0008-5472.CAN-19-0573