Machine learning on genome-wide association studies to predict the risk of radiation-associated contralateral breast cancer in the WECARE Study

Bibliographic Details
Published in PloS one Vol. 15; no. 2; p. e0226157
Main Authors Lee, Sangkyu; Liang, Xiaolin; Woods, Meghan; Reiner, Anne S; Concannon, Patrick; Bernstein, Leslie; Lynch, Charles F; Boice, John D; Deasy, Joseph O; Bernstein, Jonine L; Oh, Jung Hun
Format Journal Article
Language English
Published United States: Public Library of Science (PLoS), 27.02.2020
Summary: The purpose of this study was to identify germline single nucleotide polymorphisms (SNPs) that optimally predict radiation-associated contralateral breast cancer (RCBC) and to provide new biological insights into the carcinogenic process. Fifty-two women with contralateral breast cancer and 153 women with unilateral breast cancer were identified within the Women's Environmental Cancer and Radiation Epidemiology (WECARE) Study who were at increased risk of RCBC because they were ≤ 40 years of age at first diagnosis of breast cancer and received a scatter radiation dose > 1 Gy to the contralateral breast. A previously reported algorithm, preconditioned random forest regression, was applied to predict the risk of developing RCBC. The resulting model produced an area under the curve (AUC) of 0.62 (p = 0.04) on hold-out validation data. The biological analysis identified cyclic AMP-mediated signaling and Ephrin-A as significant biological correlates, which were previously shown to influence cell survival after radiation in an ATM-dependent manner. The key connected genes and proteins identified in this analysis were previously reported as relevant to breast cancer, radiation response, or both. In summary, machine learning/bioinformatics methods applied to genome-wide genotyping data have great potential to reveal plausible biological correlates associated with the risk of RCBC.
Bibliography: Competing Interests: P.C. is a stockholder in AMGEN; J.O.D. has research contracts with Varian Medical Systems and Philips and is a shareholder in Paige.AI. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
These authors are co-senior authors on this work.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0226157