Predicting the hepatocarcinogenic potential of alkenylbenzene flavoring agents using toxicogenomics and machine learning

Bibliographic Details
Published in Toxicology and applied pharmacology Vol. 243; no. 3; pp. 300-314
Main Authors Auerbach, Scott S., Shah, Ruchir R., Mav, Deepak, Smith, Cynthia S., Walker, Nigel J., Vallant, Molly K., Boorman, Gary A., Irwin, Richard D.
Format Journal Article
Language English
Published United States Elsevier Inc 15.03.2010
Summary: Identification of carcinogenic activity is the primary goal of the 2-year bioassay. The expense of these studies limits the number of chemicals that can be studied, so chemicals need to be prioritized based on a variety of parameters. We have developed an ensemble of support vector machine classification models based on male F344 rat liver gene expression following 2, 14 or 90 days of exposure to a collection of hepatocarcinogens (aflatoxin B1, 1-amino-2,4-dibromoanthraquinone, N-nitrosodimethylamine, methyleugenol) and non-hepatocarcinogens (acetaminophen, ascorbic acid, tryptophan). Seven models were generated based on individual exposure durations (2, 14 or 90 days) or a combination of exposures (2 + 14, 2 + 90, 14 + 90 and 2 + 14 + 90 days). All but one of the data sets yielded models with 0% cross-validation error. Independent validation of the models was performed using expression data from the liver of rats exposed at 2 dose levels to a collection of alkenylbenzene flavoring agents. Depending on the model used and the exposure duration of the test data, independent validation error rates ranged from 47% to 10%. The variable with the most notable effect on independent validation accuracy was the exposure duration of the alkenylbenzene test data. All models generally exhibited improved performance as the exposure duration of the alkenylbenzene data increased. The models differentiated between hepatocarcinogenic (estragole and safrole) and non-hepatocarcinogenic (anethole, eugenol and isoeugenol) alkenylbenzenes previously studied in a carcinogenicity bioassay. In the case of safrole, the models correctly differentiated between carcinogenic and non-carcinogenic dose levels.
The models predict that two alkenylbenzenes not previously assessed in a carcinogenicity bioassay, myristicin and isosafrole, would be weakly hepatocarcinogenic if studied at a dose level of 2 mmol/kg bw/day for 2 years in male F344 rats, suggesting that these chemicals should have a higher priority than other untested alkenylbenzenes for evaluation in the carcinogenicity bioassay. The results of the study indicate that gene expression-based predictive models are an effective tool for identifying hepatocarcinogens. Furthermore, we find that exposure duration is a critical variable in the success or failure of such an approach, particularly when evaluating chemicals with unknown carcinogenic potency.
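As an illustration only, the seven training sets named in the summary correspond to every non-empty combination of the three exposure durations. The short Python sketch below enumerates them; the names `DURATIONS`, `duration_sets` and `label` are hypothetical and are not taken from the article, and the sketch does not reproduce the authors' modeling pipeline.

```python
from itertools import combinations

# Exposure durations in days, as listed in the summary.
DURATIONS = (2, 14, 90)


def duration_sets(durations=DURATIONS):
    """Return every non-empty combination of durations, smallest sets first."""
    return [
        combo
        for size in range(1, len(durations) + 1)
        for combo in combinations(durations, size)
    ]


def label(combo):
    """Format a combination the way the summary writes it, e.g. '2 + 14'."""
    return " + ".join(str(days) for days in combo)


if __name__ == "__main__":
    for combo in duration_sets():
        print(label(combo), "days")
```

Three durations give 2^3 - 1 = 7 non-empty subsets, which matches the seven models the summary reports.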
ISSN: 0041-008X, 1096-0333
DOI: 10.1016/j.taap.2009.11.021