Binary Classification of a Large Collection of Environmental Chemicals from Estrogen Receptor Assays by Quantitative Structure–Activity Relationship and Machine Learning Methods
Published in | Journal of chemical information and modeling Vol. 53; no. 12; pp. 3244 - 3261 |
---|---|
Main Authors | Zang, Qingda; Rotroff, Daniel M; Judson, Richard S |
Format | Journal Article |
Language | English |
Published | Washington, DC: American Chemical Society, 23.12.2013 |
Abstract | There are thousands of environmental chemicals subject to regulatory decisions for endocrine disrupting potential. The ToxCast and Tox21 programs have tested ∼8200 chemicals in a broad screening panel of in vitro high-throughput screening (HTS) assays for estrogen receptor (ER) agonist and antagonist activity. The present work uses this large data set to develop in silico quantitative structure–activity relationship (QSAR) models using machine learning (ML) methods and a novel approach to manage the imbalanced data distribution. Training compounds from the ToxCast project were categorized as active or inactive (binding or nonbinding) classes based on a composite ER Interaction Score derived from a collection of 13 ER in vitro assays. A total of 1537 chemicals from ToxCast were used to derive and optimize the binary classification models while 5073 additional chemicals from the Tox21 project, evaluated in 2 of the 13 in vitro assays, were used to externally validate the model performance. In order to handle the imbalanced distribution of active and inactive chemicals, we developed a cluster-selection strategy to minimize information loss and increase predictive performance and compared this strategy to three currently popular techniques: cost-sensitive learning, oversampling of the minority class, and undersampling of the majority class. QSAR classification models were built to relate the molecular structures of chemicals to their ER activities using linear discriminant analysis (LDA), classification and regression trees (CART), and support vector machines (SVM) with 51 molecular descriptors from QikProp and 4328 bits of structural fingerprints as explanatory variables. A random forest (RF) feature selection method was employed to extract the structural features most relevant to the ER activity. 
The best model was obtained using SVM in combination with a subset of descriptors identified from a large set via the RF algorithm; it classified the active and inactive compounds with accuracies of 76.1% and 82.8%, respectively, for a total accuracy of 81.6% on the internal test set and 70.8% on the external test set. These results demonstrate that combining high-quality experimental data with ML methods can lead to robust models that achieve excellent predictive accuracy and are potentially useful for facilitating the virtual screening of chemicals for environmental risk assessment. |
---|---|
Author | Judson, Richard S; Rotroff, Daniel M; Zang, Qingda |
AuthorAffiliation | ORISE Postdoctoral Fellow; North Carolina State University; U.S. Environmental Protection Agency; National Center for Computational Toxicology; Bioinformatics Research Center, Department of Statistics |
AuthorAffiliation_xml | – name: National Center for Computational Toxicology – name: U.S. Environmental Protection Agency – name: North Carolina State University – name: ORISE Postdoctoral Fellow – name: Bioinformatics Research Center, Department of Statistics |
Author_xml | – sequence: 1 givenname: Qingda surname: Zang fullname: Zang, Qingda – sequence: 2 givenname: Daniel M surname: Rotroff fullname: Rotroff, Daniel M – sequence: 3 givenname: Richard S surname: Judson fullname: Judson, Richard S email: Judson.Richard@epa.gov |
BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=28083533$$DView record in Pascal Francis https://www.ncbi.nlm.nih.gov/pubmed/24279462$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | Copyright © 2013 American Chemical Society; 2015 INIST-CNRS; Copyright American Chemical Society Dec 23, 2013 |
Copyright_xml | – notice: Copyright © 2013 American Chemical Society – notice: 2015 INIST-CNRS – notice: Copyright American Chemical Society Dec 23, 2013 |
DOI | 10.1021/ci400527b |
DatabaseName | CrossRef Pascal-Francis Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed Computer and Information Systems Abstracts Engineered Materials Abstracts Solid State and Superconductivity Abstracts METADEX Technology Research Database Materials Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional MEDLINE - Academic |
DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) Materials Research Database Engineered Materials Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic ProQuest Computer Science Collection Computer and Information Systems Abstracts Solid State and Superconductivity Abstracts Advanced Technologies Database with Aerospace METADEX Computer and Information Systems Abstracts Professional MEDLINE - Academic |
DatabaseTitleList | MEDLINE - Academic MEDLINE Materials Research Database |
Database_xml | – sequence: 1 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Chemistry Applied Sciences |
EISSN | 1549-960X |
EndPage | 3261 |
ExternalDocumentID | 3174590581 24279462 28083533 10_1021_ci400527b d175326626 |
Genre | Research Support, Non-U.S. Gov't Journal Article Feature |
IEDL.DBID | ACS |
ISSN | 1549-9596 1549-960X |
IngestDate | Fri Jul 11 01:10:26 EDT 2025 Mon Jun 30 10:51:22 EDT 2025 Thu Jan 02 22:13:17 EST 2025 Wed Apr 02 07:27:34 EDT 2025 Tue Jul 01 03:25:33 EDT 2025 Thu Apr 24 23:08:57 EDT 2025 Thu Aug 27 13:42:28 EDT 2020 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 12 |
Keywords | High throughput screening; Estrogen receptor; Agonist; Very large databases; Modeling; Information loss; Computational chemistry; Structure activity relation; Information quality; Vector support machine; Program proof; Antagonist; Skewed distribution; Pattern extraction; Binary classification; Discriminant analysis; Small signal; Chemical structure; Virtual screening; Cluster; Oversampling; Medical screening; Experimental study; Random decision forests; In vitro; Classification and Regression Tree; Subsampling; Scene analysis; Program test; Property structure relationship; Artificial intelligence |
Language | English |
License | CC BY 4.0 |
LinkModel | DirectLink |
PMID | 24279462 |
PQID | 1474887085 |
PQPubID | 28739 |
PageCount | 18 |
PublicationCentury | 2000 |
PublicationDate | 2013-12-23 |
PublicationDateYYYYMMDD | 2013-12-23 |
PublicationDate_xml | – month: 12 year: 2013 text: 2013-12-23 day: 23 |
PublicationDecade | 2010 |
PublicationPlace | Washington, DC |
PublicationPlace_xml | – name: Washington, DC – name: United States – name: Washington |
PublicationTitle | Journal of chemical information and modeling |
PublicationTitleAlternate | J. Chem. Inf. Model |
PublicationYear | 2013 |
Publisher | American Chemical Society |
Publisher_xml | – name: American Chemical Society |
StartPage | 3244 |
SubjectTerms | Algorithms · Applied sciences · Artificial Intelligence · Chemical compounds · Chemicals · Chemistry · Computer science; control theory; systems · Computer systems performance. Reliability · Data processing. List processing. Character string processing · Discriminant Analysis · Endocrine Disruptors - classification · Endocrine Disruptors - pharmacology · Environmental Monitoring · Estrogens · Exact sciences and technology · General and physical chemistry · General. Nomenclature, chemical documentation, computer chemistry · High-Throughput Screening Assays · Humans · Learning and adaptive systems · Memory organisation. Data processing · Molecular structure · Quantitative Structure-Activity Relationship · Receptors, Estrogen - agonists · Receptors, Estrogen - antagonists & inhibitors · Receptors, Estrogen - metabolism · Risk Assessment · Software · Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry · Water Pollutants, Chemical - classification · Water Pollutants, Chemical - pharmacology |
Title | Binary Classification of a Large Collection of Environmental Chemicals from Estrogen Receptor Assays by Quantitative Structure–Activity Relationship and Machine Learning Methods |
URI | http://dx.doi.org/10.1021/ci400527b https://www.ncbi.nlm.nih.gov/pubmed/24279462 https://www.proquest.com/docview/1474887085 https://www.proquest.com/docview/1490733871 |
Volume | 53 |