Binary Classification of a Large Collection of Environmental Chemicals from Estrogen Receptor Assays by Quantitative Structure–Activity Relationship and Machine Learning Methods

Bibliographic Details
Published in Journal of Chemical Information and Modeling, Vol. 53, no. 12, pp. 3244-3261
Main Authors Zang, Qingda; Rotroff, Daniel M; Judson, Richard S
Format Journal Article
Language English
Published Washington, DC: American Chemical Society, 23 December 2013
Abstract There are thousands of environmental chemicals subject to regulatory decisions for endocrine disrupting potential. The ToxCast and Tox21 programs have tested ∼8200 chemicals in a broad screening panel of in vitro high-throughput screening (HTS) assays for estrogen receptor (ER) agonist and antagonist activity. The present work uses this large data set to develop in silico quantitative structure–activity relationship (QSAR) models using machine learning (ML) methods and a novel approach to manage the imbalanced data distribution. Training compounds from the ToxCast project were categorized as active or inactive (binding or nonbinding) classes based on a composite ER Interaction Score derived from a collection of 13 ER in vitro assays. A total of 1537 chemicals from ToxCast were used to derive and optimize the binary classification models while 5073 additional chemicals from the Tox21 project, evaluated in 2 of the 13 in vitro assays, were used to externally validate the model performance. In order to handle the imbalanced distribution of active and inactive chemicals, we developed a cluster-selection strategy to minimize information loss and increase predictive performance and compared this strategy to three currently popular techniques: cost-sensitive learning, oversampling of the minority class, and undersampling of the majority class. QSAR classification models were built to relate the molecular structures of chemicals to their ER activities using linear discriminant analysis (LDA), classification and regression trees (CART), and support vector machines (SVM) with 51 molecular descriptors from QikProp and 4328 bits of structural fingerprints as explanatory variables. A random forest (RF) feature selection method was employed to extract the structural features most relevant to the ER activity. 
The best model was obtained using SVM in combination with a subset of descriptors identified from a large set via the RF algorithm. It classified the active and inactive compounds with accuracies of 76.1% and 82.8%, respectively, for a total accuracy of 81.6% on the internal test set and 70.8% on the external test set. These results demonstrate that a combination of high-quality experimental data and ML methods can lead to robust models that achieve excellent predictive accuracy and are potentially useful for facilitating the virtual screening of chemicals for environmental risk assessment.
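The three baseline techniques for imbalanced data that the abstract names (undersampling of the majority class, oversampling of the minority class, and cost-sensitive learning) can be illustrated with a minimal sketch. This is not the authors' cluster-selection strategy or their implementation, which the abstract does not detail. The function names, the toy labels, and the inverse-frequency weighting formula are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def _indices_by_class(labels):
    """Group sample indices by class label."""
    groups = {}
    for i, y in enumerate(labels):
        groups.setdefault(y, []).append(i)
    return groups


def undersample_majority(labels, seed=0):
    """Randomly drop samples so every class matches the smallest class size."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    n_min = min(len(idx) for idx in groups.values())
    kept = []
    for idx in groups.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


def oversample_minority(labels, seed=0):
    """Randomly repeat samples so every class matches the largest class size."""
    rng = random.Random(seed)
    groups = _indices_by_class(labels)
    n_max = max(len(idx) for idx in groups.values())
    out = []
    for idx in groups.values():
        out.extend(idx)  # keep every original sample
        out.extend(rng.choices(idx, k=n_max - len(idx)))  # add random repeats
    return sorted(out)


def cost_sensitive_weights(labels):
    """Inverse-frequency class weights (an assumed, commonly used heuristic):
    rarer classes receive a larger misclassification cost."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Hypothetical toy labels: 1 = active (minority), 0 = inactive (majority)
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
under_idx = undersample_majority(toy_labels)
over_idx = oversample_minority(toy_labels)
weights = cost_sensitive_weights(toy_labels)
```

Undersampling discards majority-class compounds, which is the information loss the abstract says the cluster-selection strategy aims to minimize. Oversampling and class weighting keep all compounds but change how strongly each class influences the classifier.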
AuthorAffiliation ORISE Postdoctoral Fellow
North Carolina State University
U.S. Environmental Protection Agency
National Center for Computational Toxicology
Bioinformatics Research Center, Department of Statistics
Author email Judson, Richard S: Judson.Richard@epa.gov
ContentType Journal Article
Copyright Copyright © 2013 American Chemical Society
2015 INIST-CNRS
Copyright American Chemical Society Dec 23, 2013
DOI 10.1021/ci400527b
Discipline Chemistry
Applied Sciences
EISSN 1549-960X
Genre Research Support, Non-U.S. Gov't
Journal Article
Feature
ISSN 1549-9596
1549-960X
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords High throughput screening
Estrogen receptor
Agonist
Very large databases
Modeling
Information loss
Computational chemistry
Structure activity relation
Information quality
Vector support machine
Program proof
Antagonist
Skewed distribution
Pattern extraction
Binary classification
Discriminant analysis
Small signal
Chemical structure
Virtual screening
Cluster
Oversampling
Medical screening
Experimental study
Random decision forests
In vitro
Classification and Regression Tree
Subsampling
Scene analysis
Program test
Property structure relationship
Artificial intelligence
License CC BY 4.0
PMID 24279462
PageCount 18
PublicationTitle Journal of chemical information and modeling
PublicationTitleAlternate J. Chem. Inf. Model
StartPage 3244
SubjectTerms Algorithms
Applied sciences
Artificial Intelligence
Chemical compounds
Chemicals
Chemistry
Computer science; control theory; systems
Computer systems performance. Reliability
Data processing. List processing. Character string processing
Discriminant Analysis
Endocrine Disruptors - classification
Endocrine Disruptors - pharmacology
Environmental Monitoring
Estrogens
Exact sciences and technology
General and physical chemistry
General. Nomenclature, chemical documentation, computer chemistry
High-Throughput Screening Assays
Humans
Learning and adaptive systems
Memory organisation. Data processing
Molecular structure
Quantitative Structure-Activity Relationship
Receptors, Estrogen - agonists
Receptors, Estrogen - antagonists & inhibitors
Receptors, Estrogen - metabolism
Risk Assessment
Software
Theory of reactions, general kinetics. Catalysis. Nomenclature, chemical documentation, computer chemistry
Water Pollutants, Chemical - classification
Water Pollutants, Chemical - pharmacology
Title Binary Classification of a Large Collection of Environmental Chemicals from Estrogen Receptor Assays by Quantitative Structure–Activity Relationship and Machine Learning Methods
URI http://dx.doi.org/10.1021/ci400527b
https://www.ncbi.nlm.nih.gov/pubmed/24279462
https://www.proquest.com/docview/1474887085
https://www.proquest.com/docview/1490733871
Volume 53