In silico toxicology: comprehensive benchmarking of multi‐label classification methods applied to chemical toxicity data

Bibliographic Details
Published in Wiley interdisciplinary reviews. Computational molecular science Vol. 8; no. 3; p. e1352
Main Authors Raies, Arwa B., Bajic, Vladimir B.
Format Journal Article
Language English
Published Hoboken, USA Wiley Periodicals, Inc 01.05.2018
ISSN 1759-0876
EISSN 1759-0884
DOI 10.1002/wcms.1352

Abstract One goal of toxicity testing, among others, is to identify the harmful effects of chemicals. Given the high demand for toxicity tests, it is necessary to conduct these tests for multiple toxicity endpoints for the same compound. Current computational toxicology methods mainly aim to develop models that predict a single toxicity endpoint. When chemicals cause several toxic effects, one model is generated for each endpoint, which can be labor-intensive and computationally expensive when the number of toxicity endpoints is large. Additionally, this approach does not take into account possible correlations between the endpoints. Therefore, there has been a recent shift in computational toxicity studies toward predictive models that can predict several toxicity endpoints by utilizing the correlations between them. Applying such correlations jointly with the compounds' features may improve model performance and reduce the number of required models. This can be achieved through multi-label classification methods. These methods have not undergone comprehensive benchmarking in the domain of predictive toxicology. Therefore, we performed extensive benchmarking and analysis of over 19,000 multi-label classification models generated using combinations of state-of-the-art methods. The methods were evaluated from different perspectives using various metrics to assess their effectiveness. We were able to illustrate variability in the performance of the methods under several conditions. This review will help researchers select the most suitable method for the problem at hand and provides a baseline for evaluating new approaches. Based on this analysis, we provide recommendations for potential future directions in this area.
This article is categorized under: Computer and Information Science > Chemoinformatics; Computer and Information Science > Computer Algorithms and Programming.
Comprehensive assessment of multi-label classification methods applied to compounds that may cause several toxicity effects.
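
The abstract says the methods were evaluated "using various metrics" but does not name them. As a minimal sketch, assuming two commonly used multi-label metrics (Hamming loss and subset accuracy) and a hypothetical toy label matrix that is not taken from the article, the following Python shows how predictions over several toxicity endpoints might be scored jointly rather than one endpoint at a time:

```python
from typing import Sequence

# Each row is one compound; each column is one toxicity endpoint (1 = toxic).
LabelMatrix = Sequence[Sequence[int]]


def hamming_loss(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of individual (compound, endpoint) labels predicted wrongly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def subset_accuracy(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of compounds whose full set of endpoint labels is predicted exactly."""
    exact = sum(
        list(true_row) == list(pred_row)
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return exact / len(y_true)


# Hypothetical data: 4 compounds x 3 endpoints (illustrative only).
y_true = [[1, 0, 1], [0, 0, 0], [1, 1, 0], [0, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

print(f"Hamming loss:    {hamming_loss(y_true, y_pred):.3f}")     # 0.167
print(f"Subset accuracy: {subset_accuracy(y_true, y_pred):.3f}")  # 0.500
```

The two numbers differ because Hamming loss gives credit for every correctly predicted endpoint, while subset accuracy counts a compound as correct only when all of its endpoints are right. This is one reason a benchmark may report several metrics side by side.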
Author Raies, Arwa B. (ORCID 0000-0003-3952-7363), King Abdullah University of Science and Technology (KAUST)
Author Bajic, Vladimir B. (ORCID 0000-0001-5435-4750), King Abdullah University of Science and Technology (KAUST), vladimir.bajic@kaust.edu.sa
ContentType Journal Article
Copyright 2017 The Authors. published by Wiley Periodicals, Inc.
Discipline Chemistry
EISSN 1759-0884
EndPage n/a
ExternalDocumentID 29780432
WCMS1352
Genre reviewArticle
Journal Article
Review
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License Attribution-NonCommercial
OpenAccessLink https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fwcms.1352
PMID 29780432
PageCount 26
PublicationCentury 2000
PublicationDate May/June 2018
PublicationDateYYYYMMDD 2018-05-01
PublicationDate_xml – month: 05
  year: 2018
  text: May/June 2018
PublicationDecade 2010
PublicationPlace Hoboken, USA
PublicationPlace_xml – name: Hoboken, USA
– name: United States
PublicationTitle Wiley interdisciplinary reviews. Computational molecular science
PublicationTitleAlternate Wiley Interdiscip Rev Comput Mol Sci
PublicationYear 2018
Publisher Wiley Periodicals, Inc
Publisher_xml – name: Wiley Periodicals, Inc
StartPage e1352
Title In silico toxicology: comprehensive benchmarking of multi‐label classification methods applied to chemical toxicity data
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fwcms.1352
https://www.ncbi.nlm.nih.gov/pubmed/29780432
Volume 8