Machine learning prediction of nanoparticle in vitro toxicity: A comparative study of classifiers and ensemble-classifiers using the Copeland Index

Bibliographic Details
Published in	Toxicology letters Vol. 312; pp. 157 - 166
Main Authors	Furxhi, Irini; Murphy, Finbarr; Mullins, Martin; Poland, Craig A.
Format	Journal Article
Language	English
Published	Netherlands: Elsevier B.V., 15.09.2019
Subjects	Copeland Index; Machine learning; Nanoparticles; Nanotoxicity; Voting; NN; EXT; BD; NP; LR; GLM; SVM; BN; DP; INT; DT; ID; SPEC; LWL; ACC; kNN; BIR; F1; DIR; SENS; IBk; RF; SMO; LIR; NIR; REL; SIR
Summary:	• Random Forest (RF) and Neural Network (NN) perform best among the base classifiers.
• Ensemble classifiers are more robust than base classifiers in predicting the toxicity of NPs from their properties and in vitro experimental conditions.
• RF and NN combined with another base classifier do not give the best performance. Combining lower-ranked classifiers can help to catch outliers.
• A Copeland Index based on datasets, validation processes and performance metrics can be used to rank base and ensemble classifiers.
• RF, Bayesian Network (BN) and ensemble classifiers perform well with missing values, whereas NN does not.

Nanoparticles (NPs) are well established as important components across a broad range of products, from cosmetics to electronics. Their use is increasing, and their significant economic and societal potential is yet to be fully realized. Inroads have been made in our understanding of the risks that NPs pose to human health and the environment, but this area will require continuous research and monitoring. In recent years, Machine Learning (ML) techniques have exploited large datasets and computational power to create breakthroughs in fields as diverse as facial recognition and genomics. More recently, ML techniques have been applied to nanotoxicology with very encouraging results. In this study, categories of ML classifiers (rules, trees, lazy, functions and bayes) were compared using datasets from the Safe and Sustainable Nanotechnology (S2NANO) database to investigate their performance in predicting the in vitro toxicity of NPs. Physicochemical properties, toxicological and quantum-mechanical attributes, and in vitro experimental conditions were used as input variables to predict the toxicity of NPs based on cell viability. Voting, an ensemble meta-classifier, was used to combine base models to optimize the classification prediction of toxicity.
To facilitate inter-comparison, a Copeland Index was applied that ranks the classifiers according to their performance and suggests the optimal classifier. Neural Network (NN) and Random Forest (RF) showed the best performance in the majority of the datasets used in this study. However, combining classifiers improved prediction, with the resulting meta-classifiers achieving higher indices. The proposed Copeland Index can now be used by researchers to identify and clearly prioritize classifiers in order to achieve more accurate classification predictions of NP toxicity for a given dataset.
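The ranking idea in the summary can be illustrated with a minimal sketch of a generic Copeland-style score. It assumes the common pairwise definition (a classifier gains a point for each rival it beats on a majority of criteria and loses a point for each rival that beats it); the record does not give the authors' exact formulation, so this is not necessarily their method. The function name `copeland_scores`, the criteria keys and all numbers below are hypothetical and are not results from the study.

```python
from itertools import combinations


def copeland_scores(results):
    """Generic Copeland-style score (an assumed definition, not the paper's).

    results maps each criterion (e.g. a dataset/metric pair) to a dict of
    {classifier name: score}, where a higher score is better.
    """
    classifiers = sorted(next(iter(results.values())))
    scores = {name: 0 for name in classifiers}
    for a, b in combinations(classifiers, 2):
        # Count the criteria on which each classifier beats the other.
        a_wins = sum(1 for r in results.values() if r[a] > r[b])
        b_wins = sum(1 for r in results.values() if r[b] > r[a])
        if a_wins > b_wins:      # a wins the pairwise contest
            scores[a] += 1
            scores[b] -= 1
        elif b_wins > a_wins:    # b wins the pairwise contest
            scores[b] += 1
            scores[a] -= 1
        # A tie leaves both scores unchanged.
    return scores


# Hypothetical, made-up performance values for illustration only.
results = {
    ("dataset_1", "accuracy"): {"RF": 0.90, "NN": 0.88, "kNN": 0.80},
    ("dataset_1", "F1"): {"RF": 0.85, "NN": 0.87, "kNN": 0.78},
    ("dataset_2", "accuracy"): {"RF": 0.92, "NN": 0.89, "kNN": 0.83},
}

scores = copeland_scores(results)
# Sort classifiers from highest to lowest Copeland score.
ranking = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)
```

With these made-up values, RF beats both rivals on a majority of criteria and therefore ranks first. An ensemble built with Voting could be scored the same way by adding it as one more entry in each criterion's dictionary.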
ISSN:	0378-4274 1879-3169
DOI:	10.1016/j.toxlet.2019.05.016