New opportunities for materials informatics: Resources and data mining techniques for uncovering hidden relationships

Bibliographic Details
Published in Journal of materials research Vol. 31; no. 8; pp. 977-994
Main Authors Jain, Anubhav, Hautier, Geoffroy, Ong, Shyue Ping, Persson, Kristin
Format Journal Article
Language English
Published New York, USA Cambridge University Press 28.04.2016
Springer International Publishing
Springer Nature B.V
ISSN 0884-2914
2044-5326
DOI 10.1557/jmr.2016.80

Summary: Data mining has revolutionized sectors as diverse as pharmaceutical drug discovery, finance, medicine, and marketing, and has the potential to similarly advance materials science. In this paper, we describe advances in simulation-based materials databases, open-source software tools, and machine learning algorithms that are converging to create new opportunities for materials informatics. We discuss the data mining techniques of exploratory data analysis, clustering, linear models, kernel ridge regression, tree-based regression, and recommendation engines. We present these techniques in the context of several materials application areas, including compound prediction, Li-ion battery design, piezoelectric materials, photocatalysts, and thermoelectric materials. Finally, we demonstrate how new data and tools are making it easier and more accessible than ever to perform data mining through a new analysis that learns trends in the valence and conduction band character of compounds in the Materials Project database using data on over 2500 compounds.