Opportunities and challenges of text mining in materials research

Bibliographic Details
Published in iScience Vol. 24; no. 3; p. 102155
Main Authors Kononova, Olga; He, Tanjin; Huo, Haoyan; Trewartha, Amalie; Olivetti, Elsa A; Ceder, Gerbrand
Format Journal Article
Language English
Published United States Elsevier 19.03.2021
Summary: Research publications are the major repository of scientific knowledge. However, their unstructured and highly heterogeneous format creates a significant obstacle to large-scale analysis of the information contained within. Recent progress in natural language processing (NLP) has provided a variety of tools for high-quality information extraction from unstructured text. These tools are primarily trained on non-technical text and struggle to produce accurate results when applied to scientific text, which involves specific technical terminology. In recent years, significant efforts in information retrieval have been made for biomedical and biochemical publications. For materials science, text mining (TM) methodology is still at the dawn of its development. In this review, we survey the recent progress in creating and applying TM and NLP approaches to the materials science field. This review is directed at the broad class of researchers aiming to learn the fundamentals of TM as applied to materials science publications.
ISSN: 2589-0042
DOI: 10.1016/j.isci.2021.102155