Comprehensive analysis of embeddings and pre-training in NLP

Bibliographic Details
Published in: Computer Science Review, Vol. 42, p. 100433
Main Authors: Tripathy, Jatin Karthik; Sethuraman, Sibi Chakkaravarthy; Cruz, Meenalosini Vimal; Namburu, Anupama; P., Mangalraj; R., Nandha Kumar; S., Sudhakar Ilango; Vijayakumar, Vaidehi
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.11.2021

Summary: The amount of data and computing power available has increased drastically over the last decade, which has led to the development of several new fronts in the field of Natural Language Processing (NLP). In addition, the interplay of embeddings and large pre-trained models has pushed the field forward, covering a wide variety of tasks ranging from machine translation to more complex ones such as contextual text classification. This paper covers the underlying ideas behind embeddings and pre-trained models and provides insight into the fundamental strategies and implementation details of innovative embeddings. Further, it discusses the pros and cons of each specific embedding design and its impact on results. It also compares the different strategies, datasets, and architectures discussed across papers using standard NLP metrics. This review aims to shed light on the different milestones reached in NLP, allowing readers to deepen their understanding of the field and motivating further exploration.
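
As a minimal, illustrative sketch of the topic the abstract describes (not drawn from the paper itself), the Python snippet below shows how contextual token embeddings can be obtained from a pre-trained model via the Hugging Face transformers library; the checkpoint name "bert-base-uncased" is an assumed example, and any encoder-style pre-trained model would work similarly.

import torch
from transformers import AutoTokenizer, AutoModel

# Assumed example checkpoint, used here only for illustration.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentence = "Embeddings map tokens to dense vectors."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token: shape (1, num_tokens, hidden_size).
token_embeddings = outputs.last_hidden_state
print(token_embeddings.shape)  # e.g. torch.Size([1, 10, 768])

Unlike static word vectors (e.g., word2vec or GloVe), the vectors produced this way depend on the surrounding sentence, which is what enables the contextual tasks the review discusses.
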
ISSN: 1574-0137, 1876-7745
DOI: 10.1016/j.cosrev.2021.100433