Comprehensive analysis of embeddings and pre-training in NLP

Bibliographic Details
Published in Computer science review Vol. 42; p. 100433
Main Authors Tripathy, Jatin Karthik; Sethuraman, Sibi Chakkaravarthy; Cruz, Meenalosini Vimal; Namburu, Anupama; P., Mangalraj; R., Nandha Kumar; S, Sudhakar Ilango; Vijayakumar, Vaidehi
Format Journal Article
Language English
Published Elsevier Inc 01.11.2021
Abstract The amount of data and computing power has increased drastically over the last decade, which has led to the development of several new fronts in the field of Natural Language Processing (NLP). In addition, the combination of embeddings and large pre-trained models has pushed the field forward, covering a wide variety of tasks ranging from machine translation to more complex tasks such as contextual text classification. This paper covers the underlying idea behind embeddings and pre-trained models and provides insight into the fundamental strategies and implementation details of innovative embeddings. Further, it discusses the pros and cons of each specific embedding design and the associated impact on the results. It also compares the different strategies, datasets, and architectures discussed in different papers, using standard metrics from NLP. The content covered in this review aims to shed light on the milestones reached in NLP, allowing readers to deepen their understanding of NLP and motivating them to explore the field further.
ArticleNumber 100433
Author Namburu, Anupama
Vijayakumar, Vaidehi
R., Nandha Kumar
Tripathy, Jatin Karthik
S, Sudhakar Ilango
P., Mangalraj
Sethuraman, Sibi Chakkaravarthy
Cruz, Meenalosini Vimal
Author_xml – sequence: 1
  givenname: Jatin Karthik
  surname: Tripathy
  fullname: Tripathy, Jatin Karthik
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 2
  givenname: Sibi Chakkaravarthy
  surname: Sethuraman
  fullname: Sethuraman, Sibi Chakkaravarthy
  email: sb.sibi@gmail.com
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 3
  givenname: Meenalosini Vimal
  orcidid: 0000-0003-3164-4848
  surname: Cruz
  fullname: Cruz, Meenalosini Vimal
  organization: Department of Information Technology, Georgia Southern University, GA, USA
– sequence: 4
  givenname: Anupama
  surname: Namburu
  fullname: Namburu, Anupama
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 5
  givenname: Mangalraj
  surname: P.
  fullname: P., Mangalraj
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 6
  givenname: Nandha Kumar
  surname: R.
  fullname: R., Nandha Kumar
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 7
  givenname: Sudhakar Ilango
  surname: S
  fullname: S, Sudhakar Ilango
  organization: School of Computer Science and Engineering, VIT-AP University, Andhra Pradesh, India
– sequence: 8
  givenname: Vaidehi
  orcidid: 0000-0002-9524-5291
  surname: Vijayakumar
  fullname: Vijayakumar, Vaidehi
  organization: Mother Teresa Women’s University, Kodaikanal, Tamilnadu, India
CitedBy_id crossref_primary_10_3390_technologies11050123
crossref_primary_10_48168_innosoft_s11_a88
crossref_primary_10_1016_j_eswa_2023_120439
crossref_primary_10_1016_j_techfore_2022_122306
crossref_primary_10_32604_csse_2023_036419
crossref_primary_10_3390_app122110765
crossref_primary_10_32604_iasc_2023_027848
Cites_doi 10.3115/1073083.1073135
10.1109/ICCV.2015.11
10.1145/3331184.3331341
10.1016/j.neunet.2005.06.042
10.1145/3340531.3411908
10.3115/1289189.1289272
10.1145/1150402.1150464
10.1109/MSP.2012.2205597
10.3115/v1/D14-1162
10.3115/v1/P14-1023
10.1109/TKDE.2009.191
10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
10.1038/s41586-019-1923-7
10.1109/TASL.2011.2134090
10.1109/CVPR.2016.90
10.1109/5.726791
10.1162/tacl_a_00179
10.1109/72.279181
ContentType Journal Article
Copyright 2021 Elsevier Inc.
Copyright_xml – notice: 2021 Elsevier Inc.
DBID AAYXX
CITATION
DOI 10.1016/j.cosrev.2021.100433
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1876-7745
ExternalDocumentID 10_1016_j_cosrev_2021_100433
S1574013721000733
IEDL.DBID AIKHN
ISSN 1574-0137
IngestDate Thu Sep 26 16:27:37 EDT 2024
Fri Feb 23 02:42:47 EST 2024
IsPeerReviewed false
IsScholarly true
Keywords Attention mechanism
Embedding
NLP
Pre-training model
Natural Language Processing
Language English
LinkModel DirectLink
ORCID 0000-0003-3164-4848
0000-0002-9524-5291
ParticipantIDs crossref_primary_10_1016_j_cosrev_2021_100433
elsevier_sciencedirect_doi_10_1016_j_cosrev_2021_100433
PublicationCentury 2000
PublicationDate November 2021
2021-11-00
PublicationDateYYYYMMDD 2021-11-01
PublicationDate_xml – month: 11
  year: 2021
  text: November 2021
PublicationDecade 2020
PublicationTitle Computer science review
PublicationYear 2021
Publisher Elsevier Inc
Publisher_xml – name: Elsevier Inc
References_xml – volume: 20
  start-page: 30
  year: 2011
  end-page: 42
  ident: b2
  article-title: Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
  publication-title: IEEE Trans. Audio Speech Lang. Process.
  contributor:
    fullname: Acero
– year: 2021
  ident: b63
  article-title: DeBERTa: Decoding-enhanced BERT with disentangled attention
  contributor:
    fullname: Chen
– year: 2019
  ident: b43
  article-title: Text summarization with pretrained encoders
  contributor:
    fullname: Lapata
– start-page: 5998
  year: 2017
  end-page: 6008
  ident: b16
  article-title: Attention is all you need
  publication-title: Advances in Neural Information Processing Systems
  contributor:
    fullname: Polosukhin
– year: 2018
  ident: b8
  article-title: A neural interlingua for multilingual machine translation
  contributor:
    fullname: Sun
– year: 2018
  ident: b7
  article-title: Statistical analysis on E-commerce reviews, with sentiment classification using bidirectional recurrent neural network (RNN)
  contributor:
    fullname: Agarap
– volume: 30
  start-page: 415
  year: 1953
  end-page: 433
  ident: b49
  article-title: “Cloze procedure”: A new tool for measuring readability
  publication-title: J. Q.
  contributor:
    fullname: Taylor
– volume: 32
  year: 2019
  ident: b64
  article-title: Xlnet: Generalized autoregressive pretraining for language understanding
  publication-title: Adv. Neural Inf. Process. Syst.
  contributor:
    fullname: Le
– volume: 2
  start-page: 231
  year: 2014
  end-page: 244
  ident: b29
  article-title: Entity linking meets word sense disambiguation: A unified approach
  publication-title: Trans. Assoc. Comput. Linguist.
  contributor:
    fullname: Navigli
– year: 2018
  ident: b48
  article-title: Bert: Pre-training of deep bidirectional transformers for language understanding
  contributor:
    fullname: Toutanova
– year: 2020
  ident: b46
  article-title: Language models are few-shot learners
  contributor:
    fullname: Askell
– year: 2018
  ident: b33
  article-title: Improving language understanding by generative pre-training
  contributor:
    fullname: Sutskever
– year: 2018
  ident: b81
  article-title: Generating music using an LSTM network
  contributor:
    fullname: Young
– year: 2021
  ident: b62
  article-title: Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity
  contributor:
    fullname: Shazeer
– year: 2016
  ident: b50
  article-title: Google’s neural machine translation system: Bridging the gap between human and machine translation
  contributor:
    fullname: Macherey
– year: 2016
  ident: b18
  article-title: Layer normalization
  contributor:
    fullname: Hinton
– volume: 91
  year: 1991
  ident: b31
  article-title: Untersuchungen zu dynamischen neuronalen netzen
  publication-title: Diploma Tech. Univ. München
  contributor:
    fullname: Hochreiter
– year: 2021
  ident: b67
  article-title: Fnet: Mixing tokens with Fourier transforms
  contributor:
    fullname: Ontanon
– volume: 41
  start-page: 391
  year: 1990
  end-page: 407
  ident: b25
  article-title: Indexing by latent semantic analysis
  publication-title: J. Am. Soc. Inf. Sci.
  contributor:
    fullname: Harshman
– year: 2020
  ident: b47
  article-title: Scaling laws for neural language models
  contributor:
    fullname: Amodei
– volume: 29
  start-page: 82
  year: 2012
  end-page: 97
  ident: b1
  article-title: Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
  publication-title: IEEE Signal Process. Mag.
  contributor:
    fullname: Sainath
– volume: 22
  start-page: 1345
  year: 2009
  end-page: 1359
  ident: b19
  article-title: A survey on transfer learning
  publication-title: IEEE Trans. Knowl. Data Eng.
  contributor:
    fullname: Yang
– year: 2019
  ident: b56
  article-title: Distilbert, a distilled version of BERT: smaller, faster, cheaper and lighter
  contributor:
    fullname: Wolf
– year: 2019
  ident: b42
  article-title: Fine-tune BERT for extractive summarization
  contributor:
    fullname: Liu
– year: 2019
  ident: b55
  article-title: Sentence-bert: Sentence embeddings using siamese bert-networks
  contributor:
    fullname: Gurevych
– year: 2019
  ident: b51
  article-title: Roberta: A robustly optimized bert pretraining approach
  contributor:
    fullname: Stoyanov
– year: 2015
  ident: b58
  article-title: Distilling the knowledge in a neural network
  contributor:
    fullname: Dean
– start-page: 4774
  year: 2018
  end-page: 4778
  ident: b78
  article-title: State-of-the-art speech recognition with sequence-to-sequence models
  publication-title: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing
  contributor:
    fullname: Gonina
– year: 2015
  ident: b15
  article-title: Effective approaches to attention-based neural machine translation
  contributor:
    fullname: Manning
– year: 2018
  ident: b73
  article-title: Glue: A multi-task benchmark and analysis platform for natural language understanding
  contributor:
    fullname: Bowman
– volume: 86
  start-page: 2278
  year: 1998
  end-page: 2324
  ident: b5
  article-title: Gradient-based learning applied to document recognition
  publication-title: Proc. IEEE
  contributor:
    fullname: Haffner
– year: 2018
  ident: b36
  article-title: Generating wikipedia by summarizing long sequences
  contributor:
    fullname: Shazeer
– volume: 18
  start-page: 602
  year: 2005
  end-page: 610
  ident: b11
  article-title: Framewise phoneme classification with bidirectional LSTM and other neural network architectures
  publication-title: Neural Netw.
  contributor:
    fullname: Schmidhuber
– volume: 5
  start-page: 157
  year: 1994
  end-page: 166
  ident: b75
  article-title: Learning long-term dependencies with gradient descent is difficult
  publication-title: IEEE Trans. Neural Netw.
  contributor:
    fullname: Frasconi
– year: 2018
  ident: b77
  article-title: Dynamic prediction length for time series with sequence to sequence networks
  contributor:
    fullname: Klabjan
– volume: 1
  start-page: 9
  year: 2019
  ident: b38
  article-title: Language models are unsupervised multitask learners
  publication-title: OpenAI Blog
  contributor:
    fullname: Sutskever
– start-page: 503
  year: 2016
  end-page: 510
  ident: b76
  article-title: Sequence-to-sequence model with attention for time series classification
  publication-title: 2016 IEEE 16th International Conference on Data Mining Workshops
  contributor:
    fullname: Ono
– year: 2018
  ident: b72
  article-title: Swag: A large-scale adversarial dataset for grounded commonsense inference
  contributor:
    fullname: Choi
– year: 2020
  ident: b65
  article-title: Exploring the limits of transfer learning with a unified text-to-text transformer
  contributor:
    fullname: Liu
– year: 2014
  ident: b14
  article-title: Learning phrase representations using RNN encoder-decoder for statistical machine translation
  contributor:
    fullname: Bengio
– year: 2015
  ident: b45
  article-title: Neural machine translation of rare words with subword units
  contributor:
    fullname: Birch
– start-page: 1462
  year: 2015
  end-page: 1471
  ident: b10
  article-title: Draw: A recurrent neural network for image generation
  publication-title: International Conference on Machine Learning
  contributor:
    fullname: Wierstra
– volume: 25
  start-page: 1097
  year: 2012
  end-page: 1105
  ident: b4
  article-title: Imagenet classification with deep convolutional neural networks
  publication-title: Adv. Neural Inf. Process. Syst.
  contributor:
    fullname: Hinton
– year: 2019
  ident: b80
  article-title: Lstm based music generation system
  contributor:
    fullname: Joshi
– year: 2019
  ident: b40
  article-title: A bert baseline for the natural questions
  contributor:
    fullname: Collins
– year: 2017
  ident: b26
  article-title: Learned in translation: Contextualized word vectors
  contributor:
    fullname: Socher
– year: 2018
  ident: b52
  article-title: A simple method for commonsense reasoning
  contributor:
    fullname: Le
– start-page: 1
  year: 2018
  end-page: 16
  ident: b30
  article-title: ELMoLex: Connecting ELMo and lexicon features for dependency parsing
  publication-title: CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text To Universal Dependencies
  contributor:
    fullname: Seddah
– year: 2014
  ident: b12
  article-title: Neural machine translation by jointly learning to align and translate
  contributor:
    fullname: Bengio
– year: 2019
  ident: b44
  article-title: Pretraining-based natural language generation for text summarization
  contributor:
    fullname: Wang
– year: 2014
  ident: b20
  article-title: Very deep convolutional networks for large-scale image recognition
  contributor:
    fullname: Zisserman
– year: 2015
  ident: b9
  article-title: Bidirectional LSTM–CRF models for sequence tagging
  contributor:
    fullname: Yu
– volume: 577
  start-page: 706
  year: 2020
  end-page: 710
  ident: b6
  article-title: Improved protein structure prediction using potentials from deep learning
  publication-title: Nature
  contributor:
    fullname: Bridgland
– year: 2018
  ident: b74
  article-title: The natural language decathlon: Multitask learning as question answering
  contributor:
    fullname: Socher
– year: 2015
  ident: b37
  article-title: Reasoning about entailment with neural attention
  contributor:
    fullname: Blunsom
– start-page: 3111
  year: 2013
  end-page: 3119
  ident: b22
  article-title: Distributed representations of words and phrases and their compositionality
  publication-title: Advances in Neural Information Processing Systems
  contributor:
    fullname: Dean
– year: 1949
  ident: b28
  article-title: Translation
  publication-title: Machine Translation of Languages: Fourteen Essays
  contributor:
    fullname: Weaver
– year: 2013
  ident: b21
  article-title: Efficient estimation of word representations in vector space
  contributor:
    fullname: Dean
– year: 2019
  ident: b59
  article-title: Albert: A lite bert for self-supervised learning of language representations
  contributor:
    fullname: Soricut
– year: 2020
  ident: b82
  article-title: Building a recurrent neural network - step by step - v1. [online] datascience-enthusiast.com. Available at:
  contributor:
    fullname: F.
– year: 2020
  ident: b66
  article-title: Electra: Pre-training text encoders as discriminators rather than generators
  contributor:
    fullname: Manning
– start-page: 210
  year: 2018
  end-page: 220
  ident: b79
  article-title: A comparison of modeling units in sequence-to-sequence speech recognition with the transformer on mandarin chinese
  publication-title: International Conference on Neural Information Processing
  contributor:
    fullname: Xu
– year: 2018
  ident: b39
  article-title: Sdnet: Contextualized attention-based deep network for conversational question answering
  contributor:
    fullname: Huang
– year: 2001
  ident: b32
  article-title: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies
  contributor:
    fullname: Schmidhuber
– year: 2017
  ident: b71
  article-title: Race: Large-scale reading comprehension dataset from examinations
  contributor:
    fullname: Hovy
– start-page: 3104
  year: 2014
  end-page: 3112
  ident: b13
  article-title: Sequence to sequence learning with neural networks
  publication-title: Advances in Neural Information Processing Systems
  contributor:
    fullname: Le
– year: 2018
  ident: b27
  article-title: Deep contextualized word representations
  contributor:
    fullname: Zettlemoyer
– year: 2020
  ident: b60
  article-title: Dynabert: Dynamic bert with adaptive width and depth
  contributor:
    fullname: Liu
– year: 2016
  ident: b70
  article-title: Squad: 100,000+ questions for machine comprehension of text
  contributor:
    fullname: Liang
– ident: 10.1016/j.cosrev.2021.100433_b68
  doi: 10.3115/1073083.1073135
– year: 2021
  ident: 10.1016/j.cosrev.2021.100433_b63
  contributor:
    fullname: He
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b72
  contributor:
    fullname: Zellers
– ident: 10.1016/j.cosrev.2021.100433_b34
  doi: 10.1109/ICCV.2015.11
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b81
  contributor:
    fullname: Kotecha
– volume: 25
  start-page: 1097
  year: 2012
  ident: 10.1016/j.cosrev.2021.100433_b4
  article-title: Imagenet classification with deep convolutional neural networks
  publication-title: Adv. Neural Inf. Process. Syst.
  contributor:
    fullname: Krizhevsky
– start-page: 3104
  year: 2014
  ident: 10.1016/j.cosrev.2021.100433_b13
  article-title: Sequence to sequence learning with neural networks
  contributor:
    fullname: Sutskever
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b51
  contributor:
    fullname: Liu
– ident: 10.1016/j.cosrev.2021.100433_b3
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b36
  contributor:
    fullname: Liu
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b60
  contributor:
    fullname: Hou
– year: 2014
  ident: 10.1016/j.cosrev.2021.100433_b14
  contributor:
    fullname: Cho
– year: 1949
  ident: 10.1016/j.cosrev.2021.100433_b28
  article-title: Translation
  contributor:
    fullname: Weaver
– year: 2001
  ident: 10.1016/j.cosrev.2021.100433_b32
  contributor:
    fullname: Hochreiter
– ident: 10.1016/j.cosrev.2021.100433_b69
– year: 2017
  ident: 10.1016/j.cosrev.2021.100433_b26
  contributor:
    fullname: McCann
– volume: 30
  start-page: 415
  issue: 4
  year: 1953
  ident: 10.1016/j.cosrev.2021.100433_b49
  article-title: “Cloze procedure”: A new tool for measuring readability
  publication-title: J. Q.
  contributor:
    fullname: Taylor
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b39
  contributor:
    fullname: Zhu
– year: 2014
  ident: 10.1016/j.cosrev.2021.100433_b20
  contributor:
    fullname: Simonyan
– ident: 10.1016/j.cosrev.2021.100433_b41
  doi: 10.1145/3331184.3331341
– ident: 10.1016/j.cosrev.2021.100433_b54
– volume: 18
  start-page: 602
  issue: 5–6
  year: 2005
  ident: 10.1016/j.cosrev.2021.100433_b11
  article-title: Framewise phoneme classification with bidirectional LSTM and other neural network architectures
  publication-title: Neural Netw.
  doi: 10.1016/j.neunet.2005.06.042
  contributor:
    fullname: Graves
– year: 2017
  ident: 10.1016/j.cosrev.2021.100433_b71
  contributor:
    fullname: Lai
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b44
  contributor:
    fullname: Zhang
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b27
  contributor:
    fullname: Peters
– ident: 10.1016/j.cosrev.2021.100433_b61
  doi: 10.1145/3340531.3411908
– ident: 10.1016/j.cosrev.2021.100433_b35
  doi: 10.3115/1289189.1289272
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b77
  contributor:
    fullname: Harmon
– ident: 10.1016/j.cosrev.2021.100433_b57
  doi: 10.1145/1150402.1150464
– year: 2016
  ident: 10.1016/j.cosrev.2021.100433_b70
  contributor:
    fullname: Rajpurkar
– volume: 29
  start-page: 82
  issue: 6
  year: 2012
  ident: 10.1016/j.cosrev.2021.100433_b1
  article-title: Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
  publication-title: IEEE Signal Process. Mag.
  doi: 10.1109/MSP.2012.2205597
  contributor:
    fullname: Hinton
– start-page: 3111
  year: 2013
  ident: 10.1016/j.cosrev.2021.100433_b22
  article-title: Distributed representations of words and phrases and their compositionality
  contributor:
    fullname: Mikolov
– volume: 32
  year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b64
  article-title: XLNet: Generalized autoregressive pretraining for language understanding
  publication-title: Adv. Neural Inf. Process. Syst.
  contributor:
    fullname: Yang
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b7
  contributor:
    fullname: Agarap
– year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b9
  contributor:
    fullname: Huang
– start-page: 1
  year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b30
  article-title: ELMoLex: Connecting ELMo and lexicon features for dependency parsing
  contributor:
    fullname: Jawahar
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b65
  contributor:
    fullname: Raffel
– year: 2013
  ident: 10.1016/j.cosrev.2021.100433_b21
  contributor:
    fullname: Mikolov
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b46
  contributor:
    fullname: Brown
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b40
  contributor:
    fullname: Alberti
– ident: 10.1016/j.cosrev.2021.100433_b53
– ident: 10.1016/j.cosrev.2021.100433_b24
  doi: 10.3115/v1/D14-1162
– start-page: 5998
  year: 2017
  ident: 10.1016/j.cosrev.2021.100433_b16
  article-title: Attention is all you need
  contributor:
    fullname: Vaswani
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b43
  contributor:
    fullname: Liu
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b80
  contributor:
    fullname: Mangal
– volume: 1
  start-page: 9
  issue: 8
  year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b38
  article-title: Language models are unsupervised multitask learners
  publication-title: OpenAI Blog
  contributor:
    fullname: Radford
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b42
  contributor:
    fullname: Liu
– ident: 10.1016/j.cosrev.2021.100433_b23
  doi: 10.3115/v1/P14-1023
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b56
  contributor:
    fullname: Sanh
– start-page: 210
  year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b79
  article-title: A comparison of modeling units in sequence-to-sequence speech recognition with the Transformer on Mandarin Chinese
  contributor:
    fullname: Zhou
– volume: 22
  start-page: 1345
  issue: 10
  year: 2009
  ident: 10.1016/j.cosrev.2021.100433_b19
  article-title: A survey on transfer learning
  publication-title: IEEE Trans. Knowl. Data Eng.
  doi: 10.1109/TKDE.2009.191
  contributor:
    fullname: Pan
– start-page: 4774
  year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b78
  article-title: State-of-the-art speech recognition with sequence-to-sequence models
  contributor:
    fullname: Chiu
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b55
  contributor:
    fullname: Reimers
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b74
  contributor:
    fullname: McCann
– volume: 41
  start-page: 391
  issue: 6
  year: 1990
  ident: 10.1016/j.cosrev.2021.100433_b25
  article-title: Indexing by latent semantic analysis
  publication-title: J. Am. Soc. Inf. Sci.
  doi: 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
  contributor:
    fullname: Deerwester
– year: 2021
  ident: 10.1016/j.cosrev.2021.100433_b67
  contributor:
    fullname: Lee-Thorp
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b82
  contributor:
    fullname: F.
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b47
  contributor:
    fullname: Kaplan
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b33
  contributor:
    fullname: Radford
– volume: 577
  start-page: 706
  issue: 7792
  year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b6
  article-title: Improved protein structure prediction using potentials from deep learning
  publication-title: Nature
  doi: 10.1038/s41586-019-1923-7
  contributor:
    fullname: Senior
– year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b58
  contributor:
    fullname: Hinton
– year: 2016
  ident: 10.1016/j.cosrev.2021.100433_b18
  contributor:
    fullname: Ba
– volume: 20
  start-page: 30
  issue: 1
  year: 2011
  ident: 10.1016/j.cosrev.2021.100433_b2
  article-title: Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition
  publication-title: IEEE Trans. Audio Speech Lang. Process.
  doi: 10.1109/TASL.2011.2134090
  contributor:
    fullname: Dahl
– year: 2019
  ident: 10.1016/j.cosrev.2021.100433_b59
  contributor:
    fullname: Lan
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b73
  contributor:
    fullname: Wang
– start-page: 503
  year: 2016
  ident: 10.1016/j.cosrev.2021.100433_b76
  article-title: Sequence-to-sequence model with attention for time series classification
  contributor:
    fullname: Tang
– start-page: 1462
  year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b10
  article-title: DRAW: A recurrent neural network for image generation
  contributor:
    fullname: Gregor
– year: 2014
  ident: 10.1016/j.cosrev.2021.100433_b12
  contributor:
    fullname: Bahdanau
– year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b15
  contributor:
    fullname: Luong
– ident: 10.1016/j.cosrev.2021.100433_b17
  doi: 10.1109/CVPR.2016.90
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b48
  contributor:
    fullname: Devlin
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b8
  contributor:
    fullname: Lu
– volume: 86
  start-page: 2278
  issue: 11
  year: 1998
  ident: 10.1016/j.cosrev.2021.100433_b5
  article-title: Gradient-based learning applied to document recognition
  publication-title: Proc. IEEE
  doi: 10.1109/5.726791
  contributor:
    fullname: LeCun
– volume: 2
  start-page: 231
  year: 2014
  ident: 10.1016/j.cosrev.2021.100433_b29
  article-title: Entity linking meets word sense disambiguation: A unified approach
  publication-title: Trans. Assoc. Comput. Linguist.
  doi: 10.1162/tacl_a_00179
  contributor:
    fullname: Moro
– year: 2018
  ident: 10.1016/j.cosrev.2021.100433_b52
  contributor:
    fullname: Trinh
– volume: 91
  issue: 1
  year: 1991
  ident: 10.1016/j.cosrev.2021.100433_b31
  article-title: Untersuchungen zu dynamischen neuronalen Netzen
  publication-title: Diploma Tech. Univ. München
  contributor:
    fullname: Hochreiter
– volume: 5
  start-page: 157
  issue: 2
  year: 1994
  ident: 10.1016/j.cosrev.2021.100433_b75
  article-title: Learning long-term dependencies with gradient descent is difficult
  publication-title: IEEE Trans. Neural Netw.
  doi: 10.1109/72.279181
  contributor:
    fullname: Bengio
– year: 2016
  ident: 10.1016/j.cosrev.2021.100433_b50
  contributor:
    fullname: Wu
– year: 2021
  ident: 10.1016/j.cosrev.2021.100433_b62
  contributor:
    fullname: Fedus
– year: 2020
  ident: 10.1016/j.cosrev.2021.100433_b66
  contributor:
    fullname: Clark
– year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b37
  contributor:
    fullname: Rocktäschel
– year: 2015
  ident: 10.1016/j.cosrev.2021.100433_b45
  contributor:
    fullname: Sennrich
StartPage 100433
SubjectTerms Attention mechanism
Embedding
Natural Language Processing
NLP
Pre-training model
Title Comprehensive analysis of embeddings and pre-training in NLP
URI https://dx.doi.org/10.1016/j.cosrev.2021.100433
Volume 42