AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM

Bibliographic Details
Published in Applied sciences Vol. 12; no. 3; p. 1182
Main Authors Pei, Yijie; Chen, Siqi; Ke, Zunwang; Silamu, Wushour; Guo, Qinglang
Format Journal Article
Language English
Published Basel MDPI AG 01.02.2022
Abstract In recent years, increasing attention has been paid to text sentiment analysis, which has gradually become a research hotspot in information extraction, data mining, Natural Language Processing (NLP), and other fields. With the continued spread of the Internet, sentiment analysis of Uyghur texts has great research and application value for online public opinion. For low-resource languages, most state-of-the-art systems require tens of thousands of annotated sentences to achieve high performance, yet very little annotated data is available for Uyghur sentiment analysis tasks. Each task also has its own specificities; differences in words and word order across languages make this a challenging problem. In this paper, we present an effective solution that provides a meaningful and easy-to-use feature extractor for sentiment analysis tasks: a pre-trained language model with a BiLSTM layer. First, data augmentation is carried out with AEDA (An Easier Data Augmentation), and the augmented dataset is constructed to improve the performance of text classification. Then, the pre-trained model LaBSE is used to encode the input data, and a BiLSTM layer is used to learn additional context information. Finally, the validity of the model is verified on a two-category dataset for sentiment analysis and a five-category dataset for emotion analysis. On both datasets, our approach performed strongly compared with several strong baselines. We close with an overview of resources for sentiment analysis tasks and some open research questions. Overall, we propose a combined deep learning and cross-lingual pre-training model for these two low-resource tasks.
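To make the pipeline described in the abstract concrete, the sketch below shows one plausible way to assemble it in Python with PyTorch and Hugging Face Transformers: an AEDA-style helper that inserts random punctuation marks for augmentation, a LaBSE encoder, and a BiLSTM head feeding a linear classifier for the two-class and five-class settings. This is an illustrative reconstruction under stated assumptions, not the authors' released code; the model identifier "sentence-transformers/LaBSE", the hidden size, the mean-pooling step, and all helper names are assumptions.

import random
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

PUNCT = [".", ";", "?", ":", "!", ","]

def aeda_augment(sentence, ratio=0.3):
    # AEDA-style augmentation: insert 1..ratio*len(words) random punctuation
    # marks at random positions, leaving the original words untouched.
    words = sentence.split()
    n_insert = random.randint(1, max(1, int(ratio * len(words))))
    for _ in range(n_insert):
        words.insert(random.randint(0, len(words)), random.choice(PUNCT))
    return " ".join(words)

class LaBSEBiLSTM(nn.Module):
    # LaBSE token embeddings -> BiLSTM -> mean pooling -> linear classifier.
    def __init__(self, num_classes, hidden_size=256):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("sentence-transformers/LaBSE")
        self.bilstm = nn.LSTM(self.encoder.config.hidden_size, hidden_size,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        # Token-level hidden states from the pre-trained multilingual encoder.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # BiLSTM over the token sequence to capture extra context.
        lstm_out, _ = self.bilstm(hidden)
        # Mask-aware mean pooling over tokens, then classify.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (lstm_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/LaBSE")
model = LaBSEBiLSTM(num_classes=2)  # 2 classes for sentiment, 5 for emotion
texts = [aeda_augment("placeholder training sentence")]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # (1, num_classes)

In practice the encoder can be fine-tuned jointly with the BiLSTM head or kept frozen; the paper's exact training setup is not reproduced here.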
Author Ke, Zunwang
Chen, Siqi
Guo, Qinglang
Pei, Yijie
Silamu, Wushour
Author_xml – sequence: 1
  givenname: Yijie
  orcidid: 0000-0002-0974-4378
  surname: Pei
  fullname: Pei, Yijie
– sequence: 2
  givenname: Siqi
  surname: Chen
  fullname: Chen, Siqi
– sequence: 3
  givenname: Zunwang
  orcidid: 0000-0002-2589-8377
  surname: Ke
  fullname: Ke, Zunwang
– sequence: 4
  givenname: Wushour
  surname: Silamu
  fullname: Silamu, Wushour
– sequence: 5
  givenname: Qinglang
  surname: Guo
  fullname: Guo, Qinglang
ContentType Journal Article
Copyright 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.3390/app12031182
Discipline Engineering
Sciences (General)
EISSN 2076-3417
ISSN 2076-3417
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Language English
License https://creativecommons.org/licenses/by/4.0
ORCID 0000-0002-0974-4378
0000-0002-2589-8377
OpenAccessLink https://doaj.org/article/8e9df65bfde048fe91fb751cb2da4ad3
PublicationCentury 2000
PublicationDate 2022-02-01
PublicationDecade 2020
PublicationPlace Basel
PublicationTitle Applied sciences
PublicationYear 2022
Publisher MDPI AG
StartPage 1182
SubjectTerms Accuracy
BiLSTM
cross-lingual pre-trained language model
data augmentation
Datasets
Deep learning
Dictionaries
Language
low-resource
Machine learning
Methods
Neural networks
Semantics
Sentiment analysis
Sparsity
Title AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM
URI https://www.proquest.com/docview/2636123452
https://doaj.org/article/8e9df65bfde048fe91fb751cb2da4ad3
Volume 12