AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM
In recent years, more and more attention has been paid to text sentiment analysis, which has gradually become a research hotspot in information extraction, data mining, Natural Language Processing (NLP), and other fields. With the gradual popularization of the Internet, sentiment analysis of Uyghur...
Published in | Applied sciences Vol. 12; no. 3; p. 1182 |
---|---|
Main Authors | Pei, Yijie; Chen, Siqi; Ke, Zunwang; Silamu, Wushour; Guo, Qinglang |
Format | Journal Article |
Language | English |
Published | Basel: MDPI AG, 01.02.2022 |
Subjects | |
Abstract | Text sentiment analysis has attracted growing attention in recent years and has become a research hotspot in information extraction, data mining, Natural Language Processing (NLP), and related fields. As Internet use spreads, sentiment analysis of Uyghur texts has considerable research and application value for online public opinion. Most state-of-the-art systems require tens of thousands of annotated sentences to achieve high performance, yet very little annotated data is available for Uyghur sentiment analysis. Each task also has its own specificities: differences in vocabulary and word order across languages make the problem challenging. In this paper, we present an effective way to provide a meaningful and easy-to-use feature extractor for sentiment analysis tasks: a pre-trained language model combined with a BiLSTM layer. First, the data are augmented with AEDA (An Easier Data Augmentation), and the augmented dataset is used to improve text classification performance. Next, the pre-trained model LaBSE encodes the input data, and a BiLSTM then learns additional context information. Finally, the model is validated on two-category datasets for sentiment analysis and five-category datasets for emotion analysis. Evaluated on two datasets, our approach performed well against several strong baselines. We close with an overview of the resources for sentiment analysis tasks and some open research questions. We therefore propose a model that combines deep learning with cross-lingual pre-training for two low-resource expectations. |
---|---|
Author | Ke, Zunwang; Chen, Siqi; Guo, Qinglang; Pei, Yijie; Silamu, Wushour |
Author_xml | 1. Pei, Yijie (ORCID 0000-0002-0974-4378); 2. Chen, Siqi; 3. Ke, Zunwang (ORCID 0000-0002-2589-8377); 4. Silamu, Wushour; 5. Guo, Qinglang |
ContentType | Journal Article |
Copyright | 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.3390/app12031182 |
Discipline | Engineering Sciences (General) |
EISSN | 2076-3417 |
ISSN | 2076-3417 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 3 |
Language | English |
License | https://creativecommons.org/licenses/by/4.0 |
ORCID | 0000-0002-0974-4378 0000-0002-2589-8377 |
OpenAccessLink | https://doaj.org/article/8e9df65bfde048fe91fb751cb2da4ad3 |
PublicationDate | 2022-02-01 |
PublicationPlace | Basel |
PublicationTitle | Applied sciences |
PublicationYear | 2022 |
Publisher | MDPI AG |
StartPage | 1182 |
SubjectTerms | Accuracy; BiLSTM; cross-lingual pre-trained language model; data augmentation; Datasets; Deep learning; Dictionaries; Language; low-resource; Machine learning; Methods; Neural networks; Semantics; Sentiment analysis; Sparsity |
Title | AB-LaBSE: Uyghur Sentiment Analysis via the Pre-Training Model with BiLSTM |
URI | https://www.proquest.com/docview/2636123452 https://doaj.org/article/8e9df65bfde048fe91fb751cb2da4ad3 |
Volume | 12 |
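The abstract names AEDA as the first step of the pipeline. As a rough illustration only, the sketch below follows the general idea of AEDA as described in its original paper: a few punctuation marks are inserted at random positions and the original words are left untouched. This record does not give the settings the authors used, so the function name `aeda`, the `ratio` parameter, its default of one third, and the placeholder English sentence are all assumptions made for this example.

```python
import random

# Punctuation marks inserted by AEDA-style augmentation.
PUNCTUATION = [".", ";", "?", ":", "!", ","]


def aeda(sentence, ratio=1 / 3, rng=None):
    """Return `sentence` with random punctuation tokens inserted.

    `ratio` (an assumed setting, not taken from the record) caps the number
    of insertions at roughly ratio * number_of_words, with a minimum of one.
    """
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return sentence
    # Upper bound on insertions: at least 1, never more than the word count.
    max_inserts = min(len(words), max(1, int(ratio * len(words))))
    n_inserts = rng.randint(1, max_inserts)
    # Choose distinct word positions; a mark is placed before each chosen word.
    positions = set(rng.sample(range(len(words)), n_inserts))
    out = []
    for i, word in enumerate(words):
        if i in positions:
            out.append(rng.choice(PUNCTUATION))
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    # Placeholder sentence; a real run would use labelled Uyghur text.
    print(aeda("the film was short but I enjoyed it", rng=random.Random(0)))
```

Because only punctuation is added, each augmented sentence keeps the label of its source sentence, which is what makes this kind of augmentation attractive when annotated data is scarce.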