Identification of Affective States Based on Automatic Analysis of Texts of Comments in Social Networks
Published in | Automation and remote control Vol. 83; no. 12; pp. 1877 - 1885 |
---|---|
Main Author | Dyulicheva, Yu. Yu |
Format | Journal Article |
Language | English |
Published | Moscow: Pleiades Publishing; Springer Nature B.V, 01.12.2022 |
Subjects | |
Online Access | Get full text |
Abstract | The paper considers the problem of classifying 3553 English-language comments from the social network Reddit using several approaches to vectorizing comment texts: bag of words, TF–IDF, bigram analysis based on pointwise mutual information (PMI) and sentiment, and the deep language representation model BERT. A hybrid approach that combines BERT text vectorization with bigram analysis improved the quality of comment classification to 91%. In a cluster analysis of 1857 English-language comments describing anxiety, clusters were identified using BERT+k-means. The study proposes a hybrid approach that combines the LDA topic modeling method, the VADER sentiment analysis method, pointwise mutual information, and part-of-speech analysis to select bigrams and trigrams that describe clusters of comments. To visualize the extracted patterns in the form of trigrams, a knowledge graph describing the subject area was constructed. Comparing the words of the selected target trigrams with the words of a custom dictionary describing various affective disorders made it possible to determine the types of psychosocial stressors associated with affective disorders. |
---|---|
Author | Dyulicheva, Yu. Yu |
Author_xml | – sequence: 1 givenname: Yu. Yu surname: Dyulicheva fullname: Dyulicheva, Yu. Yu email: dyulichevayuyu@cfuv.ru organization: Vernadsky Crimean Federal University |
CitedBy_id | crossref_primary_10_3390_math11194121 |
Cites_doi | 10.21123/bsj.2020.17.4.1328 10.1145/3184558.3191627 10.18637/jss.v061.i06 10.1016/j.procs.2017.08.290 10.26555/jifo.v15i1.a20111 10.17323/1814-9545-2021-4-243-265 10.18653/v1/D19-6213 10.2196/preprints.26769 10.48550/arXiv.1810.04805 10.1177/0261927X09351676 10.1108/00220410410560573 10.1007/978-3-030-30796-7_10 10.1016/j.bspc.2020.102355 10.1155/2021/5531327 10.3115/v1/W14-3207 10.15622/ia.2021.3.1 10.36713/epra8524 |
ContentType | Journal Article |
Copyright | Pleiades Publishing, Ltd. 2022 Copyright Springer Nature B.V. 2022 |
DOI | 10.1134/S00051179220120025 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
Discipline | Engineering Mathematics |
EISSN | 1608-3032 |
EndPage | 1885 |
ExternalDocumentID | 10_1134_S00051179220120025 |
ISSN | 0005-1179 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 12 |
Keywords | sentiment analysis; mental health; LDA; BERT; BoW; VADER; knowledge graph; bigram; TF–IDF |
Language | English |
OpenAccessLink | https://link.springer.com/content/pdf/10.1134/S00051179220120025.pdf |
PageCount | 9 |
PublicationCentury | 2000 |
PublicationDate | 2022-12-01 |
PublicationDateYYYYMMDD | 2022-12-01 |
PublicationDate_xml | – month: 12 year: 2022 text: 2022-12-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | Moscow |
PublicationPlace_xml | – name: Moscow – name: New York |
PublicationTitle | Automation and remote control |
PublicationTitleAbbrev | Autom Remote Control |
PublicationYear | 2022 |
Publisher | Pleiades Publishing Springer Nature B.V |
Publisher_xml | – name: Pleiades Publishing – name: Springer Nature B.V |
StartPage | 1877 |
SubjectTerms | Affect (Psychology); Calculus of Variations and Optimal Control; Optimization; Classification; Cluster analysis; Computer-Aided Engineering (CAD, CAE) and Design; Control; Disorders; English language; Knowledge representation; Mathematics; Mathematics and Statistics; Mechanical Engineering; Mechatronics; Robotics; Social networks; Systems Theory; Texts; Thematic Issue; Words (language) |
Title | Identification of Affective States Based on Automatic Analysis of Texts of Comments in Social Networks |
URI | https://link.springer.com/article/10.1134/S00051179220120025 https://www.proquest.com/docview/2787639750 |
Volume | 83 |
hasFullText | 1 |
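The abstract mentions selecting bigrams by pointwise mutual information (PMI) but does not give the procedure. The following is a minimal sketch of PMI scoring for adjacent word pairs, not the author's implementation. The function name `bigram_pmi`, the whitespace tokenization, the `min_count` threshold, and the sample comments are all illustrative assumptions.

```python
import math
from collections import Counter


def bigram_pmi(docs, min_count=2):
    """Score adjacent word pairs by PMI = log2(p(x, y) / (p(x) * p(y))).

    Assumptions (not from the paper): lowercase whitespace tokenization,
    and a hypothetical min_count cutoff that drops rare bigrams, whose
    PMI estimates tend to be unreliable.
    """
    unigrams = Counter()
    bigrams = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        unigrams.update(tokens)
        # Pair each token with its right-hand neighbour inside one comment.
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical sample comments, used only to exercise the function.
comments = ["panic attack again", "panic attack today", "feel bad today"]
scores = bigram_pmi(comments)
```

With these sample comments only the pair ("panic", "attack") meets the count threshold. In a fuller pipeline, the surviving bigrams could then be filtered further, for example by sentiment score or part of speech, as the abstract describes.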