Multilingual Sentiment Analysis Using Emoticons and Keywords
Published in | 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) Vol. 2; pp. 102 - 109 |
---|---|
Main Authors | Solakidis, Georgios S., Vavliakis, Konstantinos N., Mitkas, Pericles A. |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.08.2014 |
Subjects | |
DOI | 10.1109/WI-IAT.2014.86 |
Abstract | The World Wide Web has evolved into a leading communication channel and information exchange medium. Especially since the introduction of Web 2.0 and the ensuing explosion of user-generated content, the amount of data available over the internet has attracted the interest of both the scientific and business communities. Their efforts focus on identifying the inner structures of the data and the knowledge that can be derived by analyzing them. Web 2.0 is the subject of study and research in a number of areas. One of these areas is sentiment analysis, where the main goal is to study and draw conclusions about the subjectivity, polarity and feeling expressed in user-generated content, which mainly consists of free-text documents. The goal of this paper is to apply sentiment analysis to multilingual data, focusing on documents written in Greek. We developed an integrated framework that accepts user-generated documents and then identifies the polarity of the text (neutral, negative or positive) and the sentiment expressed through it (joy, love, anger or sadness). We followed a semi-supervised approach, which led to the development of two techniques for the automatic collection of training data without any human intervention. Our approach involves the detection and use of self-defining features that are available within the data. We take into account two emotionally rich features: a) emoticons and b) lists of emotionally intense keywords. These features are evaluated on data from a popular forum, using various classifiers and feature vectors. Our experimental results point to various conclusions about the effectiveness, advantages and limitations of applying such methods to Greek data. Using keywords we achieved 90% mean accuracy in identifying the subjectivity level and 93% in identifying the polarity level, whereas using emoticons the mean accuracy for these levels was 74% and 77%, respectively. |
---|---|
Author | Vavliakis, Konstantinos N. Mitkas, Pericles A. Solakidis, Georgios S. |
Author_xml | – sequence: 1 givenname: Georgios S. surname: Solakidis fullname: Solakidis, Georgios S. email: gsolakid@auth.gr organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece – sequence: 2 givenname: Konstantinos N. surname: Vavliakis fullname: Vavliakis, Konstantinos N. email: kvavliak@issel.ee.auth.gr organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece – sequence: 3 givenname: Pericles A. surname: Mitkas fullname: Mitkas, Pericles A. email: mitkas@auth.gr organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/WI-IAT.2014.86 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISBN | 9781479941438 1479941433 |
EndPage | 109 |
ExternalDocumentID | 6927613 |
Genre | orig-research |
GroupedDBID | 6IE 6IL ACM ALMA_UNASSIGNED_HOLDINGS APO CBEJK GUFHI LHSKQ RIE RIL |
ID | FETCH-LOGICAL-a245t-ccd6dd63169798295c878638b0dc274561b028a1bb30154e27edeb6f85ac6c543 |
IEDL.DBID | RIE |
IngestDate | Wed Aug 27 04:35:59 EDT 2025 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-a245t-ccd6dd63169798295c878638b0dc274561b028a1bb30154e27edeb6f85ac6c543 |
PageCount | 8 |
ParticipantIDs | ieee_primary_6927613 |
PublicationCentury | 2000 |
PublicationDate | 2014-Aug. |
PublicationDateYYYYMMDD | 2014-08-01 |
PublicationDate_xml | – month: 08 year: 2014 text: 2014-Aug. |
PublicationDecade | 2010 |
PublicationTitle | 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) |
PublicationTitleAbbrev | WI-IAT |
PublicationYear | 2014 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
SSID | ssj0001663874 ssj0001663873 ssj0001651103 |
Score | 1.6836398 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 102 |
SubjectTerms | Accuracy Automatic Collection of Training Data Emoticons Forum Greek Keywords Semi Supervised Learning Sentiment analysis Support vector machines Tagging Training Training data Vectors |
Title | Multilingual Sentiment Analysis Using Emoticons and Keywords |
URI | https://ieeexplore.ieee.org/document/6927613 |
Volume | 2 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3PS8MwFH5sO3mauom_ycGj6da0TRrwIrKxKRPBDXcbTfJ6UTpxLaJ_vUnbbTI8eGtKKeEl4X0v7_veA7hKMUkdzKCSo6RhjD5VUgeUxYnLAwklytYJk0c-moX382jegOuNFgYRS_IZeu6xzOWbpS7cVVmPS2aj7qAJTbvNKq3W9j6FW-hQZxSrsd1ZYncc1nUb_b7svYzp-Hbq2F2h56TUv7qrlM5l2IbJeloVp-TVK3Ll6e-dio3_nfc-dLcyPvK0cVAH0MDsENrrPg6kPtYduClVuE6XXiRv5Nnxh9wPybpgCSl5BWTgeHs2fF6RJDPkAb8-beC66sJsOJjejWjdVYEmLIxyqrXhxvDA51LImMlIxyK2tlF9o22IavGUspgj8ZUKHL5CJtCg4mkcJZrrKAyOoJUtMzwGok1qjzxnTAv7oQXA_YiZkCEzAZcqECfQccZYvFeFMxa1HU7_fn0Ge24tKnbdObTyjwIvrMfP1WW51D83vadZ |
linkProvider | IEEE |
linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV07T8MwED6VMsBUoEW88cCI08Zx7FhiQahVSx9CohXdqvjRBZQimgjBr8dO0hZVDGxxFEXW2dZ95_u-O4CbuYnnDmZgwYzANDI-lkIFmESxywNxyfPWCcMR607o4zScVuB2rYUxxuTkM-O5xzyXrxcqc1dlTSaIjbqDHdi1fp-GhVprc6PCLHgoc4rF2O4tvj2mZeVGvyWaLz3cux87fhf1nJj6V3-V3L10ajBcTaxglbx6WSo99b1Vs_G_Mz-AxkbIh57WLuoQKiY5gtqqkwMqD3Yd7nIdrlOmZ_EbenYMIvdDtCpZgnJmAWo75p4NoJcoTjTqm69PG7ouGzDptMcPXVz2VcAxoWGKldJMaxb4THARERGqiEfWNrKllQ1SLaKSFnXEvpSBQ1iGcKONZPMojBVTIQ2OoZosEnMCSOm5PfSMEMXthxYCt0KiKTFEB0zIgJ9C3Rlj9l6UzpiVdjj7-_U17HXHw8Fs0Bv1z2HfrUvBtbuAavqRmUvr_1N5lS_7Dy86qqY |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2014+IEEE%2FWIC%2FACM+International+Joint+Conferences+on+Web+Intelligence+%28WI%29+and+Intelligent+Agent+Technologies+%28IAT%29&rft.atitle=Multilingual+Sentiment+Analysis+Using+Emoticons+and+Keywords&rft.au=Solakidis%2C+Georgios+S.&rft.au=Vavliakis%2C+Konstantinos+N.&rft.au=Mitkas%2C+Pericles+A.&rft.date=2014-08-01&rft.pub=IEEE&rft.volume=2&rft.spage=102&rft.epage=109&rft_id=info:doi/10.1109%2FWI-IAT.2014.86&rft.externalDocID=6927613 |
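
The abstract describes collecting training data automatically by treating emoticons as self-defining labels. A minimal Python sketch of that general idea follows. It is an illustration under stated assumptions, not the authors' implementation: the emoticon sets, the function names `label_by_emoticons` and `build_training_set`, and the choice to skip posts with no emoticons or with mixed emoticons are all hypothetical, since this record does not give the paper's actual lists or rules.

```python
import re

# Hypothetical emoticon lists; the paper's actual lists are not given in this record.
POSITIVE = {":)", ":-)", ":D", ";)"}
NEGATIVE = {":(", ":-(", ":'("}

# Whitespace tokenizer: an emoticon only counts when it stands alone as a token.
_TOKEN = re.compile(r"\S+")


def label_by_emoticons(text):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    tokens = set(_TOKEN.findall(text))
    has_pos = bool(tokens & POSITIVE)
    has_neg = bool(tokens & NEGATIVE)
    if has_pos == has_neg:
        # Either no emoticon at all or conflicting emoticons: leave unlabelled.
        return None
    return "positive" if has_pos else "negative"


def build_training_set(posts):
    """Pair each labelled post with its text after stripping the emoticons."""
    emoticons = POSITIVE | NEGATIVE
    labelled = []
    for post in posts:
        label = label_by_emoticons(post)
        if label is not None:
            # Remove the emoticons so a classifier cannot simply memorise them.
            cleaned = " ".join(t for t in _TOKEN.findall(post) if t not in emoticons)
            labelled.append((cleaned, label))
    return labelled
```

Calling `build_training_set` on a list of forum posts yields `(text, label)` pairs that could then be fed to any classifier. The keyword-based variant mentioned in the abstract could follow the same shape, with keyword lists in place of the emoticon sets.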