Multilingual Sentiment Analysis Using Emoticons and Keywords

Bibliographic Details
Published in 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) Vol. 2; pp. 102-109
Main Authors Solakidis, Georgios S., Vavliakis, Konstantinos N., Mitkas, Pericles A.
Format Conference Proceeding
Language English
Published IEEE 01.08.2014
DOI 10.1109/WI-IAT.2014.86

Abstract The World Wide Web has evolved into a leading channel for communication and information exchange. Since the introduction of the so-called Web 2.0 and the explosion of user-generated content that followed, the amount of data available over the internet has attracted the interest of both the scientific and business communities. Their efforts focus on identifying the inner structure of the data and the knowledge that can be derived by analyzing it. Web 2.0 is the subject of study and research in a number of areas. One of these areas is sentiment analysis, where the main goal is to study and draw conclusions about the subjectivity, polarity and feeling expressed in user-generated content, which mainly consists of free-text documents. The goal of this paper is to apply sentiment analysis to multilingual data, focusing on documents written in Greek. We developed an integrated framework that accepts user-generated documents and identifies the polarity of the text (neutral, negative or positive) and the sentiment expressed through it (joy, love, anger or sadness). We followed a semi-supervised approach, which led to the development of two techniques for the automatic collection of training data without any human intervention. Our approach involves the detection and use of self-defining features that are available within the data. We take into account two emotionally rich features: a) emoticons and b) lists of emotionally intense keywords. These features are evaluated on data from a popular forum, using various classifiers and feature vectors. Our experimental results point to various conclusions about the effectiveness, advantages and limitations of applying such methods to Greek data. Using keywords, we achieved 90% mean accuracy in identifying the subjectivity level and 93% in identifying the polarity level, whereas using emoticons the mean accuracy for these two levels was 74% and 77%, respectively.
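
The automatic collection of training data from emoticons, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: the emoticon lists, the function names `label_by_emoticons` and `build_training_set`, and the choice to strip emoticons from the text after labeling are all assumptions made for illustration. The sketch covers only the positive/negative polarity labels, not the neutral class or the four sentiment classes.

```python
import re

# Hypothetical emoticon lists; the abstract does not give the actual ones.
POSITIVE = {":)", ":-)", ":D", ";)"}
NEGATIVE = {":(", ":-(", ":'("}

# One pattern matching any listed emoticon, longest alternatives first.
_EMOTICON_RE = re.compile(
    "|".join(
        re.escape(e)
        for e in sorted(POSITIVE | NEGATIVE, key=len, reverse=True)
    )
)


def label_by_emoticons(text):
    """Return 'positive', 'negative', or None when there is no clear signal."""
    found = set(_EMOTICON_RE.findall(text))
    has_pos = bool(found & POSITIVE)
    has_neg = bool(found & NEGATIVE)
    if has_pos == has_neg:  # no emoticons at all, or mixed signals
        return None
    return "positive" if has_pos else "negative"


def build_training_set(posts):
    """Keep only unambiguously labeled posts as (text, label) pairs.

    Assumption: emoticons are removed from the text so that a classifier
    trained on the pairs has to rely on the remaining words.
    """
    examples = []
    for post in posts:
        label = label_by_emoticons(post)
        if label is not None:
            cleaned = _EMOTICON_RE.sub("", post).strip()
            examples.append((cleaned, label))
    return examples


posts = ["Great day :)", "So tired :(", "hmm :) :(", "no emoticon here"]
training_set = build_training_set(posts)
```

Posts with no emoticon or with conflicting emoticons are discarded rather than guessed at, which trades coverage for cleaner labels. A keyword-based variant would follow the same shape, with word lists in place of the emoticon sets.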
Author Vavliakis, Konstantinos N.
Mitkas, Pericles A.
Solakidis, Georgios S.
Author_xml – sequence: 1
  givenname: Georgios S.
  surname: Solakidis
  fullname: Solakidis, Georgios S.
  email: gsolakid@auth.gr
  organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
– sequence: 2
  givenname: Konstantinos N.
  surname: Vavliakis
  fullname: Vavliakis, Konstantinos N.
  email: kvavliak@issel.ee.auth.gr
  organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
– sequence: 3
  givenname: Pericles A.
  surname: Mitkas
  fullname: Mitkas, Pericles A.
  email: mitkas@auth.gr
  organization: Dept. of Electr. & Comput. Eng., Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/WI-IAT.2014.86
Discipline Computer Science
EISBN 9781479941438
1479941433
EndPage 109
ExternalDocumentID 6927613
Genre orig-research
IsPeerReviewed false
IsScholarly false
Language English
PageCount 8
ParticipantIDs ieee_primary_6927613
PublicationCentury 2000
PublicationDate 2014-Aug.
PublicationDateYYYYMMDD 2014-08-01
PublicationDate_xml – month: 08
  year: 2014
  text: 2014-Aug.
PublicationDecade 2010
PublicationTitle 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)
PublicationTitleAbbrev WI-IAT
PublicationYear 2014
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 102
SubjectTerms Accuracy
Automatic Collection of Training Data
Emoticons
Forum
Greek
Keywords
Semi Supervised Learning
Sentiment analysis
Support vector machines
Tagging
Training
Training data
Vectors
Title Multilingual Sentiment Analysis Using Emoticons and Keywords
URI https://ieeexplore.ieee.org/document/6927613
Volume 2