Outlier Detection on Semantic Space for Sentiment Analysis With Convolutional Neural Networks

Bibliographic Details
Published in 2018 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors Schmitt, Murilo Falleiros Lemos; Spinosa, Eduardo J.
Format Conference Proceeding
Language English
Published IEEE 01.07.2018
ISSN 2161-4407
DOI 10.1109/IJCNN.2018.8489200

Abstract Sentiment analysis is a text categorization problem that consists of automatically assigning text documents to predefined classes that represent sentiments or a positive/negative opinion about a subject. Machine learning techniques can be used to solve this task. However, to generalize well, these techniques require thorough pre-processing and an appropriate data representation. To address these fundamental issues, this work proposes the use of convolutional neural networks and density-based clustering algorithms. The word representations used in this work were obtained from vectors previously trained in an unsupervised way, known as word embeddings. These representations capture syntactic and semantic information about words, so that similar words are projected closer together in the semantic space. In this scenario, to improve the performance of the convolutional neural network, we propose applying a clustering algorithm in the semantic space to extract additional information from the data. A density-based clustering algorithm was used to detect and remove outliers from the documents to be classified before these documents were used to train the convolutional neural network. We conducted experiments with two different embeddings across three datasets to validate the effectiveness of our method. Results show that removing outliers from documents can slightly improve the accuracy of the model and reduce computational cost for the non-static training approach.
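The outlier-removal step described in the abstract can be illustrated with a minimal sketch. The abstract only says that "a density-based clustering algorithm" was used, so everything below is an assumption made for illustration: the DBSCAN-style noise rule, the function names `dbscan_noise_mask` and `remove_outlier_words`, the parameters `eps` and `min_samples`, and the two-dimensional toy embeddings. It is not the authors' implementation.

```python
import math

# Hypothetical 2-D "embeddings" for illustration only; real word
# embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "good": (1.0, 1.0),
    "great": (1.1, 0.9),
    "nice": (0.9, 1.1),
    "movie": (1.2, 1.2),
    "zebra": (8.0, -7.0),
}


def dbscan_noise_mask(vectors, eps, min_samples):
    """Return a list of booleans, True where a vector is DBSCAN-style noise.

    A point is a core point if at least `min_samples` points (itself
    included) lie within Euclidean distance `eps`. Noise is any point that
    is neither a core point nor within `eps` of a core point.
    """
    n = len(vectors)
    neighbors = [
        [j for j in range(n) if math.dist(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    return [
        not is_core[i] and not any(is_core[j] for j in neighbors[i])
        for i in range(n)
    ]


def remove_outlier_words(tokens, embeddings, eps=1.0, min_samples=3):
    """Drop tokens whose embeddings are noise within this document.

    Tokens without an embedding are kept unchanged (an assumption of this
    sketch). Density is computed over the document's distinct known words
    so that repeated words do not inflate it.
    """
    vocab = list(dict.fromkeys(t for t in tokens if t in embeddings))
    mask = dbscan_noise_mask([embeddings[w] for w in vocab], eps, min_samples)
    noise = {w for w, is_noise in zip(vocab, mask) if is_noise}
    return [t for t in tokens if t not in noise]


if __name__ == "__main__":
    doc = "the movie was good great nice zebra".split()
    print(remove_outlier_words(doc, TOY_EMBEDDINGS, eps=0.5, min_samples=3))
```

In this sketch the filtered token list would then be passed to the classifier in place of the original document; how the authors chose the clustering parameters is not stated in this record.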
Author Schmitt, Murilo Falleiros Lemos
Spinosa, Eduardo J.
Author_xml – sequence: 1
  givenname: Murilo Falleiros Lemos
  surname: Schmitt
  fullname: Schmitt, Murilo Falleiros Lemos
  organization: Department of Informatics, Federal University of Paraná, Curitiba, Brazil
– sequence: 2
  givenname: Eduardo J.
  surname: Spinosa
  fullname: Spinosa, Eduardo J.
  organization: Department of Informatics, Federal University of Paraná, Curitiba, Brazil
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/IJCNN.2018.8489200
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781509060146
1509060146
EISSN 2161-4407
EndPage 8
ExternalDocumentID 8489200
Genre orig-research
GroupedDBID 29I
29O
6IE
6IF
6IH
6IK
6IL
6IM
6IN
AAJGR
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
IPLJI
M43
OCL
RIE
RIL
RIO
RNS
ID FETCH-LOGICAL-i175t-91d46bd48dff5e16729da00f12c9ed1e1157ac03cac42918b6a8725d1606ccef3
IEDL.DBID RIE
IngestDate Wed Aug 27 02:52:48 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i175t-91d46bd48dff5e16729da00f12c9ed1e1157ac03cac42918b6a8725d1606ccef3
PageCount 8
ParticipantIDs ieee_primary_8489200
PublicationCentury 2000
PublicationDate 2018-July
PublicationDateYYYYMMDD 2018-07-01
PublicationDate_xml – month: 07
  year: 2018
  text: 2018-July
PublicationDecade 2010
PublicationTitle 2018 International Joint Conference on Neural Networks (IJCNN)
PublicationTitleAbbrev IJCNN
PublicationYear 2018
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0023993
ssj0002685453
Score 1.7020029
Snippet Sentiment analysis is a text categorization problem that consists in automatically assigning text documents to predefined classes that represent sentiments...
SourceID ieee
SourceType Publisher
StartPage 1
SubjectTerms Clustering algorithms
Convolutional neural networks
Deep Learning
Outlier Detection
Semantics
Sentiment analysis
Task analysis
Training
Title Outlier Detection on Semantic Space for Sentiment Analysis With Convolutional Neural Networks
URI https://ieeexplore.ieee.org/document/8489200
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE