Outlier Detection on Semantic Space for Sentiment Analysis With Convolutional Neural Networks
Published in | 2018 International Joint Conference on Neural Networks (IJCNN) pp. 1 - 8 |
---|---|
Main Authors | Schmitt, Murilo Falleiros Lemos; Spinosa, Eduardo J. |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.07.2018 |
ISSN | 2161-4407 |
DOI | 10.1109/IJCNN.2018.8489200 |
Abstract | Sentiment analysis is a text categorization problem that consists of automatically assigning text documents to predefined classes that represent sentiments or a positive/negative opinion about a subject. Machine learning techniques can be used to solve this task. However, to generalize well, these techniques require thorough pre-processing and an appropriate data representation. To deal with these fundamental issues, this work proposes the use of convolutional neural networks and density-based clustering algorithms. The word representations used in this work were obtained from vectors previously trained in an unsupervised way, known as word embeddings. These representations capture syntactic and semantic information about words, so that similar words are projected closer together in the semantic space. In this scenario, to improve the performance of the convolutional neural network, we propose applying a clustering algorithm in the semantic space to extract additional information from the data. A density-based clustering algorithm was used to detect and remove outliers from the documents to be classified before these documents were used to train the convolutional neural network. We conducted experiments with two different embeddings across three datasets to validate the effectiveness of our method. Results show that removing outliers from documents can slightly improve the accuracy of the model and reduce the computational cost of the non-static training approach. |
---|---|
Author | Schmitt, Murilo Falleiros Lemos; Spinosa, Eduardo J. |
Author affiliations | 1. Schmitt, Murilo Falleiros Lemos: Department of Informatics, Federal University of Paraná, Curitiba, Brazil; 2. Spinosa, Eduardo J.: Department of Informatics, Federal University of Paraná, Curitiba, Brazil |
ContentType | Conference Proceeding |
Discipline | Computer Science |
EISBN | 9781509060146 1509060146 |
EISSN | 2161-4407 |
EndPage | 8 |
ExternalDocumentID | 8489200 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 8 |
PublicationCentury | 2000 |
PublicationDate | 2018-July |
PublicationDateYYYYMMDD | 2018-07-01 |
PublicationDate_xml | – month: 07 year: 2018 text: 2018-July |
PublicationDecade | 2010 |
PublicationTitle | 2018 International Joint Conference on Neural Networks (IJCNN) |
PublicationTitleAbbrev | IJCNN |
PublicationYear | 2018 |
Publisher | IEEE |
StartPage | 1 |
SubjectTerms | Clustering algorithms Convolutional neural networks Deep Learning Outlier Detection Semantics Sentiment analysis Task analysis Training |
Title | Outlier Detection on Semantic Space for Sentiment Analysis With Convolutional Neural Networks |
URI | https://ieeexplore.ieee.org/document/8489200 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV07T8MwELbaTkwFWsRbHhhJmqfjzIWqVGpBKhVdUOXHRVRAikrCwK_nnBcCMSBFSuQhse4c32f7vu8IuYiTELirmOWHkbQCGQhLhsAsnQjfFTHI0DXc4emMjRfBZBkuW-Sy4cIAQJF8BrZ5LM7y9UblZqtswAMeo1fbpI3DrORqNfspHuMIBppZ2FA2_Zok48SDm8lwNjOZXNyu3vKjnEoRTUZdMq37USaRPNt5Jm31-Uui8b8d3SX9b94evWsi0h5pQbpPunXhBlr9xz3yeJtniD239AqyIhUrpXjN4RXNvFZ0jutooIhmsSnNCvl_WouX0Id19kTxex_VmBUv1Ah8FLcio_y9Txaj6_vh2KrqLFhrBA8Zznc6YFIHXCfoOZch3tbCcRLXUzFoF4wej1COr4TC6OVyyQSPvFC7uPhRChL_gHTSTQqHhHoiYlJGPJEAAUP4kcTaF54QgXZAsvCI9Iy1Vm-llMaqMtTx380nZMd4rMyOPSWdbJvDGWKATJ4Xzv8CHnKy1g |
linkProvider | IEEE |
linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1LT8JAEN4gHvSECsa3e_BooY_tsj2jBBTQBIhcDNnHNBK1GCwe_PXOtgWj8WDSpM0e2s3Mdufb3e-bIeQiikMQnuZOEDaVwxSTjgqBOyaWgScjUKFntcP9Ae-M2c0knJTI5VoLAwAZ-Qzq9jE7yzdzvbRbZQ3BRIRe3SCbGPdZmKu11jsqPhfYuJ6HrWgzWMlk3KjRvWkNBpbLJerFe34UVMniSbtC-que5DSS5_oyVXX9-StJ43-7ukNq38o9er-OSbukBMkeqaxKN9DiT66Sx7tliuhzQa8gzchYCcVrCK9o6JmmQ1xJA0U8i01JmhUAoKv0JfRhlj5R_N5HMWrlC7UpPrJbxil_r5Fx-3rU6jhFpQVnhvAhxRnPMK4MEyZG33kcEbeRrht7vo7AeGAz8kjtBlpqjF-eUFyKph8aD5c_WkMc7JNyMk_ggFBfNrlSTRErAMYRgMSRCaQvJTMuKB4ekqq11vQtT6YxLQx19HfzOdnqjPq9aa87uD0m29Z7OVf2hJTTxRJOERGk6iwbCF-oXbYj |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2018+International+Joint+Conference+on+Neural+Networks+%28IJCNN%29&rft.atitle=Outlier+Detection+on+Semantic+Space+for+Sentiment+Analysis+With+Convolutional+Neural+Networks&rft.au=Schmitt%2C+Murilo+Falleiros+Lemos&rft.au=Spinosa%2C+Eduardo+J.&rft.date=2018-07-01&rft.pub=IEEE&rft.eissn=2161-4407&rft.spage=1&rft.epage=8&rft_id=info:doi/10.1109%2FIJCNN.2018.8489200&rft.externalDocID=8489200 |
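
The abstract describes removing outlier words from each document by running a density-based clustering algorithm over their word embeddings before the documents are used to train the network. The record does not name the algorithm or its parameters, so the following is only a minimal sketch under stated assumptions: it uses the DBSCAN definition of noise (a point that is neither a core point nor within `eps` of one), and the function names, the `eps` and `min_pts` values, and the toy 2-D embeddings are all hypothetical illustrations rather than the authors' implementation.

```python
import math


def find_outlier_words(vectors, eps, min_pts):
    """Return indices of vectors that DBSCAN would label as noise."""
    n = len(vectors)
    # eps-neighborhood of every point; a point counts as its own neighbor.
    neighbors = [
        [j for j in range(n) if math.dist(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_pts neighbors (including themselves).
    core = [len(nb) >= min_pts for nb in neighbors]
    # Noise: not a core point and not within eps of any core point.
    return [
        i for i in range(n)
        if not core[i] and not any(core[j] for j in neighbors[i])
    ]


def remove_outliers(tokens, embeddings, eps=0.5, min_pts=3):
    """Drop tokens whose embedding is noise within this document.

    Assumes every token has an entry in `embeddings` (hypothetical lookup).
    """
    vectors = [embeddings[t] for t in tokens]
    noise = set(find_outlier_words(vectors, eps, min_pts))
    return [t for i, t in enumerate(tokens) if i not in noise]


# Toy 2-D "embeddings" (assumed values, for illustration only).
toy = {
    "great": (0.9, 0.8),
    "good": (1.0, 0.9),
    "nice": (0.8, 0.9),
    "movie": (0.9, 1.0),
    "xyzzy": (5.0, -4.0),
}
doc = ["great", "good", "nice", "movie", "xyzzy"]
print(remove_outliers(doc, toy))  # ['great', 'good', 'nice', 'movie']
```

In this sketch the filtered token list would then be passed to the classifier in place of the original document. The pairwise distance computation is quadratic in document length, which is acceptable for short texts; a library implementation with spatial indexing would be the more practical choice for real embeddings.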