Comparison of the TF-IDF Method with the Count Vectorizer to Classify Hate Speech

Bibliographic Details
Published in Engineering, MAthematics and Computer Science (EMACS) Journal Vol. 5; no. 2; pp. 79-83
Main Author Suryaningrum, Kristien Margi
Format Journal Article
Language English
Published 31.05.2023
ISSN 2686-2573
DOI 10.21512/emacsjournal.v5i2.9978

Abstract Hate speech is a form of expression used to spread hatred and to commit acts of violence and discrimination against a person or group of people for various reasons. Hate speech is very common on social media, including Twitter. The goal of this study is to build a system that classifies a tweet as hate speech (HS) or non-hate speech (NONHS). The classifier is a Support Vector Machine, and two feature extraction methods, TF-IDF and Count Vectorizer, are compared on accuracy, precision, recall, and F1-score. Overall, the Support Vector Machine performed better with TF-IDF features than with Count Vectorizer features, reaching 88.77% accuracy, 87.45% precision, 88.77% recall, and an F1-score of 87.81%.
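The comparison described in the abstract can be sketched as follows. This is a minimal illustration and not the author's implementation: it assumes scikit-learn, a linear SVM (`LinearSVC`; the abstract does not state the kernel), default vectorizer settings, and weighted averaging of the metrics. The toy texts and labels are hypothetical stand-ins for the Twitter dataset, which is not reproduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data (assumption): placeholders for labeled tweets.
train_texts = [
    "i hate those people they should disappear",
    "those people are disgusting and worthless",
    "get rid of them all i hate them",
    "what a lovely morning for a walk",
    "enjoying coffee with my friends today",
    "the new library opens next week",
]
train_labels = ["HS", "HS", "HS", "NONHS", "NONHS", "NONHS"]

test_texts = [
    "i hate them they are worthless",
    "lovely coffee with friends this morning",
    "those people should disappear",
    "the library is open today",
]
test_labels = ["HS", "NONHS", "HS", "NONHS"]


def evaluate(vectorizer):
    """Train an SVM on features from `vectorizer` and score it on the test set."""
    # The pipeline fits the vectorizer on training text only, then the SVM.
    model = make_pipeline(vectorizer, LinearSVC())
    model.fit(train_texts, train_labels)
    predicted = model.predict(test_texts)

    accuracy = accuracy_score(test_labels, predicted)
    # Weighted averaging is an assumption; the abstract does not name it.
    precision, recall, f1, _ = precision_recall_fscore_support(
        test_labels, predicted, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Same classifier, two feature extraction methods.
results = {
    "TF-IDF": evaluate(TfidfVectorizer()),
    "Count Vectorizer": evaluate(CountVectorizer()),
}

for name, scores in results.items():
    print(name, {metric: round(value, 4) for metric, value in scores.items()})
```

Weighted-average recall is always equal to accuracy, so the identical accuracy and recall figures reported in the abstract (88.77%) are consistent with weighted averaging, although the record does not confirm which averaging the author used.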
License https://creativecommons.org/licenses/by-sa/4.0
Open Access Link https://journal.binus.ac.id/index.php/EMACS/article/download/9978/4813