The Improvement of Stress Level Detection in Twitter: Imbalance Classification Using SMOTE

Bibliographic Details
Published in 2022 IEEE International Conference on Computing (ICOCO), pp. 294-298
Main Authors Danuri, Mohd Shahrul Nizam Mohd; Rahman, Rohizah Abd; Mohamed, Ibrahim; Amin, Azzan
Format Conference Proceeding
Language English
Published IEEE 14.11.2022
DOI 10.1109/ICOCO56118.2022.10031684

Abstract This study developed a model to improve stress level detection on imbalanced data using the Synthetic Minority Oversampling Technique (SMOTE). SMOTE addresses imbalanced datasets by oversampling the minority class. Data collected from Twitter can seem vague, mainly because of the massive volume of data. This research used a framework consisting of Data, Experts Data Annotation, Text Pre-processing, and Text Representation and Classification. Bag of Words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), and Lemma were used for text representation. The data were collected only from Twitter under certain circumstances. Subject Matter Experts (SMEs) on mental health problems annotated the text of the tweets on four levels: Normal, Mild, Moderate, and Severe. The data group for the Normal stress level was relatively large compared with the other groups. Because of this imbalance, SMOTE was used for data augmentation. The results showed that classification using a Support Vector Machine with SMOTE improved over the baseline: increasing the cardinality of the minority class labels produced significant gains in Macro Avg Recall and Macro Avg F1-Score.
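The record does not describe the authors' implementation, so the following is only a minimal illustrative sketch of the two ideas the abstract relies on: SMOTE-style oversampling (new minority points interpolated between a minority sample and one of its nearest minority neighbours) and macro-averaged recall (the unweighted mean of per-class recall). The function names `smote` and `macro_recall`, the parameters `n_new`, `k`, and `seed`, and the toy vectors are all hypothetical and chosen for illustration; a real pipeline would apply oversampling to the text-representation vectors of the training split only.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    labels = sorted(set(y_true))
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in labels) / len(labels)


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for a small minority class.
    severe = [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]]
    print(len(smote(severe, n_new=4, k=2)))  # 4 synthetic samples

    # A classifier that always predicts the majority class is 80% accurate
    # here, yet its macro recall is only (1.0 + 0.0) / 2 = 0.5.
    y_true = ["Normal"] * 8 + ["Severe"] * 2
    y_pred = ["Normal"] * 10
    print(macro_recall(y_true, y_pred))  # 0.5
```

The second example suggests why macro-averaged metrics are a natural yardstick for this kind of study: plain accuracy can look good while a minority class is never detected, whereas macro recall penalises that directly.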
Author_xml – sequence: 1
  givenname: Mohd Shahrul Nizam Mohd
  surname: Danuri
  fullname: Danuri, Mohd Shahrul Nizam Mohd
  email: msnizam@kuis.edu.my
  organization: Universiti Islam Antarabangsa Selangor,Faculty of Science and Information Technology,Kajang,MALAYSIA
– sequence: 2
  givenname: Rohizah Abd
  surname: Rahman
  fullname: Rahman, Rohizah Abd
  email: rohizah@ukm.edu.my
  organization: Universiti Kebangsaan Malaysia,Faculty of Information Science and Technology,Bangi,MALAYSIA
– sequence: 3
  givenname: Ibrahim
  surname: Mohamed
  fullname: Mohamed, Ibrahim
  email: ibrahim@ukm.edu.my
  organization: Universiti Kebangsaan Malaysia,Faculty of Information Science and Technology,Bangi,MALAYSIA
– sequence: 4
  givenname: Azzan
  surname: Amin
  fullname: Amin, Azzan
  email: azzan@thelorry.com
  organization: The Lorry Online Sdn Bhd,Shah Alam,MALAYSIA
EISBN 1665489960
9781665489966
Genre orig-research
GrantInformation_xml – fundername: Universiti Kebangsaan Malaysia
  funderid: 10.13039/501100004515
PageCount 5
SubjectTerms Analytical models
Annotations
Blogs
Data models
Imbalanced Data
Mental health
Online Social Network (OSN)
SMOTE
Social networking (online)
Stress Level Detection
Support vector machines
URI https://ieeexplore.ieee.org/document/10031684