Classification of Tweets using a Machine Learning and Natural Language Processing Algorithm for Disaster Prediction

Bibliographic Details
Published in 2024 3rd International Conference for Innovation in Technology (INOCON), pp. 1-5
Main Authors Gill, Kanwarpartap Singh; Anand, Vatsala; Upadhyay, Deepak; Dangi, Sarishma
Format Conference Proceeding
Language English
Published IEEE 01.03.2024
Summary: This research investigates the use of machine learning (ML) and natural language processing (NLP) algorithms to classify tweets for disaster prediction. The study uses the large volume of up-to-date social media data available from Twitter to build a reliable model that distinguishes tweets about disasters from those that are not. The proposed technique comprises several key steps: data collection, preprocessing of the collected data, extraction of relevant features, and deployment of several machine learning models. The goal is an effective and precise system that can classify tweets in real time, thereby enhancing early warning systems and disaster management. The efficacy of the model is assessed using precision, recall, and accuracy, which positions it as a useful tool for improving disaster prediction. Concretely, the task is to predict whether a given tweet refers to an actual disaster: the model outputs 1 if it does and 0 if it does not. The outcomes are also presented in the form of Learning Rate and Confusion Matrices.
DOI: 10.1109/INOCON60754.2024.10512145
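
The summary describes a pipeline of preprocessing, feature extraction, classification into 1 (disaster) or 0 (not a disaster), and evaluation with precision, recall, accuracy, and a confusion matrix. The record does not say which features or models the authors used, so the following is only a minimal sketch under stated assumptions: bag-of-words features and a multinomial Naive Bayes classifier are stand-ins chosen for brevity, the function names are hypothetical, and the toy tweets are invented for illustration and are not from the paper's dataset.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lowercase, strip URLs and @mentions, and keep alphabetic tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def train_naive_bayes(tweets, labels):
    """Collect per-class word counts (bag-of-words) and class frequencies."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(tweets, labels):
        word_counts[label].update(preprocess(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return 1 if the tweet is classified as a disaster tweet, else 0."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total)  # log prior
        for token in preprocess(text):
            if token in vocab:  # ignore words never seen in training
                # Laplace (add-one) smoothed log likelihood
                count = word_counts[label][token]
                score += math.log((count + 1) / (n_words + len(vocab)))
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


def evaluate(y_true, y_pred):
    """Compute the confusion matrix plus accuracy, precision, and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual 0, actual 1
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented toy data, for illustration only (1 = disaster, 0 = not a disaster).
train_tweets = [
    "Forest fire near the town, residents evacuated",
    "Earthquake damages buildings downtown http://t.co/x",
    "Flood waters rising, emergency crews respond",
    "I love this new song, it is fire",
    "My exam was a total disaster lol",
    "Great coffee with friends this morning",
]
train_labels = [1, 1, 1, 0, 0, 0]

test_tweets = [
    "Emergency evacuation after earthquake",
    "Coffee and a new song this morning",
]
test_labels = [1, 0]

model = train_naive_bayes(train_tweets, train_labels)
predictions = [predict(model, tweet) for tweet in test_tweets]
metrics = evaluate(test_labels, predictions)
print(predictions, metrics)
```

Note that the training examples deliberately include figurative uses of "fire" and "disaster" with label 0, since separating literal from figurative language is what makes this classification task non-trivial. A real system would need a far larger labeled corpus and a held-out test set before any of the metrics were meaningful.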