We Know You Are Living in Bali: Location Prediction of Twitter Users Using BERT Language Model

Bibliographic Details
Published in Big data and cognitive computing Vol. 6; no. 3; p. 77
Main Authors Simanjuntak, Lihardo Faisal; Mahendra, Rahmad; Yulianti, Evi
Format Journal Article
Language English
Published Basel MDPI AG 01.09.2022
Summary: Twitter user location data provide essential information that can be used for various purposes. However, user location is not easy to identify because many profiles omit this information, or users enter data that do not correspond to their actual locations. Several related works attempted to predict location from English-language tweets. In this study, we attempted to predict the location of users from Indonesian tweets. We utilized machine learning approaches, i.e., long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT), to infer Twitter users’ home locations using the display name in the profile, the user description, and user tweets. By concatenating display name, description, and aggregated tweets, the model achieved the best accuracy of 0.77. The IndoBERT model outperformed several baseline models.
ISSN:2504-2289
DOI:10.3390/bdcc6030077