Indonesian disaster named entity recognition from multi source information using bidirectional LSTM (BiLSTM)

Bibliographic Details
Published in Journal of open innovation Vol. 10; no. 3; p. 100358
Main Authors Shidik, Guruh Fajar; Saputra, Filmada Ocky; Saraswati, Galuh Wilujeng; Winarsih, Nurul Anisa Sri; Rohman, Muhammad Syaifur; Pramunendar, Ricardus Anggi; Kusuma, Edi Jaya; Ratmana, Danny Oka; Venus, Valentijn; Andono, Pulung Nurtantio; Hasibuan, Zainal Arifin
Format Journal Article
Language English
Published Elsevier Ltd 01.09.2024
Elsevier
Summary: Precise logistic support is essential after a disaster occurs. It must be timely, accurate, targeted, and based on existing needs. However, obtaining sufficient and accurate information related to logistic distribution locations remains a key problem. Therefore, implementing Named Entity Recognition (NER) can address this issue. In recent years, news coverage through Indonesian digital news media and social media accounts has emerged as a promising source for building a disaster data corpus. This study implemented NER to extract and identify named entities from text-based information, particularly from Indonesian digital news media. In addition to using regular entities from the NER standard, this study introduced new entities specialized for disaster-related information, including DISASTER, SCALE, SUPPLIES, CASUALTIES, and OUTSIDE. The new disaster corpus in the Indonesian language for the NER model was obtained with an imbalanced dataset composition. To overcome this problem, random oversampling was applied. This study also utilized the BiLSTM model to recognize each entity in new textual information, evaluating its performance when the proposed Indonesian disaster corpus was used as a training reference in the deep learning model. Several optimization algorithms applied in BiLSTM were evaluated. The results showed improved BiLSTM performance using Adam optimization and a balanced corpus. Performance indicators achieved were 93.4 %, 82.4 %, and 87.5 % for precision, recall, and F1-score, respectively. The BiLSTM network captured long-range dependencies in sequential data provided by NER. Oversampling ensured that the proposed NER model could precisely recognize all entities and reduce biased results. Thus, the BiLSTM method can better identify entities in the textual corpus of Indonesian disaster-related online news.
ISSN: 2199-8531
DOI: 10.1016/j.joitmc.2024.100358
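
The summary states that random oversampling was applied to balance the corpus but does not say how. The following is a minimal sketch of one possible sentence-level approach, not the authors' procedure. It assumes that each sentence is a (tokens, tags) pair, that OUTSIDE tokens are ignored when counting, and that sentences containing a rare tag are duplicated at random until that tag is as frequent as the most common one. The function names and the toy sentences are hypothetical.

```python
import random
from collections import Counter


def count_entity_tags(sentences):
    """Count non-OUTSIDE tags across a list of (tokens, tags) sentences."""
    counts = Counter()
    for _tokens, tags in sentences:
        counts.update(t for t in tags if t != "OUTSIDE")
    return counts


def random_oversample(sentences, seed=0):
    """Duplicate randomly chosen sentences until every entity tag occurs
    at least as often as the most frequent tag did in the original data.

    Assumption: balancing is done per sentence, so a duplicated sentence
    also raises the counts of any other tags it contains. The result is
    therefore only approximately balanced.
    """
    rng = random.Random(seed)
    counts = count_entity_tags(sentences)
    if not counts:
        return list(sentences)
    target = max(counts.values())
    result = list(sentences)
    current = Counter(counts)
    # Handle the rarest tags first.
    for tag in sorted(counts, key=counts.get):
        pool = [s for s in sentences if tag in s[1]]
        while current[tag] < target:
            sent = rng.choice(pool)
            result.append(sent)
            current.update(t for t in sent[1] if t != "OUTSIDE")
    return result


# Made-up toy data, only to show the input shape.
toy = [
    (["banjir", "terjadi"], ["DISASTER", "OUTSIDE"]),
    (["banjir", "meluas"], ["DISASTER", "OUTSIDE"]),
    (["gempa", "terasa"], ["DISASTER", "OUTSIDE"]),
    (["warga", "butuh", "selimut"], ["OUTSIDE", "OUTSIDE", "SUPPLIES"]),
]
balanced = random_oversample(toy)
print(count_entity_tags(toy))
print(count_entity_tags(balanced))
```

In the toy data, DISASTER appears three times and SUPPLIES once, so the single SUPPLIES sentence is duplicated until both tags occur three times. Oversampling of this kind would normally be applied to the training split only, so that duplicated sentences do not leak into evaluation.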