Extracting interrelated information from road-related social media data

Bibliographic Details
Published in Advanced engineering informatics Vol. 54; p. 101780
Main Authors Zhou, Shenghua, Thomas Ng, S., Huang, Guanying, Dao, Jicao, Li, Dezhi
Format Journal Article
Language English
Published Elsevier Ltd 01.10.2022
Summary:
• A three-stage model consisting of filtering social media data (SMD), inferring relation types, and recognizing subject and object entities is devised to extract interrelated information from road-related SMD.
• A converter is developed to feed the text-formatted interrelated information extracted from SMD into the virtual road model.
• Compared with existing SMD-based sensing approaches (SMDSAs) of road conditions, the proposed SMDSA eliminates the reliance on human-made grammar rules and achieves data transformation between SMDSAs and virtual road models.
• The interrelated information extraction model based on “BERT + ANN” outperforms commonly used deep learning algorithms, such as Text CNN, Bi-LSTM, Piecewise CNN, classic Transformer, and Capsule Net.

Social media data (SMD) have been viewed as a promising source of information on road conditions. However, most existing SMD-based sensing approaches (SMDSAs) either ignore interrelations among information items (e.g., name, direction, and status of the road) or rely on rigid grammar rules to establish the interrelations among entities. Additionally, current SMDSAs in the transportation domain are unable to link the extracted text-formatted information with domain-specific models (e.g., the virtual road model, VRM). To fill these gaps, this work proposes an improved SMDSA of road conditions, which involves a three-stage (i.e., SMD classification, relation inference, and entity pair recognition) interrelated information extraction model, as well as a semantic converter to feed the SMD-provided text-formatted information into VRMs. The proposed SMDSA is demonstrated on newly annotated datasets of tweets in Lexington, USA. The three-stage interrelated information extraction model outperforms conventional rule-based methods and deep-learning algorithms (e.g., Text CNN, Bi-LSTM, Piecewise CNN, and Capsule Net). The SMD-enabled VRM also preliminarily shows its capacity to optimize signal timings during incidents that change the road network topology. This work contributes to circumventing the reliance on human-made rules during the development of SMDSAs, bridging user-generated SMD with operable VRMs for potential real-world road management, and providing a standard tweet dataset annotated with interrelation triplets to help promote SMDSA studies.
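The data flow of the three stages and the converter can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary describes learned models ("BERT + ANN") that avoid hand-written rules, whereas the keyword lists and regular expression below are toy stand-ins used only to show how a post becomes (subject, relation, object) triplets and then a model update. All function names, the relation labels `has_status` and `has_direction`, the output dictionary layout, and the sample post are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Triplet:
    subject: str   # e.g. a road name
    relation: str  # hypothetical relation label
    obj: str       # e.g. a status or a direction


# Toy vocabularies (assumptions); trained models would replace all of these.
ROAD_TOKENS = {"rd", "road", "st", "street", "ave", "blvd"}
RELATION_CUES = {
    "has_status": ("closed", "blocked", "reopened"),
    "has_direction": ("northbound", "southbound", "eastbound", "westbound"),
}
ROAD_NAME = re.compile(r"((?:[A-Z][a-z]+\s)+(?:Rd|Road|St|Street|Ave|Blvd))\b")


def is_road_related(text: str) -> bool:
    """Stage 1 stand-in: keep only posts that mention a road token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(tok in ROAD_TOKENS for tok in tokens)


def infer_relations(text: str) -> List[str]:
    """Stage 2 stand-in: list the relation types the post appears to express."""
    lowered = text.lower()
    return [rel for rel, cues in RELATION_CUES.items()
            if any(cue in lowered for cue in cues)]


def recognize_pair(text: str, relation: str) -> Optional[Triplet]:
    """Stage 3 stand-in: find the subject and object for one relation type."""
    road = ROAD_NAME.search(text)
    lowered = text.lower()
    obj = next((cue for cue in RELATION_CUES[relation] if cue in lowered), None)
    if road is None or obj is None:
        return None
    return Triplet(road.group(1), relation, obj)


def extract(text: str) -> List[Triplet]:
    """Run the three stages in order and return the triplets found."""
    if not is_road_related(text):
        return []
    pairs = (recognize_pair(text, rel) for rel in infer_relations(text))
    return [t for t in pairs if t is not None]


def to_vrm_update(triplets: List[Triplet]) -> Dict[str, Dict[str, str]]:
    """Converter stand-in: group triplets into per-road attribute updates."""
    update: Dict[str, Dict[str, str]] = {}
    for t in triplets:
        attribute = t.relation[len("has_"):]  # "has_status" -> "status"
        update.setdefault(t.subject, {})[attribute] = t.obj
    return update
```

For the invented post "Main St northbound is closed after a crash.", `extract` yields one status triplet and one direction triplet for "Main St", and `to_vrm_update` merges them into a single per-road record. Inferring the relation type before locating the entity pair mirrors the stage order given in the summary.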
ISSN: 1474-0346, 1873-5320
DOI:10.1016/j.aei.2022.101780