What Is New in Our City? A Framework for Event Extraction Using Social Media Posts

Post streams from public social media platforms such as Instagram and Twitter have become precious but noisy data sources to discover what is happening around us. In this paper, we focus on the problem of detecting and presenting local events in real time using social media content. We propose a nov...

Full description

Saved in:
Bibliographic Details
Published inAdvances in Knowledge Discovery and Data Mining pp. 16 - 32
Main Authors Xia, Chaolun, Hu, Jun, Zhu, Yan, Naaman, Mor
Format Book Chapter
LanguageEnglish
Published Cham Springer International Publishing 2015
SeriesLecture Notes in Computer Science
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Post streams from public social media platforms such as Instagram and Twitter have become precious but noisy data sources to discover what is happening around us. In this paper, we focus on the problem of detecting and presenting local events in real time using social media content. We propose a novel framework for real-time city event detection and extraction. The proposed framework first applies bursty detection to discover candidate event signals from Instagram and Twitter post streams. Then it integrates the two posts streams to extract features for candidate event signals and classifies them into true events or noise. For the true events, the framework extracts various information to summarize and present them. We also propose a novel method that combines text, image and geolocation information to retrieve relevant photos for detected events. Through the experiments on a large dataset, we show that integrating Instagram and Twitter post streams can improve event detection accuracy, and properly combining text, image and geolocation information is able to retrieve more relevant photos for events. Through case studies, we also show that the framework is able to report detected events with low spatial and temporal deviation.
ISBN:3319180371
9783319180373
ISSN:0302-9743
1611-3349
DOI:10.1007/978-3-319-18038-0_2