Scalable framework for AIS data exploration through effective density visualizations

Bibliographic Details
Published in OCEANS 2023 - Limerick pp. 1 - 6
Main Authors Troupiotis-Kapeliaris, Alexandros, Tsili, Eleni, Kaliorakis, Manolis, Spiliopoulos, Giannis, Zissis, Dimitris
Format Conference Proceeding
Language English
Published IEEE 05.06.2023
Summary: With tens of thousands of vessels around the globe transmitting their positions daily, interpreting such large volumes of data is a considerable challenge. Through the Automatic Identification System (AIS), introduced in 2002, the coordinates and status of vessels are continuously reported, with a reporting interval ranging from 3 minutes down to a few seconds depending on their speed. Today, the millions of AIS messages and dozens of gigabytes of new data produced daily allow monitoring the movement of passenger and commercial vessels, as well as more complex activities such as fishing and search-and-rescue operations.

Studying the properties of AIS data and modeling vessel behavior has been the subject of numerous works in the past few years. These attempts to describe vessel activity aim at a better understanding of vessel movement, often through advanced mechanisms for capturing specific types of events. Although such approaches have proven effective in a variety of scenarios, the resulting models are not easily comprehensible to the user, with trained neural networks and many classification models being notable examples. Moreover, although a few recent works extract common vessel routes through historic data analysis, their end results by design include only representative pathways and therefore do not give the full picture of all movement in an area. To overcome these issues, easily interpretable visualizations of movement at sea would provide a clear understanding of vessel behavior and of the occurring trends.

An experimental analysis highlighting the utility of vessel density maps in marine activities was presented in 2015 by Shelmerdine [1], with indicative experiments performed on a limited area around Shetland. A more scenario-specific analysis by Vespe et al. [2] focuses on visualizing the impact of piracy events on transport, while Chen et al. [3] attempted to also include the speed and course information reported in the AIS data in their maps. Furthermore, a framework for creating heat maps through parallel Kernel Density Estimation (KDE) was proposed recently [4]. In that approach, Huang et al. present an efficient pipeline for trajectory compression and visualization in the context of Internet of Things (IoT) applications, with their solution relying on GPU-based acceleration.

In this work, we extend our own MT-AIS-Toolbox [5] and present a scalable and effective tool for handling AIS datasets and visualizing vessel activity. To obtain an efficient and easily configurable solution, a state-of-the-art framework for scalable data processing, namely PySpark, is utilized. The proposed tool is able to manage large volumes of raw AIS messages and produce effective density maps of vessel movement according to the user's configuration and needs. Our approach is split into two separate steps. First, a dedicated mechanism removes unnecessary or erroneous records and limits the dataset to the spatio-temporal constraints of each use case. Then, the density of the area of interest is extracted according to the selected metric, and ready-for-display density maps that depict the vessel traffic are generated. A few options for density metrics are provided (such as the number of different vessels that passed, the time spent in each area, and the number of times vessels passed over an area), and the user can also easily define a function best suited to the desired results. Options for comparing and combining different density maps are also included for a more complete analysis. Indicative experiments on a large real-world trajectory dataset were conducted, highlighting the performance of the proposed framework in terms of execution time. Finally, as an application, the extended tool has been used for data exploration and preparation during the training of machine learning models, as part of VesselAI, an EU-funded project for the digitalization of vessel behavior.
DOI:10.1109/OCEANSLimerick52467.2023.10244698
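
The two-step approach described in the summary (cleaning and spatio-temporal filtering, followed by per-cell density extraction with a selectable metric) can be illustrated with a minimal sketch. It is written in plain Python rather than PySpark so that it stays self-contained, and it is not the MT-AIS-Toolbox API: every name (`AISRecord`, `clean`, `density`, `distinct_vessels`, `message_count`), the bounding-box layout, and the cell size in degrees are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, NamedTuple, Tuple


class AISRecord(NamedTuple):
    """Hypothetical, simplified AIS position report."""
    mmsi: int        # vessel identifier
    timestamp: int   # seconds since epoch
    lon: float       # longitude in degrees
    lat: float       # latitude in degrees


Cell = Tuple[int, int]
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def clean(records: Iterable[AISRecord], bbox: BBox,
          t_start: int, t_end: int) -> List[AISRecord]:
    """Step 1: drop invalid or duplicate records and keep only those
    inside the bounding box and the time window [t_start, t_end)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    seen = set()
    kept = []
    for r in records:
        # Coordinates outside the valid geographic range are erroneous.
        if not (-180.0 <= r.lon <= 180.0 and -90.0 <= r.lat <= 90.0):
            continue
        # Spatial and temporal constraints of the use case.
        if not (min_lon <= r.lon < max_lon and min_lat <= r.lat < max_lat):
            continue
        if not (t_start <= r.timestamp < t_end):
            continue
        # A vessel cannot report twice at the same instant.
        key = (r.mmsi, r.timestamp)
        if key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept


def distinct_vessels(records: List[AISRecord]) -> float:
    """Metric: number of different vessels seen in a cell."""
    return float(len({r.mmsi for r in records}))


def message_count(records: List[AISRecord]) -> float:
    """Metric: number of position reports received in a cell."""
    return float(len(records))


def density(records: Iterable[AISRecord], bbox: BBox, cell_deg: float,
            metric: Callable[[List[AISRecord]], float] = distinct_vessels
            ) -> Dict[Cell, float]:
    """Step 2: group records by grid cell and apply the chosen metric.
    Any function from a list of records to a number can be passed."""
    groups: Dict[Cell, List[AISRecord]] = defaultdict(list)
    for r in records:
        col = int((r.lon - bbox[0]) // cell_deg)
        row = int((r.lat - bbox[1]) // cell_deg)
        groups[(col, row)].append(r)
    return {cell: metric(rs) for cell, rs in groups.items()}
```

A call such as `density(clean(records, bbox, t_start, t_end), bbox, 0.1, metric=message_count)` returns a sparse mapping from grid cell to density value that could then be rendered as a map. Passing the metric as a function mirrors the user-defined metric option mentioned in the summary; in a distributed setting the grouping step would presumably become a group-by-cell aggregation.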