On-line synthesis of parsers for string events


Bibliographic Details
Published in Journal of computer languages (Online) Vol. 62; p. 101022
Main Authors Saffran, João; Barbosa, Haniel; Pereira, Fernando Magno Quintão; Vladamani, Srinivas
Format Journal Article
Language English
Published Elsevier Ltd 01.02.2021
Summary: A string event is the occurrence of a specific pattern in the textual output of a program. The capture and treatment of string events has several applications, such as log anonymization, error handling and user notification. However, there is no systematic approach to identify and treat string events today. This paper formally defines string events and brings forward the theory and practice of a general framework to handle them. The framework encompasses an example-based user interface to specify string patterns plus a grammar synthesizer that allows efficiently parsing such patterns. We demonstrate the effectiveness of this framework by using it to implement Zhefuscator, a system that redacts occurrences of sensitive information in database logs. Zhefuscator is implemented as an extension to the Java Virtual Machine (JVM). It intercepts patterns of interest on-the-fly and does not require interventions in the source code of the protected program. It can infer log formats and capture string events with minimal performance overhead. As an illustration, it is up to 14x faster than an equivalent brute-force approach, converging to a definitive grammar after observing less than 10 examples from typical logs.
ISSN:2590-1184
DOI:10.1016/j.cola.2021.101022
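
Illustration: the summary describes a string event as the occurrence of a pattern in a program's textual output, and log redaction as one way of treating it. The sketch below is only a minimal illustration of that idea, not the paper's method: it uses a fixed, hand-written regular expression where the paper synthesizes a grammar from examples, and it runs outside the JVM. The names `EMAIL_EVENT`, `redact` and `redact_stream`, and the choice of e-mail addresses as the sensitive pattern, are all assumptions made for the example.

```python
import re

# Assumption: the "string event" of interest is an e-mail address in a log line.
# This is a hand-written pattern, not a synthesized grammar.
EMAIL_EVENT = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(line: str, pattern: re.Pattern = EMAIL_EVENT, mask: str = "<REDACTED>") -> str:
    """Replace every occurrence of the pattern in one line of output."""
    return pattern.sub(mask, line)


def redact_stream(lines, pattern: re.Pattern = EMAIL_EVENT):
    """Treat events line by line as output is produced, without buffering the log."""
    for line in lines:
        yield redact(line, pattern)
```

Calling `redact` on a database log line that contains an address returns the same line with the address masked, and lines without a match pass through unchanged.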