Single-pass end-to-end neural decompilation using copying mechanism


Bibliographic Details
Published in Neural computing & applications Vol. 37; no. 7; pp. 5309-5323
Main Authors Szalay, Gergő; Poór, Máté Bálint; Pintér, Balázs; Gregorics, Tibor
Format Journal Article
Language English
Published London Springer London 01.03.2025
Springer Nature B.V
Summary: Traditional decompilers rely on countless hardcoded rules written by subject matter experts, which makes them inflexible. Some recent systems address this using deep learning. The current consensus is that these systems have to include considerable domain knowledge and iterative heuristic components to solve parts of the decompilation problem, particularly the problem of predicting identifiers and literals. In this paper, we present a single-pass end-to-end neural decompilation system that utilizes a copying mechanism. The copying mechanism is able to copy the literals and (offsets of) variables directly from the assembly code, in a single step, as part of the single forward pass through the model. Additionally, we take a further step toward decompiling real-world code by addressing important programming constructs like switch statements, function definitions, and function calls. We compile a dataset of real-world programming competition code and evaluate our model on it. The method achieves a program accuracy of 73% on the hardest complexity level of our generated dataset and 51% on the real-world examples without any additional error correction (EC) techniques, which surpasses the results of previous works without EC.
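
To illustrate the general idea of a copying mechanism mentioned in the summary, the sketch below mixes a decoder's vocabulary distribution with an attention-weighted copy distribution over the source (assembly) tokens, so that a literal absent from the output vocabulary can still be emitted in one step. This follows the generic pointer-generator formulation and is not taken from the paper; the function name, the `p_gen` gate, and the toy tokens and probabilities are all assumptions made for illustration.

```python
from collections import defaultdict


def copy_augmented_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Combine generation and copy probabilities for one decoding step.

    Hypothetical helper (not the paper's implementation):
      p_gen         -- probability of generating from the fixed vocabulary
      vocab_probs   -- dict mapping vocabulary token -> probability (sums to 1)
      attention     -- attention weights over source positions (sums to 1)
      source_tokens -- assembly tokens aligned with the attention weights
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must lie in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have equal length")

    final = defaultdict(float)
    # Generation part: scale the vocabulary distribution by p_gen.
    for token, prob in vocab_probs.items():
        final[token] += p_gen * prob
    # Copy part: each source position contributes its attention mass to
    # the token found there; repeated tokens accumulate.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy example with made-up numbers: the literal "42" is not in the
# output vocabulary, but the attention focuses on it in the assembly.
source_tokens = ["mov", "DWORD", "PTR", "[rbp-4]", ",", "42"]
attention = [0.02, 0.02, 0.02, 0.04, 0.0, 0.9]
vocab_probs = {"int": 0.5, "=": 0.3, "<unk>": 0.2}

dist = copy_augmented_distribution(0.1, vocab_probs, attention, source_tokens)
best = max(dist, key=dist.get)
print(best)
```

In this toy setting the copied literal receives most of the probability mass even though the vocabulary alone could only have produced `<unk>`, which is the property that lets literals and variable offsets be predicted within a single forward pass rather than in a separate recovery stage.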
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-024-10735-9