Labelrepair: Sequence Labelling for Compilation Errors Repair

Bibliographic Details
Published in	2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 860-871
Main Authors	Wu, Zhenyu; Yang, Deheng; Lei, Yan; Xie, Huan; Tang, Minghua; Li, Maojin
Format	Conference Proceeding
Language	English
Published	IEEE 12.03.2024
Subjects	Accuracy; Codes; compilation errors; Decoding; Labeling; Maintenance engineering; program repair; Semantics; sequence labelling; Vocabulary
Summary:	Manually fixing compilation errors can be a tedious and time-consuming task for novice programmers, and even for experienced ones. In recent years, an increasing number of automated repair techniques have been proposed to guide novice programmers and improve the efficiency of software development. Among them, learning-based automated repair techniques have achieved promising results in terms of repair accuracy. However, existing approaches neglect the time efficiency of patch generation and often treat compilation error repair as a neural machine translation task. Decoding in such end-to-end repair models cannot be parallelized during the inference stage and suffers from a redundant decoding search space. Furthermore, the large search space brought by the model poses a potential risk of semantic tampering. To this end, we propose Labelrepair, a novel repair technique that treats the repair of compilation errors as a sequence labelling task. Labelrepair discards the decoding model and converts the search for patches into a search for the error mapping actions between pairs of broken code and patch code. In this way, the search space for patch tokens shrinks from the entire vocabulary to the set of edit action labels, and the logical semantics of the original program are preserved to some extent. The time complexity of inference is reduced from O(n) to O(1) owing to the parallel generation of edit action labels. Through a comprehensive evaluation of Labelrepair on two datasets, we demonstrate that Labelrepair is able to generate patches instantly (0.71 ms on average), which is 28 times faster than existing end-to-end repair models. Compared with existing edit-based repair models, Labelrepair achieves state-of-the-art repair accuracy.
ISSN:	2640-7574
DOI:	10.1109/SANER60148.2024.00094
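
The sequence-labelling idea in the summary can be illustrated with a minimal sketch: each token of the broken program receives one edit action label, and the patch is obtained by applying those labels position by position. The label set (KEEP, DELETE, REPLACE, INSERT_AFTER), the `Edit` class, and the `apply_edits` function are hypothetical names chosen for illustration; the record does not specify the label scheme Labelrepair actually uses. No model is included, so the labels below are written by hand in place of a classifier's predictions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edit:
    """One edit action label for a single token (hypothetical label scheme)."""
    action: str                  # "KEEP", "DELETE", "REPLACE", or "INSERT_AFTER"
    token: Optional[str] = None  # payload for REPLACE / INSERT_AFTER


def apply_edits(tokens: List[str], edits: List[Edit]) -> List[str]:
    """Apply one label per token; each position is handled independently."""
    if len(tokens) != len(edits):
        raise ValueError("expected exactly one edit label per token")
    repaired: List[str] = []
    for tok, edit in zip(tokens, edits):
        if edit.action == "KEEP":
            repaired.append(tok)
        elif edit.action == "DELETE":
            continue
        elif edit.action == "REPLACE":
            repaired.append(edit.token)
        elif edit.action == "INSERT_AFTER":
            repaired.extend([tok, edit.token])
        else:
            raise ValueError(f"unknown edit action: {edit.action}")
    return repaired


# Hand-written labels standing in for model output: add the missing semicolon.
broken = ["int", "x", "=", "5"]
labels = [Edit("KEEP"), Edit("KEEP"), Edit("KEEP"), Edit("INSERT_AFTER", ";")]
repaired = apply_edits(broken, labels)
print(" ".join(repaired))
```

Because every label depends only on its own position, a classifier could in principle predict all labels in one parallel pass, which is the property the summary credits for the inference speed-up. Tokens labelled KEEP are copied unchanged, which is how this style of repair leaves most of the original program untouched.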