CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation

Bibliographic Details
Published in 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 434-445
Main Authors Liu, Mingwei; Yang, Tianyong; Lou, Yiling; Du, Xueying; Wang, Ying; Peng, Xin
Format Conference Proceeding
Language English
Published IEEE 11.09.2023
Summary:Automated code generation has been extensively studied in recent literature. In this work, we first survey 66 participants to motivate a more pragmatic code generation scenario, i.e., library-oriented code generation, where the generated code should implement the functionality of the natural language query with the given library. We then revisit existing learning-based code generation techniques and find they have limited effectiveness in such a library-oriented code generation scenario. To address this limitation, we propose a novel library-oriented code generation technique, CodeGen4Libs, which incorporates two stages: import generation and code generation. The import generation stage generates import statements for the natural language query with the given third-party libraries, while the code generation stage generates concrete code based on the generated imports and the query. To evaluate the effectiveness of our approach, we conduct extensive experiments on a dataset of 403,780 data items. Our results demonstrate that CodeGen4Libs outperforms baseline models in both import generation and code generation stages, achieving improvements of up to 97.4% on EM (Exact Match), 54.5% on BLEU, and 53.5% on Hit@All. Overall, our proposed CodeGen4Libs approach shows promising results in generating high-quality code with specific third-party libraries, which can improve the efficiency and effectiveness of software development.
ISSN:2643-1572
DOI:10.1109/ASE56229.2023.00159
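
Illustrative sketch: The two-stage flow described in the summary (import generation, then code generation conditioned on the generated imports and the query) could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the prompt layouts, the function names, and the stand-in models are all hypothetical, and each stage is represented by any callable that maps a prompt string to generated text.

```python
from typing import Callable, List, Tuple

# Assumption: each stage is any callable that maps a prompt string to text.
# The real system uses trained models; here they are pluggable placeholders.
Generator = Callable[[str], str]


def build_import_prompt(query: str, libraries: List[str]) -> str:
    """Hypothetical prompt layout for stage 1 (import generation)."""
    return "libraries: " + ", ".join(libraries) + "\nquery: " + query


def build_code_prompt(query: str, libraries: List[str], imports: str) -> str:
    """Hypothetical prompt layout for stage 2 (code generation)."""
    return (
        "libraries: " + ", ".join(libraries)
        + "\nimports: " + imports
        + "\nquery: " + query
    )


def generate_library_oriented_code(
    query: str,
    libraries: List[str],
    import_model: Generator,
    code_model: Generator,
) -> Tuple[str, str]:
    """Run stage 1, then feed its output into stage 2."""
    imports = import_model(build_import_prompt(query, libraries))
    code = code_model(build_code_prompt(query, libraries, imports))
    return imports, code


if __name__ == "__main__":
    # Stand-in models that return canned strings, for demonstration only.
    fake_import_model = lambda prompt: "import com.google.common.collect.Lists;"
    fake_code_model = lambda prompt: "return Lists.reverse(items);"

    imports, code = generate_library_oriented_code(
        "reverse a list",
        ["com.google.guava"],
        fake_import_model,
        fake_code_model,
    )
    print(imports)
    print(code)
```

Keeping the two stages as separate callables means each can be swapped or evaluated on its own, which matches the summary's separate reporting of import generation and code generation results.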