An Empirical Study on JIT Defect Prediction Based on BERT-style Model
Main Authors | , , |
---|---|
Format | Journal Article |
Language | English |
Published | 17.03.2024 |
Subjects | |
Summary: | Previous works on Just-In-Time (JIT) defect prediction tasks have primarily
applied pre-trained models directly, neglecting the configurations of their
fine-tuning process. In this study, we systematically examine how the settings
of the fine-tuning process affect BERT-style pre-trained models for JIT defect
prediction. Specifically, we explore the
impact of different parameter freezing settings, parameter initialization
settings, and optimizer strategies on the performance of BERT-style models for
JIT defect prediction. Our findings reveal the crucial role of the first
encoder layer in the BERT-style model and the project sensitivity to parameter
initialization settings. Another notable finding is that the addition of a
weight decay strategy in the Adam optimizer can slightly improve model
performance. Additionally, we compare performance using different feature
extractors (FCN, CNN, LSTM, transformer) and find that a simple network can
achieve great performance. These results offer new insights for fine-tuning
pre-trained models for JIT defect prediction. We combine these findings to obtain
a cost-effective fine-tuning method based on LoRA, which achieves performance
comparable to the original fine-tuning process with only one-third of its
memory consumption. |
---|---|
DOI: | 10.48550/arxiv.2403.11158 |
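
The summary refers to a LoRA-based fine-tuning method. As a rough illustration of why low-rank adaptation is cheaper to train, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning and under a LoRA update. The function names, the hidden size of 768 (as in BERT-base), and the rank of 8 are illustrative assumptions and are not taken from the record or the paper.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix W (d_out x d_in) is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A, where B has shape
    (d_out, rank), A has shape (rank, d_in), and W itself stays frozen."""
    return rank * (d_out + d_in)


# Assumed, illustrative values (not from the paper):
HIDDEN = 768  # hidden size of a BERT-base-sized projection matrix
RANK = 8      # hypothetical LoRA rank

full = full_finetune_params(HIDDEN, HIDDEN)
lora = lora_params(HIDDEN, HIDDEN, RANK)
ratio = lora / full

print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (rank {RANK}): {lora} trainable parameters")
print(f"ratio: {ratio:.4f}")
```

This counts trainable parameters for one matrix only. Total memory use during fine-tuning also includes the frozen weights, activations, and optimizer state, so the ratio computed here should not be read as the one-third memory figure reported in the summary.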