TY - GEN
T1 - Towards Efficient Fine-Tuning of Pre-trained Code Models
T2 - 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2023
AU - Shi, Ensheng
AU - Wang, Yanlin
AU - Zhang, Hongyu
AU - Du, Lun
AU - Han, Shi
AU - Zhang, Dongmei
AU - Sun, Hongbin
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/7/12
Y1 - 2023/7/12
N2 - Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning. We then propose efficient alternatives to fine-tune the large pre-trained code model based on the above findings. Our experimental study shows that (1) lexical, syntactic and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans across the entire model. (2) The process of fine-tuning preserves most of the code properties. Specifically, the basic code properties captured by lower and intermediate layers are still preserved during fine-tuning. Furthermore, we find that only the representations of the top two layers change most during fine-tuning for various downstream tasks. (3) Based on the above findings, we propose Telly to efficiently fine-tune pre-trained code models via layer freezing. The extensive experimental results on five various downstream tasks demonstrate that training parameters and the corresponding time cost are greatly reduced, while performances are similar or better.
AB - Recently, fine-tuning pre-trained code models such as CodeBERT on downstream tasks has achieved great success in many software testing and analysis tasks. While effective and prevalent, fine-tuning the pre-trained parameters incurs a large computational cost. In this paper, we conduct an extensive experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning. We then propose efficient alternatives to fine-tune the large pre-trained code model based on the above findings. Our experimental study shows that (1) lexical, syntactic and structural properties of source code are encoded in the lower, intermediate, and higher layers, respectively, while the semantic property spans across the entire model. (2) The process of fine-tuning preserves most of the code properties. Specifically, the basic code properties captured by lower and intermediate layers are still preserved during fine-tuning. Furthermore, we find that only the representations of the top two layers change most during fine-tuning for various downstream tasks. (3) Based on the above findings, we propose Telly to efficiently fine-tune pre-trained code models via layer freezing. The extensive experimental results on five various downstream tasks demonstrate that training parameters and the corresponding time cost are greatly reduced, while performances are similar or better.
KW - Efficient Fine-tuning
KW - Empirical study
KW - Pre-Trained Language Models
KW - Probing Techniques
KW - Representational Similarity Analysis
UR - https://www.scopus.com/pages/publications/85167697461
U2 - 10.1145/3597926.3598036
DO - 10.1145/3597926.3598036
M3 - 会议稿件
AN - SCOPUS:85167697461
T3 - ISSTA 2023 - Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis
SP - 39
EP - 51
BT - ISSTA 2023 - Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis
A2 - Just, Rene
A2 - Fraser, Gordon
PB - Association for Computing Machinery, Inc
Y2 - 17 July 2023 through 21 July 2023
ER -