TY - GEN
T1 - SoTaNa
T2 - 2nd IEEE/ACM International Conference on AI Foundation Models and Software Engineering, FORGE 2025
AU - Shi, Ensheng
AU - Wang, Yanlin
AU - Zhang, Fengji
AU - Chen, Bei
AU - Zhang, Hongyu
AU - Wang, Yanli
AU - Guo, Daya
AU - Du, Lun
AU - Han, Shi
AU - Zhang, Dongmei
AU - Sun, Hongbin
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Software development plays a crucial role in driving innovation and efficiency in modern societies. To meet the demands of this dynamic field, there is a growing need for an effective software development assistant. However, existing large language models represented by ChatGPT suffer from limited accessibility, including training data and model weights. Although other large open-source models like LLaMA have shown promise, they still struggle with understanding human intent. In this paper, we present SoTaNa, an open-source software engineering instruction-tuned model. SoTaNa utilizes ChatGPT to generate high-quality instruction-based data for the domain of software engineering and employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA. We evaluate the effectiveness of SoTaNa in answering Stack Overflow questions and demonstrate its capabilities. Additionally, we discuss its capabilities in code summarization and generation, as well as the impact of varying the volume of generated data on model performance. Notably, SoTaNa can run on a single GPU, making it accessible to a broader range of researchers. Our code, model weights, and data are publicly available at https://github.com/DeepSoftwareAnalytics/SoTaNa.
KW - Data Generation
KW - Instruction Fine-tuning
KW - Large Language Models
KW - Software Development Assistant
UR - https://www.scopus.com/pages/publications/105011365779
U2 - 10.1109/Forge66646.2025.00010
DO - 10.1109/Forge66646.2025.00010
M3 - Conference contribution
AN - SCOPUS:105011365779
T3 - Proceedings - 2025 IEEE/ACM 2nd International Conference on AI Foundation Models and Software Engineering, FORGE 2025
SP - 26
EP - 37
BT - Proceedings - 2025 IEEE/ACM 2nd International Conference on AI Foundation Models and Software Engineering, FORGE 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 April 2025 through 28 April 2025
ER -