TY - GEN
T1 - Testing Machine Learning Systems in Industry
T2 - 44th ACM/IEEE International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP 2022
AU - Li, Shuyue
AU - Guo, Jiaqi
AU - Lou, Jian Guang
AU - Fan, Ming
AU - Liu, Ting
AU - Zhang, Dongmei
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022/10/17
Y1 - 2022/10/17
AB - Machine learning is becoming increasingly prevalent and integrated into a wide range of software systems. These systems, called ML systems, must be adequately tested to gain confidence that they behave correctly. Although many research efforts have been devoted to testing techniques for ML systems, industrial teams face new challenges in testing ML systems in real-world settings. To draw insights from industry on the problems in ML testing, we conducted an empirical study comprising a survey with 87 responses and interviews with 7 senior ML practitioners from well-known IT companies. Our study uncovers significant industrial concerns about the major testing activities, i.e., test data collection, test execution, and test result analysis, as well as good practices and open challenges from the industry's perspective. (1) Test data collection is conducted in different ways for the ML model, data, and code, and faces different challenges. (2) Test execution in ML systems suffers from two major problems: entanglement among components and regression in model performance. (3) Test result analysis centers on quantitative methods, e.g., metric-based evaluation, combined with qualitative methods based on practitioners' experience. Based on our findings, we highlight research opportunities and provide implications for practitioners.
KW - machine learning
KW - software testing
KW - survey
UR - https://www.scopus.com/pages/publications/85132832318
U2 - 10.1109/ICSE-SEIP55303.2022.9793981
DO - 10.1109/ICSE-SEIP55303.2022.9793981
M3 - Conference contribution
AN - SCOPUS:85132832318
T3 - Proceedings - International Conference on Software Engineering
SP - 263
EP - 272
BT - Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering
PB - IEEE Computer Society
Y2 - 17 May 2022 through 19 May 2022
ER -