TY - JOUR
T1 - An improved data characterization method and its application in classification algorithm recommendation
AU - Wang, Guangtao
AU - Song, Qinbao
AU - Zhu, Xiaoyan
N1 - Publisher Copyright: © 2015, Springer Science+Business Media New York.
PY - 2015/7/2
Y1 - 2015/7/2
N2 - Selecting appropriate classification algorithms for a given data set is important and useful in practice. One of the most challenging issues in algorithm selection is how to characterize different data sets. Recently, we extracted the structural information of a data set to characterize it. Although such characteristics work well in identifying similar data sets and recommending appropriate classification algorithms, the extraction method can only be applied to binary data sets and its performance is not high. Thus, in this paper, an improved data set characterization method is proposed to address these problems. To evaluate the effectiveness of the improved method for algorithm recommendation, the unsupervised learning method EM is employed to build the algorithm recommendation model. Extensive experiments with 17 different types of classification algorithms are conducted on 84 public UCI data sets; the results demonstrate the effectiveness of the proposed method.
AB - Selecting appropriate classification algorithms for a given data set is important and useful in practice. One of the most challenging issues in algorithm selection is how to characterize different data sets. Recently, we extracted the structural information of a data set to characterize it. Although such characteristics work well in identifying similar data sets and recommending appropriate classification algorithms, the extraction method can only be applied to binary data sets and its performance is not high. Thus, in this paper, an improved data set characterization method is proposed to address these problems. To evaluate the effectiveness of the improved method for algorithm recommendation, the unsupervised learning method EM is employed to build the algorithm recommendation model. Extensive experiments with 17 different types of classification algorithms are conducted on 84 public UCI data sets; the results demonstrate the effectiveness of the proposed method.
KW - Classification
KW - Classification algorithm recommendation
KW - Data set characteristics extraction
UR - https://www.scopus.com/pages/publications/84946500645
U2 - 10.1007/s10489-015-0689-3
DO - 10.1007/s10489-015-0689-3
M3 - Article
AN - SCOPUS:84946500645
SN - 0924-669X
VL - 43
SP - 892
EP - 912
JO - Applied Intelligence
JF - Applied Intelligence
IS - 4
ER -