TY - JOUR
T1 - PWLM3-based automatic performance model estimation method for HDFS write and read operations
AU - Tian, Feng
AU - Ma, Tian
AU - Dong, Bo
AU - Zheng, Qinghua
N1 - Publisher Copyright: © 2015 Elsevier B.V. All rights reserved.
PY - 2015/6/5
Y1 - 2015/6/5
N2 - There is a growing need for an automatic performance model estimation method for Hadoop Distributed File System (HDFS) write and read (W/R) operations in order to deal with constant software improvement and updates, parameter configuration changes, hardware heterogeneity, and their Quality of Service (QoS) evaluation. Extant research based on a single linear system model has limited ability to explain the performance variations due to changes in HDFS parameters such as block size. These variations reveal some typical characteristics of nonlinear systems and are an obstacle to achieving effective automatic performance estimation. To deal with this challenge, a piecewise-linear multi-model modeling (PWLM3)-based automatic performance model estimation method is proposed for HDFS W/R performance. In the proposed method, a standard model base is built to standardize the model representation of every submodel. Moreover, a cluster quality assessment strategy is applied to evaluate the optimal number of submodels, and a submodel selection strategy is implemented to construct performance model candidates and improve the computational efficiency of the proposed method. In addition, Levenberg-Marquardt (LM) and Universal Global Optimization (UGO) algorithms are adopted to estimate the values of switch points and identify undetermined parameters of performance model candidates. The performance model is then selected from these candidates according to the Root Mean Squared Error (RMSE) indicator. Experimental results demonstrate that the PWLM3-based performance model provides a good understanding and description of the nonlinear characteristics of HDFS W/R performance and achieves better identification precision than one based on a single linear system model.
KW - Cloud storage service
KW - HDFS
KW - Nonlinear system
KW - Performance modeling
KW - QoS
UR - https://www.scopus.com/pages/publications/84937871301
U2 - 10.1016/j.future.2015.01.011
DO - 10.1016/j.future.2015.01.011
M3 - Article
AN - SCOPUS:84937871301
SN - 0167-739X
VL - 50
SP - 127
EP - 139
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -