PWLM3-based automatic performance model estimation method for HDFS write and read operations

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

An automatic performance model estimation method for Hadoop Distributed File System (HDFS) write and read (W/R) operations is increasingly needed to keep pace with constant software improvements and updates, parameter configuration changes, and hardware heterogeneity, and to support Quality of Service (QoS) evaluation. Existing research based on a single linear system model has limited ability to explain the performance variations caused by changes in HDFS parameters such as block size. These variations exhibit typical characteristics of nonlinear systems and are an obstacle to effective automatic performance estimation. To address this challenge, a piecewise-linear multi-model modeling (PWLM3)-based automatic performance model estimation method is proposed for HDFS W/R performance. In the proposed method, a standard model base is built to standardize the representation of every submodel. A cluster quality assessment strategy is applied to determine the optimal number of submodels, and a submodel selection strategy is implemented to construct performance model candidates and improve the computational efficiency of the method. Levenberg-Marquardt (LM) and Universal Global Optimization (UGO) algorithms are adopted to estimate the values of the switch points and identify the undetermined parameters of the performance model candidates. The performance model is then selected from these candidates according to the Root Mean Squared Error (RMSE) indicator. Experimental results demonstrate that the PWLM3-based performance model describes the nonlinear characteristics of HDFS W/R performance well and achieves better identification precision than a model based on a single linear system.
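The contrast between a single linear model and a piecewise-linear one, compared by RMSE, can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes synthetic data (a made-up block size versus throughput curve with one switch point), fits each segment independently with ordinary least squares, and replaces the LM and UGO estimation of switch points with an exhaustive search over candidate splits. All function names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def rmse(ys, preds):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

def fit_single(xs, ys):
    """Single linear model over the whole range; returns its RMSE."""
    a, b = fit_line(xs, ys)
    return rmse(ys, [a * x + b for x in xs])

def fit_two_segments(xs, ys, min_pts=3):
    """Two linear submodels; try every split and keep the lowest RMSE.

    Returns (rmse, switch_point, left_params, right_params).
    The exhaustive search stands in for LM/UGO (an assumption of this sketch).
    """
    pts = sorted(zip(xs, ys))
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    best = None
    for k in range(min_pts, len(xs) - min_pts + 1):
        left = fit_line(xs[:k], ys[:k])
        right = fit_line(xs[k:], ys[k:])
        preds = [left[0] * x + left[1] for x in xs[:k]]
        preds += [right[0] * x + right[1] for x in xs[k:]]
        err = rmse(ys, preds)
        if best is None or err < best[0]:
            # Place the switch point midway between the two neighbouring samples.
            best = (err, (xs[k - 1] + xs[k]) / 2, left, right)
    return best

# Synthetic data (not from the paper): slope changes between x = 56 and x = 64.
block_sizes = list(range(8, 129, 8))
throughput = [2 * x + 10 if x < 60 else 0.25 * x + 115 for x in block_sizes]

single_err = fit_single(block_sizes, throughput)
two_err, switch, left_params, right_params = fit_two_segments(block_sizes, throughput)

# Select the candidate with the lower RMSE.
chosen = "piecewise" if two_err < single_err else "single"
print(chosen, switch)
```

On this noise-free data the two-segment candidate fits almost exactly, so it is chosen. With real measurements, adding submodels always lowers the training RMSE, which is why the abstract pairs RMSE-based selection with a separate assessment of how many submodels to use.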

Original language: English
Pages (from-to): 127-139
Number of pages: 13
Journal: Future Generation Computer Systems
Volume: 50
State: Published - 5 Jun 2015

Keywords

  • Cloud storage service
  • HDFS
  • Nonlinear system
  • Performance modeling
  • QoS
