跳到主要导航 跳到搜索 跳到主要内容

A general framework for estimating graphlet statistics via random walk

  • Chinese University of Hong Kong
  • University of Science and Technology of China

科研成果: 期刊稿件会议文章同行评审

66 引用 (Scopus)

摘要

Graphlets are induced subgraph patterns and have been frequently applied to characterize the local topology structures of graphs across various domains, e.g., online social networks (OSNs) and biological networks. Discovering and computing graphlet statistics are highly challenging. First, the massive size of real-world graphs makes the exact computation of graphlets extremely expensive. Secondly, the graph topology may not be readily available so one has to resort to web crawling using the available application programming interfaces (APIs). In this work, we propose a general and novel framework to estimate graphlet statistics of "any size". Our framework is based on collecting samples through consecutive steps of random walks. We derive an analytical bound on the sample size (via the Chernoff-Hoeffding technique) to guarantee the convergence of our unbiased estimator. To further improve the accuracy, we introduce two novel optimization techniques to reduce the lower bound on the sample size. Experimental evaluations demonstrate that our methods outperform the state-of-the-art method up to an order of magnitude both in terms of accuracy and time cost.

源语言英语
页(从-至)253-264
页数12
期刊Proceedings of the VLDB Endowment
10
3
DOI
出版状态已出版 - 2016
活动43rd International Conference on Very Large Data Bases, VLDB 2017 - Munich, 德国
期限: 28 8月 20171 9月 2017

学术指纹

探究 'A general framework for estimating graphlet statistics via random walk' 的科研主题。它们共同构成独一无二的指纹。

引用此