Inferring Higher-Order Structure Statistics of Large Networks from Sampled Edges

Research output: Contribution to journalArticlepeer-review

7 Scopus citations

Abstract

Recently exploring locally connected subgraphs (also known as motifs or graphlets) of complex networks attracts a lot of attention. Previous work made the strong assumption that the graph topology of interest is known in advance. In practice, sometimes researchers have to deal with the situation where the graph topology is unknown because it is expensive to collect and store all topological information. Hence, typically what is available to researchers is only a snapshot of the graph, i.e., a subgraph of the graph. Crawling methods such as breadth first sampling can be used to generate the snapshot. However, these methods fail to sample a streaming graph represented as a high speed stream of edges. Therefore, graph mining applications such as network traffic monitoring usually use random edge sampling (i.e., sample each edge with a fixed probability) to collect edges and generate a sampled graph, which we call a ' RESampled graph'. Clearly, a RESampled graph's motif statistics may be quite different from those of the original graph. To resolve this, we propose a framework Minfer, which takes the given RESampled graph and accurately infers the underlying graph's motif statistics. Experiments using large scale datasets show the accuracy and efficiency of our method.

Original languageEnglish
Article number7884989
Pages (from-to)61-74
Number of pages14
JournalIEEE Transactions on Knowledge and Data Engineering
Volume31
Issue number1
DOIs
StatePublished - 1 Jan 2019

Keywords

  • Graphlet
  • graph mining
  • motif
  • subgraph sampling

Fingerprint

Dive into the research topics of 'Inferring Higher-Order Structure Statistics of Large Networks from Sampled Edges'. Together they form a unique fingerprint.

Cite this