Abstract
Recently exploring locally connected subgraphs (also known as motifs or graphlets) of complex networks attracts a lot of attention. Previous work made the strong assumption that the graph topology of interest is known in advance. In practice, sometimes researchers have to deal with the situation where the graph topology is unknown because it is expensive to collect and store all topological information. Hence, typically what is available to researchers is only a snapshot of the graph, i.e., a subgraph of the graph. Crawling methods such as breadth first sampling can be used to generate the snapshot. However, these methods fail to sample a streaming graph represented as a high speed stream of edges. Therefore, graph mining applications such as network traffic monitoring usually use random edge sampling (i.e., sample each edge with a fixed probability) to collect edges and generate a sampled graph, which we call a ' RESampled graph'. Clearly, a RESampled graph's motif statistics may be quite different from those of the original graph. To resolve this, we propose a framework Minfer, which takes the given RESampled graph and accurately infers the underlying graph's motif statistics. Experiments using large scale datasets show the accuracy and efficiency of our method.
| Original language | English |
|---|---|
| Article number | 7884989 |
| Pages (from-to) | 61-74 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 31 |
| Issue number | 1 |
| DOIs | |
| State | Published - 1 Jan 2019 |
Keywords
- Graphlet
- graph mining
- motif
- subgraph sampling