TY - JOUR
T1 - SNOD
T2 - a fast sampling method of exploring node orbit degrees for large graphs
AU - Wang, Pinghui
AU - Zhao, Junzhou
AU - Zhang, Xiangliang
AU - Tao, Jing
AU - Guan, Xiaohong
N1 - Publisher Copyright:
© 2018, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2019/10/1
Y1 - 2019/10/1
N2 - Exploring small connected and induced subgraph patterns (CIS patterns, or graphlets) has recently attracted considerable attention. Despite recent efforts on computing how frequent a graphlet appears in a large graph (i.e., the total number of CISes isomorphic to the graphlet), little effort has been made to characterize a node’s graphlet orbit degree, i.e., the number of CISes isomorphic to the graphlet that touch the node at a particular orbit, which is an important fine-grained metric for analyzing complex networks such as learning functions/roles of nodes in social and biological networks. Like global graphlet counting, it is computationally intensive to compute node orbit degrees for a large graph. Furthermore, previous methods of computing global graphlet counts are not suited to solve this problem. In this paper, we propose a novel sampling method SNOD to efficiently estimate node orbit degrees for large-scale graphs and quantify the error of our estimates. To the best of our knowledge, we are the first to study this problem and give a fast scalable solution. We conduct experiments on a variety of real-world datasets and demonstrate that our method SNOD is several orders of magnitude faster than state-of-the-art enumeration methods for accurately estimating node orbit degrees for graphs with millions of edges.
AB - Exploring small connected and induced subgraph patterns (CIS patterns, or graphlets) has recently attracted considerable attention. Despite recent efforts on computing how frequent a graphlet appears in a large graph (i.e., the total number of CISes isomorphic to the graphlet), little effort has been made to characterize a node’s graphlet orbit degree, i.e., the number of CISes isomorphic to the graphlet that touch the node at a particular orbit, which is an important fine-grained metric for analyzing complex networks such as learning functions/roles of nodes in social and biological networks. Like global graphlet counting, it is computationally intensive to compute node orbit degrees for a large graph. Furthermore, previous methods of computing global graphlet counts are not suited to solve this problem. In this paper, we propose a novel sampling method SNOD to efficiently estimate node orbit degrees for large-scale graphs and quantify the error of our estimates. To the best of our knowledge, we are the first to study this problem and give a fast scalable solution. We conduct experiments on a variety of real-world datasets and demonstrate that our method SNOD is several orders of magnitude faster than state-of-the-art enumeration methods for accurately estimating node orbit degrees for graphs with millions of edges.
KW - Graph mining
KW - Graphlet
KW - Sampling
UR - https://www.scopus.com/pages/publications/85058628619
U2 - 10.1007/s10115-018-1301-z
DO - 10.1007/s10115-018-1301-z
M3 - 文章
AN - SCOPUS:85058628619
SN - 0219-1377
VL - 61
SP - 301
EP - 326
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 1
ER -