Unbiased characterization of node pairs over large graphs

Research output: Contribution to journal, Article, peer-review

6 Scopus citations

Abstract

Characterizing user pair relationships is important for applications such as friend recommendation and interest targeting in online social networks (OSNs). Due to the large-scale nature of such networks, it is infeasible to enumerate all user pairs and thus sampling is used. In this article, we show that it is a great challenge for OSN service providers to characterize user pair relationships, even when they possess the complete graph topology. The reason is that when sampling techniques (i.e., uniform vertex sampling (UVS) and random walk (RW)) are naively applied, they can introduce large biases, particularly for estimating similarity distribution of user pairs with constraints like existence of mutual neighbors, which is important for applications such as identifying network homophily. Estimating statistics of user pairs is more challenging in the absence of the complete topology information, as an unbiased sampling technique like UVS is usually not allowed and exploring the OSN graph topology is expensive. To address these challenges, we present unbiased sampling methods to characterize user pair properties based on UVS and RW techniques. We carry out an evaluation of our methods to show their accuracy and efficiency. Finally, we apply our methods to three OSNs (Foursquare, Douban, and Xiami) and discover that significant homophily is present in these networks.
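
The bias the abstract describes can be illustrated on a toy example. The sketch below is not the article's estimator: it assumes a hypothetical five-vertex graph and one plausible "naive" scheme (pick a vertex of degree at least 2 uniformly, then two of its neighbors uniformly), computes the exact probability of drawing each pair, and compares the expected naive estimate of the mean number of mutual neighbors with the true mean. It then shows that weighting each pair by the inverse of its sampling probability (a standard self-normalized importance-weighting correction, used here as an assumption) recovers the true value in expectation.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy graph as undirected adjacency sets (not data from the article).
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c", "e"},
    "e": {"d"},
}


def naive_pair_probabilities(graph):
    """Exact probability of each pair under the assumed naive sampler:
    pick a vertex with degree >= 2 uniformly, then two distinct
    neighbors of it uniformly."""
    centers = [w for w in graph if len(graph[w]) >= 2]
    probs = {}
    for w in centers:
        pairs = list(combinations(sorted(graph[w]), 2))
        for pair in pairs:
            step = Fraction(1, len(centers) * len(pairs))
            probs[pair] = probs.get(pair, Fraction(0)) + step
    return probs


def mutual_count(graph, pair):
    """Number of neighbors shared by the two vertices of the pair."""
    u, v = pair
    return len(graph[u] & graph[v])


# The keys of PROBS are exactly the pairs with at least one mutual neighbor.
PROBS = naive_pair_probabilities(GRAPH)

# Ground truth: plain average over every qualifying pair.
true_mean = Fraction(sum(mutual_count(GRAPH, p) for p in PROBS), len(PROBS))

# Expected value of the naive estimate (each pair counted as sampled).
naive_mean = sum(q * mutual_count(GRAPH, p) for p, q in PROBS.items())

# Expected weighted sums when each sampled pair is weighted by 1 / probability.
weighted_total = sum(q * (mutual_count(GRAPH, p) / q) for p, q in PROBS.items())
weight_total = sum(q * (1 / q) for q in PROBS.values())
reweighted_mean = weighted_total / weight_total

print(true_mean, naive_mean, reweighted_mean)  # prints: 5/4 3/2 5/4
```

On this toy graph the naive scheme overestimates the mean (3/2 instead of 5/4) because pairs with two mutual neighbors can be reached from two different starting vertices and are therefore drawn more often. In practice the sampling probabilities would have to be computed or estimated from the sampled data rather than by enumeration, which is the setting the article addresses.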

Original language: English
Article number: 22
Journal: ACM Transactions on Knowledge Discovery from Data
Volume: 9
Issue number: 3
State: Published - 1 Apr 2015

Keywords

  • Graph sampling
  • Homophily
  • Random walks
  • Social network
