Abstract
As the amount of noisy, unorganized, linked data on the Internet increases dramatically, how to analyze such data efficiently becomes a challenging research problem. In this paper, we propose a framework, iOLAP, that offers functionalities for analyzing networked data from the Internet, social networks, scientific paper citations, etc. We first identify four main data dimensions that are common to most networked data, namely people, relation, content, and time. Motivated by the fact that the various dimensions of data jointly affect each other, we propose a polyadic factorization approach that directly models all the dimensions simultaneously in a unified framework. We provide a detailed theoretical analysis of the new modeling framework. In addition to the theoretical framework, we also present an efficient implementation of the algorithm that takes advantage of the sparseness of the data and has time complexity linear in the number of data records in a dataset. We then apply the proposed models to analyzing the blogosphere and to personalizing recommendations in paper citations. Extensive experimental studies show that our framework is able to provide deep insights jointly obtained from the various dimensions of networked data.
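To make the idea of a sparse polyadic factorization concrete, the following is a minimal sketch and not the paper's algorithm, which the abstract does not specify. It assumes a generic nonnegative rank-R CP (polyadic) model over the four dimensions (people, relation, content, time), fitted with the standard multiplicative updates for generalized KL divergence. All function names, the toy records, and the rank are hypothetical. The point it illustrates is that every step touches only the observed records, so the cost of one update grows linearly with the number of records.

```python
import numpy as np

def model_at(coords, factors):
    """Evaluate the rank-R polyadic model at the observed index tuples only."""
    prod = np.ones((coords.shape[0], factors[0].shape[1]))
    for mode, F in enumerate(factors):
        prod *= F[coords[:, mode]]          # gather one row per record
    return prod.sum(axis=1)

def kl_loss(coords, vals, factors):
    """Generalized KL divergence, computed without building the dense tensor."""
    xhat = model_at(coords, factors)
    # Total model mass over all cells factorizes into column sums.
    total_mass = np.prod([F.sum(axis=0) for F in factors], axis=0).sum()
    return float(np.sum(vals * np.log(vals / xhat) - vals) + total_mass)

def update_mode(coords, vals, factors, n, eps=1e-12):
    """One multiplicative KL update of factor n (assumed scheme, see lead-in)."""
    ratio = vals / (model_at(coords, factors) + eps)
    rank = factors[0].shape[1]
    others = np.ones((coords.shape[0], rank))
    denom = np.ones(rank)
    for m, F in enumerate(factors):
        if m != n:
            others *= F[coords[:, m]]
            denom *= F.sum(axis=0)
    numer = np.zeros_like(factors[n])
    # Scatter-add per-record contributions; cost is linear in the record count.
    np.add.at(numer, coords[:, n], ratio[:, None] * others)
    factors[n] = factors[n] * numer / (denom + eps)

def fit(coords, vals, dims, rank=2, sweeps=20, seed=0):
    """Alternate updates over all modes; returns factors and the loss history."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((d, rank)) + 0.1 for d in dims]
    losses = [kl_loss(coords, vals, factors)]
    for _ in range(sweeps):
        for n in range(len(dims)):
            update_mode(coords, vals, factors, n)
        losses.append(kl_loss(coords, vals, factors))
    return factors, losses

# Hypothetical toy records: (person, relation, content, time) -> count.
coords = np.array([[0, 0, 0, 0], [0, 1, 1, 0], [1, 0, 0, 1],
                   [1, 1, 2, 1], [2, 0, 3, 0], [2, 1, 1, 1]])
vals = np.array([3.0, 1.0, 2.0, 4.0, 1.0, 2.0])
factors, losses = fit(coords, vals, dims=(3, 2, 4, 2))
print(losses[0], losses[-1])
```

Under these assumptions, each call to `update_mode` costs on the order of (number of records) × (rank) × (number of dimensions), plus a term proportional to the factor sizes for the column sums. The dense people × relation × content × time tensor is never materialized.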
| Original language | English |
|---|---|
| Article number | 4802391 |
| Pages (from-to) | 372-382 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Multimedia |
| Volume | 11 |
| Issue number | 3 |
| State | Published - Apr 2009 |
Keywords
- Information filtering
- knowledge management applications
- modeling structured, textual and multimedia data
- personalization
- web mining