TY - GEN
T1 - A topic model for linked documents and update rules for its estimation
AU - Guo, Zhen
AU - Zhu, Shenghuo
AU - Zhang, Zhongfei
AU - Chi, Yun
AU - Gong, Yihong
PY - 2010
Y1 - 2010
N2 - The latent topic model plays an important role in the unsupervised learning from a corpus, which provides a probabilistic interpretation of the corpus in terms of the latent topic space. An underpinning assumption which most of the topic models are based on is that the documents are assumed to be independent of each other. However, this assumption does not hold true in reality and the relations among the documents are available in different ways, such as the citation relations among the research papers. To address this limitation, in this paper we present a Bernoulli Process Topic (BPT) model, where the interdependence among the documents is modeled by a random Bernoulli process. In the BPT model a document is modeled as a distribution over topics that is a mixture of the distributions associated with the related documents. Although BPT aims at obtaining a better document modeling by incorporating the relations among the documents, it could also be applied to many applications including detecting the topics from corpora and clustering the documents. We apply the BPT model to several document collections and the experimental comparisons against several state-of-the-art approaches demonstrate the promising performance.
UR - https://www.scopus.com/pages/publications/77958590049
M3 - Conference contribution
AN - SCOPUS:77958590049
SN - 9781577354642
T3 - Proceedings of the National Conference on Artificial Intelligence
SP - 463
EP - 468
BT - AAAI-10 / IAAI-10 - Proceedings of the 24th AAAI Conference on Artificial Intelligence and the 22nd Innovative Applications of Artificial Intelligence Conference
PB - AI Access Foundation
T2 - 24th AAAI Conference on Artificial Intelligence and the 22nd Innovative Applications of Artificial Intelligence Conference, AAAI-10 / IAAI-10
Y2 - 11 July 2010 through 15 July 2010
ER -