Hierarchical Relational Models for Document Networks

This paper introduces the Relational Topic Model (RTM) for network data, incorporating both links and node attributes. The study covers data examples, graphical models, generative processes, link probability functions, and empirical results such as automatic link suggestion and spatial data modeling. Evaluations show RTM's improvement in precision over LDA+Regression.


Presentation Transcript


  1. Hierarchical Relational Models for Document Networks • Jonathan Chang and David Blei (Facebook and Princeton University) • The Annals of Applied Statistics, 2010 • Presented by Haojun Chen • Images and some text are from the original paper.

  2. Introduction • Network data have attracted considerable research interest in machine learning and applied statistics. • Previous work focused only on the network structure and ignored the attributes of the nodes. For example, in a citation network of articles, the text and abstracts of the documents should also be used to uncover the latent structure in the data. • In this paper, the Relational Topic Model (RTM) is developed for network data; it accounts for both links and node attributes.

  3. Data Example for RTM

  4. Graphical Model for RTM

  5. Generative Process for RTM
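The slide's figure is not preserved in this transcript. As a rough illustration only, the generative process described in the published RTM paper can be sketched as follows: each document draws topic proportions from a Dirichlet, each word draws a topic assignment and then a word, and each pair of documents draws a binary link from a function of their empirical topic frequencies. The function name, default sizes, the randomly drawn `beta`, `eta`, and `nu`, and the choice of the logistic link are all assumptions made for this sketch, not values from the presentation.

```python
import numpy as np

def sample_rtm_corpus(n_docs=5, n_words=50, n_topics=3, vocab_size=20,
                      alpha=0.5, seed=0):
    """Sample a toy corpus and link matrix from an RTM-style process (sketch)."""
    rng = np.random.default_rng(seed)
    # Topic-word distributions; drawn at random here purely for illustration.
    beta = rng.dirichlet(np.ones(vocab_size), size=n_topics)
    eta = rng.normal(size=n_topics)  # hypothetical link regression coefficients
    nu = -1.0                        # hypothetical link intercept

    docs = []
    z_bar = np.zeros((n_docs, n_topics))
    for d in range(n_docs):
        # Per-document topic proportions.
        theta = rng.dirichlet(alpha * np.ones(n_topics))
        # One topic assignment per word, then one word per assignment.
        z = rng.choice(n_topics, size=n_words, p=theta)
        words = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        docs.append(words)
        # Empirical topic frequencies of the document.
        z_bar[d] = np.bincount(z, minlength=n_topics) / n_words

    # Binary link for each unordered pair of documents (logistic link assumed).
    links = np.zeros((n_docs, n_docs), dtype=int)
    for d in range(n_docs):
        for e in range(d + 1, n_docs):
            score = eta @ (z_bar[d] * z_bar[e]) + nu
            p = 1.0 / (1.0 + np.exp(-score))
            links[d, e] = links[e, d] = rng.binomial(1, p)
    return docs, z_bar, links
```

The point the sketch tries to make visible is that links depend on the same topic assignments that generate the words, which is what couples text and network structure in the model.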

  6. Link Probability Function • Four link probability functions are considered • Φ: CDF of the normal distribution • ∘: Hadamard (element-wise) product
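The formulas themselves did not survive in this transcript. The sketch below follows the four link probability functions as given in the published paper by Chang and Blei (logistic, exponential, probit, and a distance-based variant); it is a reading of that paper, not a copy of the slide. The function names are hypothetical, `z_d` and `z_e` stand for the empirical topic frequencies of two documents, and `eta` and `nu` are the link regression parameters.

```python
import math
import numpy as np

def _score(z_d, z_e, eta, nu):
    # Linear score on the Hadamard (element-wise) product of topic frequencies.
    return float(eta @ (z_d * z_e) + nu)

def link_sigmoid(z_d, z_e, eta, nu):
    # Logistic link: sigma(eta^T (z_d o z_e) + nu).
    return 1.0 / (1.0 + math.exp(-_score(z_d, z_e, eta, nu)))

def link_exp(z_d, z_e, eta, nu):
    # Exponential link: exp(eta^T (z_d o z_e) + nu).
    # This is a valid probability only when the score is non-positive.
    return math.exp(_score(z_d, z_e, eta, nu))

def link_probit(z_d, z_e, eta, nu):
    # Probit link: Phi(eta^T (z_d o z_e) + nu), with Phi the standard normal CDF.
    s = _score(z_d, z_e, eta, nu)
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def link_normal(z_d, z_e, eta, nu):
    # Distance-based link: exp(-eta^T ((z_d - z_e) o (z_d - z_e)) - nu).
    diff = z_d - z_e
    return math.exp(-float(eta @ (diff * diff)) - nu)
```

The first three variants reward documents that put mass on the same topics, while the last one penalizes the squared difference between the two topic-frequency vectors.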

  7. Model Inference, Estimation and Prediction • Variational inference for the latent variables • Maximum likelihood estimates for the model parameters • Prediction • Link prediction from words • Word prediction from links

  8. Empirical Results • Data summary • Three experiments • Evaluating the predictive distribution • Automatic link suggestion • Modeling spatial data

  9. Evaluating Predictive Distribution (1/2) Lower is Better

  10. Evaluating Predictive Distribution (2/2)

  11. Automatic Link Suggestion (1/3) • Citation suggestion: suggest citations for a document given its abstract • Cora dataset; the number of topics is set to 10 • RTM improves precision over LDA+Regression by 80% among the first 20 documents retrieved from the model
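Precision over the first 20 retrieved documents is commonly computed as precision at k. The following is a minimal sketch of that metric, assuming a ranked list of suggested document ids and a set of truly cited ids; the function name and arguments are hypothetical and are not taken from the presentation or the paper.

```python
def precision_at_k(ranked_ids, relevant_ids, k=20):
    """Fraction of the top-k suggested documents that are actually cited."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Dividing by k (not by len(top_k)) counts missing suggestions as misses.
    return hits / k
```

For example, with suggestions `[5, 2, 9, 7]`, true citations `{2, 7, 8}`, and `k=4`, two of the four suggestions are correct, so the precision is 0.5.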

  12. Automatic Link Suggestion (2/3)

  13. Automatic Link Suggestion (3/3)

  14. Modeling Spatial Data (1/4) • Local news data: 51 documents, one for each state • The number of topics is set to 5 • Words are ranked by the following score:

  15. Modeling Spatial Data (2/4) • Each color depicts a single topic. Each state’s color intensity indicates the magnitude of that topic’s component. • Corresponding words associated with each topic are given in the table. RTM LDA

  16. Modeling Spatial Data (3/4) RTM LDA

  17. Modeling Spatial Data (4/4) RTM LDA

  18. Discussion • The Relational Topic Model (RTM) is a hierarchical model of networks and per-node attribute data. • It is demonstrated qualitatively and quantitatively that RTM is an effective and useful mechanism for analyzing and using network data.
