Cascading Behavior in Large Blog Graphs: Patterns and a Model. Leskovec et al. (SDM 2007). Why? Temporal Aspects: How does information spread in a social network? How does popularity die? Linearly, exponentially, or …? Topological Aspects: Do information cascades have common structures?
Cascading Behavior in Large Blog Graphs: Patterns and a Model Leskovec et al. (SDM 2007)
Why? • Temporal Aspects • How does information spread in a social network? • How does popularity die? Linearly, exponentially, or …? • Topological Aspects • Do information cascades have common structures? • What are their properties, such as size distribution?
Preliminaries • Trivial vs. Non-trivial Cascades • Cascade Initiator • Stars and Chains • Connector nodes
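The terms above can be made concrete with a small sketch. It assumes a cascade is given as a list of posts plus (citing_post, cited_post) link pairs, that the cascade is connected and acyclic, and that a "connector" is a post with both inlinks and outlinks. The function name and these definitions are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict

def classify_cascade(nodes, edges):
    """Classify one cascade; edges are (citing_post, cited_post) pairs.

    Assumes the cascade is connected and acyclic (illustrative sketch).
    """
    in_deg = defaultdict(int)   # how many posts link to a post
    out_deg = defaultdict(int)  # how many posts a post links to
    for citing, cited in edges:
        out_deg[citing] += 1
        in_deg[cited] += 1

    # Assumed definition: an initiator links to no other post in the cascade.
    initiators = [v for v in nodes if out_deg[v] == 0]
    # Assumed definition: a connector has both inlinks and outlinks.
    connectors = [v for v in nodes if in_deg[v] > 0 and out_deg[v] > 0]

    n = len(nodes)
    is_tree = len(edges) == n - 1
    if n == 1:
        shape = "trivial"
    elif is_tree and all(in_deg[v] <= 1 and out_deg[v] <= 1 for v in nodes):
        shape = "chain"  # a two-post cascade is labelled a chain here
    elif is_tree and any(in_deg[v] == n - 1 for v in nodes):
        shape = "star"
    else:
        shape = "other"
    return shape, initiators, connectors
```

For example, three posts that all link to one original post form a star whose initiator is the original post.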
Dataset • 21.3 million posts, 2.5 million blogs from Aug and Sep 2005 • Start with most cited blog posts in Aug’05 • Traversed conversations forward (inlinks) and backward (outlinks) • Max depth = 100; max breadth = 500 • Collected • Unique post ID • Blog URL • Post Permalink • Post Date • Post Content • Post Links
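A rough sketch of the bounded traversal described above, written as a breadth-first search. The slide does not say how the breadth limit was applied; this sketch assumes it caps the number of links expanded per post. The `inlinks` and `outlinks` dictionaries and the function name are hypothetical.

```python
from collections import deque

def collect_conversation(seed, inlinks, outlinks, max_depth=100, max_breadth=500):
    """Collect posts reachable from a seed post, following links both ways.

    inlinks[p]  : posts that link to p (forward traversal)
    outlinks[p] : posts that p links to (backward traversal)
    max_breadth is assumed to cap the links expanded per post.
    """
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        post, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        neighbours = list(inlinks.get(post, [])) + list(outlinks.get(post, []))
        for nxt in neighbours[:max_breadth]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen
```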
Temporal Patterns How does popularity die?
Blog Network Topology Popular blogs that receive many inlinks do not necessarily sprout many outlinks.
Post Network Topology 98% of the posts are isolated
Topological Patterns Common Cascade Shapes (G_r has frequency rank r) 97% are trivial cascades
Topological Patterns Cascade Size Distribution
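A size distribution like the one on this slide can be tabulated by counting how many cascades have each size. If the distribution is heavy-tailed, it looks roughly linear on log-log axes, and a least-squares slope gives a crude estimate of the exponent. The sketch below is illustrative only: the function names are hypothetical, and a plain log-log fit is a simplification rather than the authors' fitting procedure.

```python
import math
from collections import Counter

def size_distribution(cascade_sizes):
    """Return sorted (size, number_of_cascades) pairs."""
    return sorted(Counter(cascade_sizes).items())

def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(size).

    Needs at least two distinct sizes; a crude estimate only.
    """
    xs = [math.log(size) for size, _ in distribution]
    ys = [math.log(count) for _, count in distribution]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```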
Observations • Most cascades follow tree-like structures. • A linear increase in diameter requires an exponential increase in cascade size. • The probability that a node will be part of a cascade decreases with the number of cascades it is already part of.
Generative Model • Susceptible-Infected-Susceptible (SIS) Model • β: “infection probability” of a post • A blog can be either “infected” or “susceptible”
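A minimal sketch of an SIS-style cascade simulation on a blog network, assuming a round-based process: each infected blog tries to infect each currently uninfected neighbour with probability β, then becomes susceptible again. The `neighbours` adjacency dictionary, the `max_rounds` cap (added so the loop always terminates), and the function name are assumptions for illustration; the slide does not specify these details.

```python
import random

def simulate_sis_cascade(neighbours, beta, start, max_rounds=50, rng=None):
    """Simulate one cascade; return (infected_blog, infecting_blog) edges."""
    rng = rng or random.Random()
    infected = {start}
    cascade_edges = []
    for _ in range(max_rounds):
        if not infected:
            break  # the cascade has died out
        newly_infected = set()
        for u in sorted(infected):
            for w in neighbours.get(u, ()):
                if w in infected or w in newly_infected:
                    continue  # only uninfected blogs can be infected
                if rng.random() < beta:
                    newly_infected.add(w)
                    cascade_edges.append((w, u))
        # Infected blogs return to the susceptible state after one round.
        infected = newly_infected
    return cascade_edges
```

Because blogs become susceptible again, the same blog can appear in a cascade more than once. Running the simulation many times and collecting the resulting edge lists gives synthetic cascades whose shapes and sizes can be compared with the observed ones.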
Summary • Temporal patterns • Topological patterns • Generative model
Food for thought • Blogs are sparsely linked, and few posts link to the original post from which they took their content. How can information diffusion be studied in these scenarios? • Beyond link analysis • A uniform infection probability is an unrealistic assumption • Multiple cascades may initiate simultaneously • Few studies address the “tipping point” in cascades • Does a cascade die a natural death, or does some factor affect its lifespan?
[Figure: backward (outlink) and forward (inlink) traversal of posts across times T-1, T, T+1]