
Contexts-As-Clustering: Making Sense of Social Contexts from Low-level Sensory Data

Contextual and Social Media Understanding and Usage. Contexts-As-Clustering: Making Sense of Social Contexts from Low-level Sensory Data. Dinh Phung, Curtin University of Technology, Australia (joint work with Brett Adams and Svetha Venkatesh).


Presentation Transcript


  1. Contextual and Social Media Understanding and Usage • Contexts-As-Clustering: Making Sense of Social Contexts from Low-level Sensory Data • Dinh Phung, Curtin University of Technology, Australia (joint work with Brett Adams and Svetha Venkatesh)

  2. Motivation • Contexts provide fundamental units for context-aware applications • But what sorts of context? How can they be extracted? • Sparseness problem: • in users’ behaviours: devices powered on/off inconsistently • during data collection: signal loss, measurement errors • in the structure of social activity

  3. Approaches • Non-parametric clustering: DBSCAN, Affinity Propagation • Scales well with data size, can handle online and incremental data • Robust to outliers and noise, easy to incorporate constraints • Applications: • Extraction of significant places (social functions) from GPS data • Social rhythms as a combinatorial clustering process • Probabilistic clustering: latent Dirichlet allocation • Can jointly model statistical strengths across multiple modalities, co-occurrences, dynamic behaviours, temporal behaviours, group membership, dyadic data, etc. • Applications: • Computable high-order patterns from location data • Topic inference in the blogosphere

  4. Case 1: Locations from GPS • Minor signal noise from fix tolerance and lag • Major signal noise from signal loss • Usage inconsistency

  5. Locations • Preprocessing • Removal of points above a speed threshold • Often missing precisely the samples we want! (e.g. buildings) • Interpolation within a day and across days
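The speed-threshold step on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the sample layout `(timestamp_seconds, lat, lon)`, the function names, and the default threshold of 1.5 m/s are all assumptions made for the example.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def drop_fast_points(samples, max_speed_mps=1.5):
    """Keep samples whose speed relative to the previous raw sample is low.

    samples: list of (timestamp_seconds, lat, lon), sorted by time.
    max_speed_mps is a hypothetical threshold chosen for illustration.
    """
    if not samples:
        return []
    kept = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        dt = cur[0] - prev[0]
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = haversine_m(prev[1], prev[2], cur[1], cur[2]) / dt
        if speed <= max_speed_mps:
            kept.append(cur)
    return kept
```

The remaining slow points are the candidates for stays. Gaps left by signal loss would still need the interpolation step, which this sketch does not attempt.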

  6. Density-based clustering • Clustering using DBSCAN • Handles arbitrary cluster shapes (GPS is trajectory data) • No initialization required and is deterministic • Excludes noise, outliers and abnormal points • Incremental version: • [Figure: neighbourhood radius ε around points p and q in data set D, illustrating "directly density reachable", "density reachable" and "maximally density connected"]
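For readers unfamiliar with DBSCAN, the batch algorithm can be sketched in a few lines of plain Python. This is an illustrative version only, not the incremental variant mentioned on the slide. It assumes points are already projected to planar coordinates (for example metres), and the names `dbscan`, `region_query`, `eps` and `min_pts` are chosen for this example.

```python
def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if (x - xi) ** 2 + (y - yi) ** 2 <= eps ** 2]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels
```

Calling `dbscan(points, eps=1.5, min_pts=3)` on two tight groups of points plus one distant point labels the groups 0 and 1 and the distant point -1. The quadratic neighbour search is kept for clarity; a spatial index would be needed for millions of GPS samples.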

  7. Results – stays

  8. Results: co-locations

  9. Applications • Many more super-contexts can be derived: • social ties, co-location patterns, a rough measure of the entropy of daily life, or implications for social relationships. • Socially collaborative inference! • Social Context-Aware Media Browsing:
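One way to read the "rough measure of entropy" mentioned on this slide is the Shannon entropy of how a person's time is split across places. The sketch below follows that reading, which is an assumption; the slides do not define the measure. The `(place, duration)` input format and the function name are hypothetical.

```python
import math
from collections import Counter

def place_entropy(visits):
    """Shannon entropy in bits of the share of time spent at each place.

    visits: list of (place_label, duration) pairs; duration in any fixed unit.
    """
    totals = Counter()
    for place, duration in visits:
        totals[place] += duration
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return -sum((t / total) * math.log2(t / total)
                for t in totals.values() if t > 0)
```

A day spent entirely at one place gives 0 bits, while a day split evenly between two places gives 1 bit, so higher values indicate a less predictable day.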

  10. Case 2: Rhythms Extraction Social rhythms: a complex set of projections of repeated occurrences onto the dimensions of people, place and time. Extraction of rhythms as a combinatorial clustering process by folding in certain dimensions!

  11. Rhythms extraction • Rare vs. frequent rhythms • Function of (normalized) time experienced at a place • Timed vs. optional rhythms • Social aspect of punctuality • Relational rhythms • People-based clustering

  12. Rhythms extraction • Relational

  13. Jive: integrate rhythms inside blogs

  14. Case 3: Computable Patterns • Repeated patterns found in daily activities over time. • Driven by a social theme, a cognitive aspect of mind: • needing to go to work every day, picking up children every Tuesday • going to church on Sunday, going to the gym to keep fit, … • Go beyond simple counts: the order of activities over time is required to derive patterns. • Challenge: need to exploit, and cluster on, n-gram-style order statistics.
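The "n-gram-style order statistics" mentioned above can be illustrated by counting ordered runs of activities within each day. This is a minimal sketch; representing a day as a list of activity labels, and the function names, are assumptions made for the example.

```python
from collections import Counter

def ngrams(sequence, n):
    """All contiguous runs of length n in sequence, as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

def ngram_counts(days, n=2):
    """Count n-grams across days; each day is an ordered list of labels."""
    counts = Counter()
    for day in days:
        counts.update(ngrams(day, n))  # n-grams never span two days
    return counts
```

Unlike a plain bag of labels, these counts distinguish "home then work" from "work then home", which is the ordering information the slide asks for.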

  15. Approach Translate social data into ‘text’ documents and use Bayesian document modelling tools!

  16. Social codebook • Social footprint = <start time, duration, location label> • Translate each footprint into a code • start time = 1,2,3,…, 24 (24 hours) • duration = short, medium, long • location label = {set of unique names} • Each day is translated into a document = social page • A collection of social pages = social corpus
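The translation from footprints to social pages might look roughly like the sketch below. The duration cut-offs (30 and 180 minutes), the underscore-joined code format, and all function names are assumptions for illustration; the slides give only the three duration labels and the 1 to 24 start-hour range.

```python
def duration_bin(minutes, short_max=30, long_min=180):
    """Map a duration in minutes to short / medium / long (cut-offs assumed)."""
    if minutes < short_max:
        return "short"
    if minutes >= long_min:
        return "long"
    return "medium"

def encode_footprint(start_hour, minutes, location):
    """Turn <start time, duration, location label> into one code string."""
    return f"{start_hour:02d}_{duration_bin(minutes)}_{location}"

def build_corpus(footprints_by_day):
    """Return (corpus, codebook).

    corpus maps each day to its social page, an ordered list of codes;
    codebook maps each distinct code to an integer id.
    """
    codebook = {}
    corpus = {}
    for day, footprints in footprints_by_day.items():
        page = []
        for start_hour, minutes, location in footprints:
            code = encode_footprint(start_hour, minutes, location)
            codebook.setdefault(code, len(codebook))
            page.append(code)
        corpus[day] = page
    return corpus, codebook
```

For example, a footprint starting at hour 9, lasting 480 minutes at a place labelled `office`, becomes the code `09_long_office` under these assumed cut-offs.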

  17. Latent Social Dirichlet Allocation (LSDA) • Extends latent Dirichlet allocation (LDA, Blei ’03) to generate n-grams rather than single words. • Can be viewed as (i) a clustering method, or (ii) a discrete dimensionality reduction method: • a word = a social code • a topic = a distribution over n-grams of codes • Inference can be done efficiently with Gibbs sampling. • A personalized version of the N-gram Topic Model (Wang ’07)
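As background, the generative process of standard (smoothed) LDA, which LSDA extends, can be written as below. This shows only the unigram base model; the n-gram extension is not specified in the slides and is not reproduced here. The symbols K (number of topics), D (number of social pages), and the hyper-parameters α and β follow common LDA notation and are assumptions relative to the slides.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K
  && \text{(topic: distribution over social codes)} \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D
  && \text{(topic mixture of social page } d\text{)} \\
z_{d,i} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d)
  &&&& \text{(topic of the } i\text{-th code on page } d\text{)} \\
w_{d,i} \mid z_{d,i}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,i}})
  &&&& \text{(the observed social code)}
\end{align*}
```

Dirichlet hyper-parameters well below 1 concentrate each draw on a few components, so each page uses few topics and each topic favours few codes.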

  18. Experiments • Data collected over 1.5 years, 8 subjects, 10+ million GPS samples • noisy, fragmented and very sparse • Family, friends and workmates • Exhibits all types of noise • 1709 footprints found, codebook size = 381 • Small hyper-parameters (<< 1) to favour discriminative patterns and themes.

  19. Top themes and patterns WEEKDAYS

  20. Top themes and patterns WEEKENDS

  21. Conclusion • What is context? • Traditional view: context as a representation problem (Dourish ’04) • Context is a form of information • Context is delineable • Context is stable • Context and activities (contents) are separable • Dourish’s view: context as an interaction problem! • Contexts possess dynamic properties • Contexts are driven by contents and vice versa • Where to proceed from here? • Is context domain-specific? • Is there a unified framework?

  22. Rhythms extraction • Ranked timed rhythms
