Recommending Twitter Users to Follow Using Content and Collaborative Filtering Approaches

John Hannon, Mike Bennett, Barry Smyth. CLARITY Centre for Sensor Web Technologies, University College Dublin.


Presentation Transcript


  1. Recommending Twitter Users to Follow Using Content and Collaborative Filtering Approaches. John Hannon, Mike Bennett, Barry Smyth, CLARITY Centre for Sensor Web Technologies, University College Dublin

  2. Outline • 1. Problem • 2. Related Work & Innovation • 3. Method & Experiment • 4. Result & Analysis

  3. Problem • The paper addresses an important recommendation problem: for a given user, UT, which other users might be recommended as followers/followees, based on a large dataset of Twitter users and their tweets? • The motivation of the paper is to demonstrate the potential for effective and efficient followee recommendation.

  4. Related Work • Analysis of Twitter's real-time data. • Kwak et al.: reciprocity and homophily among Twitter users, information diffusion. • User-generated content, such as reviews, used as an additional source in recommender systems. • User-generated movie reviews from IMDb used as part of a movie recommender system. • Research to help users find and contact people online. • Information such as co-authorship is used to identify similar users. • Freyne and Geyer et al. have done much work on relationship building. • Making recommendations to new users during their sign-up process. • Recommending topics for self-descriptions in online user profiles.

  5. Innovation • Twitter's potential as a powerful source of profiling data. This is a novel take on profiling and recommendation in itself. • Focus on noisy, unstructured micro-blogging data. • The paper's novel contribution is showing that, noisy as Twitter data is, it can still provide a useful recommendation signal.

  6. Twittomender

  7. Approach • How users are profiled: • Content-based techniques, which rely on the content of tweets. • Collaborative filtering approaches, based on the followees and followers of users. • How these profiles can be used to suggest interesting users to follow. • The Lucene platform is used to develop the framework.

  8. Profiling Users on Twitter • 5 basic profiling strategies: (1) Representing users by their own tweets (tweets(UT)); (2) By the tweets of their followees (followeetweets(UT)); (3) By the tweets of their followers (followertweets(UT)); (4) By the ids of their followees (followees(UT)); (5) By the ids of their followers (followers(UT)).
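The five profile sources listed on this slide can be sketched over a toy follow graph. This is only an illustration: the dictionaries `TWEETS` and `FOLLOWS`, the user names, and the function names are hypothetical and are not taken from the paper or from the Twitter API.

```python
from typing import Dict, List, Set

# Hypothetical toy data (not from the paper): each user's own tweets,
# and the set of user ids that each user follows.
TWEETS: Dict[str, List[str]] = {
    "alice": ["lucene indexing tips"],
    "bob": ["recommender systems paper"],
    "carol": ["tf-idf explained"],
}
FOLLOWS: Dict[str, Set[str]] = {
    "alice": {"bob"},
    "bob": {"carol"},
    "carol": {"alice", "bob"},
}


def tweets(user: str) -> List[str]:
    """Strategy 1: the user's own tweets."""
    return list(TWEETS.get(user, []))


def followees(user: str) -> Set[str]:
    """Strategy 4: ids of the users that `user` follows."""
    return set(FOLLOWS.get(user, set()))


def followers(user: str) -> Set[str]:
    """Strategy 5: ids of the users that follow `user`."""
    return {u for u, followed in FOLLOWS.items() if user in followed}


def followee_tweets(user: str) -> List[str]:
    """Strategy 2: tweets written by the user's followees."""
    return [t for u in sorted(followees(user)) for t in tweets(u)]


def follower_tweets(user: str) -> List[str]:
    """Strategy 3: tweets written by the user's followers."""
    return [t for u in sorted(followers(user)) for t in tweets(u)]
```

In the content-based sources the profile is built from words, while in the two id-based sources the followee or follower ids themselves play the role of terms.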

  9. Indexing & Recommendation • Using Lucene's indexing features, each user, UT, is represented as a weighted term vector, profile(UT, source). • profile(UT, source) = {w1,…,wn} • Term weighting function: TF-IDF. • Query-based retrieval and profile-based recommendation are then implemented using Lucene's standard retrieval function, with the target user's profile document serving as the search query in the case of the latter.
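A minimal sketch of the idea on this slide, written in plain Python rather than Lucene. It assumes a simple TF-IDF variant (raw term count times log(N / document frequency)) and cosine similarity for ranking; Lucene's actual scoring function differs in its details, so this is an approximation of the technique, not the system's implementation. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_profiles(docs: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Turn each user's term list into a TF-IDF weighted term vector."""
    n_docs = len(docs)
    doc_freq: Counter = Counter()
    for terms in docs.values():
        doc_freq.update(set(terms))  # count each term once per user
    profiles = {}
    for user, terms in docs.items():
        term_freq = Counter(terms)
        profiles[user] = {
            t: count * math.log(n_docs / doc_freq[t])
            for t, count in term_freq.items()
        }
    return profiles


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target: str, profiles: Dict[str, Dict[str, float]], k: int = 10) -> List[str]:
    """Use the target's profile as the query and return the top-k similar users."""
    query = profiles[target]
    scored = [
        (user, cosine(query, vec))
        for user, vec in profiles.items()
        if user != target
    ]
    scored = [(u, s) for u, s in scored if s > 0.0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # best score first, ties by id
    return [user for user, _ in scored[:k]]


docs = {
    "ut": ["python", "search", "search"],
    "a": ["search", "lucene"],
    "b": ["cooking", "recipes"],
    "c": ["python", "cooking"],
}
profiles = tf_idf_profiles(docs)
top = recommend("ut", profiles, k=2)  # ["a", "c"]
```

The same functions work for the id-based profiles if the "terms" are followee or follower ids instead of words.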

  10. Experiment: dataset • 20,000 users were imported directly using the Twitter API as the dataset. • The dataset is split into two sets of users: one containing 1,000 users to act as test users, and a larger training set of 19,000 users.
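The 1,000 / 19,000 split can be sketched as follows. The slide does not say how the test users were chosen, so the seeded random shuffle here is an assumption, and the user ids are made-up placeholders.

```python
import random
from typing import List, Sequence, Tuple


def split_users(user_ids: Sequence[str], n_test: int, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the ids reproducibly and cut off the first n_test as test users."""
    shuffled = list(user_ids)
    random.Random(seed).shuffle(shuffled)  # assumption: selection is random
    return shuffled[:n_test], shuffled[n_test:]


# Placeholder ids standing in for the 20,000 imported users.
users = [f"user{i}" for i in range(20000)]
test_users, training_users = split_users(users, n_test=1000)
```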

  11. Nine recommendation strategies • S1: tweets(UT) • S2: followeetweets(UT) • S3: followertweets(UT) • S4: tweets(UT), followeetweets(UT), followertweets(UT) • S5: followees(UT) • S6: followers(UT) • S7: followees(UT), followers(UT) • S8: the scoring function is based on a combination of the content strategy S1 and the collaborative strategy S6. • S9: the scoring function is based on the position of the user in each of the recommendation lists.
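The slide describes S9 only as scoring users by their position in each recommendation list and does not give the formula. One possible reading, shown here purely as an assumption, is a Borda-style count: a user earns more points the higher they appear in each list, and the points are summed across lists. The function name and the point scheme are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def combine_by_position(ranked_lists: List[List[str]], k: int = 10) -> List[str]:
    """Merge several ranked lists; higher positions earn more points (assumed scheme)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        size = len(ranked)
        for position, user in enumerate(ranked):
            scores[user] += size - position  # first place earns `size` points
    ordered = sorted(scores, key=lambda u: (-scores[u], u))
    return ordered[:k]


merged = combine_by_position([
    ["a", "b", "c"],  # e.g. a list from one strategy
    ["b", "a", "d"],  # a list from a second strategy
    ["b", "c", "a"],  # a list from a third strategy
])
```

With these toy lists, "b" ends up first because it is ranked at or near the top of every list, which matches the stated intent of preferring users who appear in high positions.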

  12. Recommendation Precision • The basic measure of recommendation performance is the average percentage overlap between a given recommendation list and the target user's actual followee list. • Relevant recommendations tend to be clustered towards the top of recommendation lists, since the precision of all strategies declines with increasing recommendation-list size. • Interestingly, the collaborative strategies perform better than the content strategies.
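The overlap measure described on this slide can be sketched as precision at list size k, averaged over test users. The function names and the sample data are illustrative assumptions.

```python
from typing import Dict, List, Set


def precision_at_k(recommended: List[str], actual_followees: Set[str], k: int) -> float:
    """Fraction of the top-k recommendations that the user already follows."""
    top = recommended[:k]
    if not top:
        return 0.0
    hits = sum(1 for user in top if user in actual_followees)
    return hits / len(top)


def mean_precision(
    recs_by_user: Dict[str, List[str]],
    followees_by_user: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average precision at k over all test users."""
    if not recs_by_user:
        return 0.0
    total = sum(
        precision_at_k(recs, followees_by_user.get(user, set()), k)
        for user, recs in recs_by_user.items()
    )
    return total / len(recs_by_user)


recs = ["a", "b", "c", "d"]
known = {"a", "c", "z"}
p1 = precision_at_k(recs, known, 1)  # 1.0
p4 = precision_at_k(recs, known, 4)  # 0.5
```

Here precision drops from 1.0 at k=1 to 0.5 at k=4, the same shape of decline the slide reports as list size grows.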

  13. Ranking Effectiveness • The position of relevant recommendations is also an important consideration, since users focus most of their attention on items at the top of result or recommendation lists.
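One simple way to quantify where relevant recommendations sit is the mean 1-based position of the relevant users in a list, where lower is better. The slide does not state the exact measure used, so this is an assumed illustration with a hypothetical function name.

```python
from typing import List, Optional, Set


def mean_relevant_position(recommended: List[str], actual_followees: Set[str]) -> Optional[float]:
    """Average 1-based rank of the relevant users; None if the list has none."""
    positions = [
        rank
        for rank, user in enumerate(recommended, start=1)
        if user in actual_followees
    ]
    if not positions:
        return None
    return sum(positions) / len(positions)


avg_pos = mean_relevant_position(["a", "b", "c", "d"], {"a", "c"})  # (1 + 3) / 2
```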

  14. A live-user trial • What are the shortcomings of the off-line evaluation? • It is unwise to discount the non-overlapping recommendations as definitively not relevant to the target user.

  15. User Recommendation: an average of 6.9 users per recommendation-list. User Search: an average of 4.9 of the suggested users per search.

  16. Conclusion • Advantages: • User-generated content is used as a source of profiling data. • Tweets are not preprocessed. • My ideas: • Users' tweets should be preprocessed, for example by extracting tags from them. The tags may be more important than the rest of the content. • Other information, such as the groups a user joins, is also worth taking into account. • Users can be divided into celebrities and ordinary people. Different strategies should be considered for different kinds of users.

  17. Thank You !

  18. Barry Smyth: Centre Director. His research interests include personalization, recommender systems, case-based reasoning, machine learning, and information retrieval. • Mr. John Hannon: Ph.D. student. • Mike Bennett: postdoctoral researcher and interaction designer.
