
Applied Algorithm Lab Wooram Heo

Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005. Applied Algorithm Lab, Wooram Heo. Outline: Recommender Systems, Problem statement, Survey of Recommender systems



Presentation Transcript


  1. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions • IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005 • Applied Algorithm Lab • Wooram Heo

  2. Outline • Recommender Systems • Problem statement • Survey of Recommender systems • Content-Based Methods • Collaborative Methods • Hybrid Methods

  3. Recommender Systems • Systems for recommending items (e.g. books, movies, CDs, web pages, newsgroup messages) to users based on examples of their preferences. • Many on-line stores provide recommendations (e.g. Amazon, CDNow). • Recommenders have been shown to substantially increase sales at on-line stores.

  4. Recommender Systems • Examples

  5. Problem statement • The recommendation problem is to estimate ratings for the items that a user has not yet seen • The estimate is usually based on the ratings the user has given to other items and on some other information

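  6. Problem statement • $C$: the set of all users • $S$: the set of all possible items that can be recommended • $u$: the utility function $u : C \times S \to R$, where $R$ is a totally ordered set (e.g., nonnegative integers or real numbers within a certain range) • For each user $c \in C$, we want to choose the item $s'_c \in S$ that maximizes the user’s utility: $s'_c = \arg\max_{s \in S} u(c, s)$ • Utility is usually known only for the items that users have rated, so it needs to be extrapolated to the whole space $C \times S$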
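
The selection step on this slide, choosing for each user the item with the highest estimated utility, can be sketched in a few lines. This is only an illustration: the `utility` table, the user and item names, and the helper `best_item` are hypothetical, and the utilities are assumed to have been estimated already by some recommender.

```python
from typing import Dict, Optional, Set

# Hypothetical estimated utilities u(c, s); the values are made up for illustration.
utility: Dict[str, Dict[str, float]] = {
    "alice": {"book_a": 4.5, "book_b": 2.0, "book_c": 3.5},
    "bob": {"book_a": 1.0, "book_b": 5.0},
}


def best_item(user: str, seen: Set[str]) -> Optional[str]:
    """Return the unseen item with the highest estimated utility for `user`."""
    candidates = {s: u for s, u in utility[user].items() if s not in seen}
    if not candidates:
        return None  # nothing left to recommend
    return max(candidates, key=candidates.get)


print(best_item("alice", seen={"book_a"}))  # prints "book_c"
```

In practice the hard part is filling in the utility table for unrated items, which is what the content-based and collaborative methods on the following slides address.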

  7. Recommender System Categories • Content-based recommendations • The user will be recommended items similar to the ones the user preferred in the past • Collaborative recommendations • The user will be recommended items that people with similar tastes and preferences liked in the past • Hybrid approaches • These methods combine collaborative and content-based methods

  8. Content-Based Methods • Recommend items similar to those users preferred in the past • User profiling is the key • E.g. in a movie recommender application, • Specific actors • Directors • Genres • etc

  9. Content-Based Methods • Content-based approach has its roots in information retrieval • Documents, web sites (URLs), and news messages • Designed mostly to recommend text-based items • Content is usually described with keywords

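  10. Content-Based Methods • TF-IDF weight for keyword $k_i$ in document $d_j$ is defined as $w_{i,j} = TF_{i,j} \times IDF_i$, with $TF_{i,j} = f_{i,j} / \max_z f_{z,j}$ and $IDF_i = \log(N / n_i)$, where $f_{i,j}$ is the number of times $k_i$ appears in $d_j$, $N$ is the number of documents, and $n_i$ is the number of documents containing $k_i$ • Content of document $d_j$ is defined as $Content(d_j) = (w_{1,j}, \ldots, w_{k,j})$ • Cosine similarity measure: $u(c, s) = \cos(\vec{w}_c, \vec{w}_s) = \frac{\vec{w}_c \cdot \vec{w}_s}{\|\vec{w}_c\|_2 \, \|\vec{w}_s\|_2}$, where $\vec{w}_c$ is the keyword-weight vector of user $c$'s profile and $\vec{w}_s$ that of item $s$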
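
As a rough sketch of how keyword weighting and cosine matching fit together, the snippet below computes TF-IDF weights (term frequency normalized by the most frequent keyword in the document, times the log of inverse document frequency) and compares two weight vectors. The function names `tf_idf` and `cosine`, the toy documents, and the sparse-dictionary representation are assumptions made for illustration; each document is assumed to contain at least one keyword.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one keyword -> weight mapping per document."""
    n_docs = len(docs)
    # Number of documents in which each keyword appears.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        max_count = max(counts.values())  # assumes a non-empty document
        weights.append({
            word: (count / max_count) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy keyword lists standing in for item descriptions.
docs = [["space", "alien", "space"], ["space", "romance"], ["alien", "robot"]]
w = tf_idf(docs)
print(cosine(w[0], w[1]), cosine(w[1], w[2]))
```

In a content-based recommender, one of the two vectors would typically be a user profile built from the items the user liked, and the other a candidate item.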

  11. Disadvantages • Not all content is well represented by keywords • Multimedia data • Items represented by same set of features are indistinguishable • Overspecialization problem • New user problem • No history available

  12. Collaborative Methods • Use other users’ recommendations (ratings) to judge an item’s utility • The key is to find users/user groups whose interests match those of the current user • More users and more ratings give better results • Can also account for items dissimilar to the ones seen in the past

  13. Collaborative Methods • [Figure: the active user’s ratings for items A to Z are correlated against the rating vectors stored in the user database; the best-matching users are selected (Correlation Match) and their ratings are used to extract recommendations (here, item C) for the active user.]

  14. Collaborative Methods • Memory-based algorithms • Value of the unknown rating $r_{c,s}$ for user $c$ and item $s$ is usually computed as an aggregate of the ratings of some other users for the same item $s$: $r_{c,s} = \operatorname{aggr}_{c' \in \hat{C}} r_{c',s}$ • Where $\hat{C}$ denotes the set of $N$ users that are the most similar to user $c$ and who have rated item $s$
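
One common choice for the aggregate is a similarity-weighted average over the most similar users who rated the item. The sketch below assumes that choice; the function `predict_rating`, its parameter names, and the sample numbers are hypothetical, and the similarities are assumed to have been computed beforehand.

```python
from typing import Dict, Optional


def predict_rating(
    similarities: Dict[str, float],
    neighbor_ratings: Dict[str, float],
    top_n: int = 2,
) -> Optional[float]:
    """Similarity-weighted average of the ratings given by the top_n most similar users.

    similarities: other user -> similarity to the active user.
    neighbor_ratings: other user -> rating for the target item (only users who rated it).
    """
    rated = [
        (sim, neighbor_ratings[user])
        for user, sim in similarities.items()
        if user in neighbor_ratings
    ]
    nearest = sorted(rated, key=lambda pair: pair[0], reverse=True)[:top_n]
    norm = sum(abs(sim) for sim, _ in nearest)
    if norm == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in nearest) / norm


sims = {"u1": 0.9, "u2": 0.1, "u3": 0.5}
ratings_for_item = {"u1": 5.0, "u2": 1.0, "u3": 3.0}
print(predict_rating(sims, ratings_for_item, top_n=2))
```

A plain average or a mean-centered weighted sum are other possible aggregates; the weighted average is used here only because it is short.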

  15. Collaborative Methods • Similarity between two users • Pearson correlation coefficient • Cosine similarity
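
Both similarity measures can be sketched over the items that two users have rated in common. The code below is a minimal illustration, not a reference implementation: the function names are hypothetical, the means in the Pearson coefficient are taken over the co-rated items only (an assumption; using each user's overall mean is another option), and the sample ratings loosely reuse numbers from the figure on slide 13.

```python
import math
from typing import Dict

Ratings = Dict[str, float]  # item -> rating


def pearson(x: Ratings, y: Ratings) -> float:
    """Pearson correlation over co-rated items; 0.0 if it is undefined."""
    common = sorted(set(x) & set(y))
    if len(common) < 2:
        return 0.0
    mean_x = sum(x[s] for s in common) / len(common)
    mean_y = sum(y[s] for s in common) / len(common)
    dx = [x[s] - mean_x for s in common]
    dy = [y[s] - mean_y for s in common]
    denom = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def cosine_users(x: Ratings, y: Ratings) -> float:
    """Cosine of the angle between the two users' co-rated rating vectors."""
    common = set(x) & set(y)
    dot = sum(x[s] * y[s] for s in common)
    norm_x = math.sqrt(sum(x[s] ** 2 for s in common))
    norm_y = math.sqrt(sum(y[s] ** 2 for s in common))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


active = {"A": 9, "B": 3, "C": 9}
other = {"A": 10, "B": 4, "C": 8, "Z": 1}
print(pearson(active, other), cosine_users(active, other))
```

Pearson correlation subtracts each user's mean, so it is less sensitive than cosine similarity to users who rate everything high or everything low.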

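  16. Collaborative Methods • Model-based algorithm • The unknown rating is estimated as its expected value given the user’s known ratings, assuming ratings are integers between 0 and $n$: $r_{c,s} = E(r_{c,s}) = \sum_{i=0}^{n} i \times \Pr(r_{c,s} = i \mid r_{c,s'}, s' \in S_c)$, where $S_c$ is the set of items rated by user $c$ • Cluster models and Bayesian networks are used to estimate this probability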
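
In a model-based approach the predicted rating is typically the expected value of the rating under the learned model, that is, each possible rating value weighted by its estimated probability given the user's known ratings. The sketch below shows only that final step; the function `expected_rating` and the probability table are hypothetical, and the probabilities are assumed to come from an already trained model such as a cluster model or a Bayesian network.

```python
from typing import Dict


def expected_rating(probabilities: Dict[int, float]) -> float:
    """Expected rating, given Pr(rating = value | the user's known ratings) per value."""
    total = sum(probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in probabilities.items())


# Made-up model output for one (user, item) pair on a 1-to-5 scale.
model_output = {1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2, 5: 0.1}
print(expected_rating(model_output))
```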

  17. Collaborative Methods • Model-based approaches use various machine learning techniques • K-means clustering • Gibbs sampling • Bayesian model • Probabilistic relational model • Linear regression • Maximum entropy model • Markov decision process • Probabilistic latent semantic analysis • Latent Dirichlet allocation • etc

  18. Disadvantages • Finding similar users/user groups isn’t very easy • New user problem : No preferences available • New item problem: No ratings available • Sparsity problem

  19. END
