Why we search: Visualizing and predicting user behavior
E. Adar, D. S. Weld, B. N. Bershad, S. Gribble
Raju Balakrishnan
Problem
• Correlation between user behaviors across different sources
• Time-series modeling of user behavior, and an attempt to predict that behavior
The problem matters for recommendation systems and for predicting market trends.
Solution
• Simple correlation with delays
• Dynamic Time Warping (DTW)
• A correlation visualization tool for Dynamic Time Warping (the tool is less interesting for CSE494)
• Binning, topical categorization, and smoothing of the queries are used for pre-processing
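The first two techniques above can be sketched in a few lines. This is only an illustrative sketch, not the paper's implementation: the function names `best_delay` and `dtw_distance`, the absolute-difference cost, and the toy `news` and `search` series are all assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant series: correlation is undefined, report 0
    return cov / (sx * sy)

def best_delay(x, y, max_delay):
    """Return (delay, r) that maximizes the correlation of x[t] with y[t + delay].

    A positive delay means y lags behind x.
    """
    best = (0, -2.0)
    for d in range(-max_delay, max_delay + 1):
        if d >= 0:
            a, b = x[:len(x) - d], y[d:]
        else:
            a, b = x[-d:], y[:len(y) + d]
        if len(a) < 2:
            continue  # not enough overlap to correlate
        r = pearson(a, b)
        if r > best[1]:
            best = (d, r)
    return best

def dtw_distance(x, y):
    """Classic dynamic-programming DTW with |x_i - y_j| as the local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical daily volumes: the search series repeats the news series two days later.
news = [0, 0, 1, 5, 9, 4, 1, 0, 0, 0]
search = [0, 0, 0, 0, 1, 5, 9, 4, 1, 0]

delay, r = best_delay(news, search, max_delay=3)
print(delay, round(r, 3))
print(dtw_distance(news, search))
```

Delayed correlation shifts one whole series by a single fixed lag, whereas DTW lets the alignment stretch or compress locally, which is why the two can disagree on real data.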
Criticism
• No quantitative measure of the predictive power of the different sources is presented, though it is discussed qualitatively.
• For prediction using multiple sources, residual predictive power is more important than absolute predictive power; this is not discussed.
• Correlation vs. causation is not discussed; for example, the news sites might be causing the searches, whereas blogs and searches might be merely correlated.
• The paper does not evaluate the effectiveness of DTW with respect to simple correlation.
• In general, this paper leaves many questions open and is a starting point.
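One possible way to make the "residual predictive power" point concrete is partial correlation: how strongly a candidate source still correlates with the target once an already-known source has been accounted for. The sketch below is an interpretation offered for illustration, not something the paper or the slides propose; the function name `partial_correlation` and the toy series are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant series: correlation is undefined, report 0
    return cov / (sx * sy)

def partial_correlation(target, candidate, known):
    """First-order partial correlation of target and candidate, controlling for known."""
    r_tc = pearson(target, candidate)
    r_tk = pearson(target, known)
    r_ck = pearson(candidate, known)
    denom_sq = (1 - r_tk ** 2) * (1 - r_ck ** 2)
    if denom_sq <= 1e-12:
        # candidate (or target) is essentially determined by the known source,
        # so nothing is left over to measure
        return 0.0
    return (r_tc - r_tk * r_ck) / math.sqrt(denom_sq)

# Hypothetical series: the candidate is just a rescaled copy of the known source.
target = [1, 2, 3, 5]
known = [1, 2, 3, 4]
candidate = [2, 4, 6, 8]

print(pearson(target, candidate))                     # high absolute correlation
print(partial_correlation(target, candidate, known))  # nothing added beyond `known`
```

In this toy case the candidate has high absolute predictive power but no residual power, which is the distinction the second criticism draws.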
Relation to CSE494?
• Recommendation systems. Google News considered click similarities among users of the same site for its recommendations; the predictive power of one source for another could support non-user-specific recommendations.
• The user query logs of MSN and AOL were found to follow a power-law distribution, as discussed for click distributions in e-commerce portals.
• Methods such as vector space similarity, clustering of queries, and correlation analysis are used.
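As a reminder of what vector space similarity means for queries, here is a minimal sketch that treats each query as a bag of words and compares two queries by cosine similarity. The whitespace tokenization, raw term counts (no TF-IDF weighting), and the sample queries are assumptions for illustration, not the paper's pre-processing.

```python
import math
from collections import Counter

def cosine_similarity(query_a, query_b):
    """Cosine similarity between two queries represented as term-count vectors."""
    va = Counter(query_a.lower().split())
    vb = Counter(query_b.lower().split())
    # dot product over the terms the two queries share
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty query is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical queries sharing two of their three terms
print(cosine_similarity("lost season finale", "lost finale spoilers"))
print(cosine_similarity("lost season finale", "stock market news"))
```

A pairwise similarity like this is also the usual input to query clustering, where similar queries are grouped into topics before their time series are compared.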