
Relational Evaluation Techniques

Daniel McEnnis


Presentation Transcript


  1. Relational Evaluation Techniques Daniel McEnnis

  2. Outline • Definition • Component Overview • Existing Approaches • Descriptions of the Components • Applications and Examples

  3. Relational Evaluation Techniques Definition • Experimental setup for evaluating the performance of algorithms that use data that span more than one table or instance vector • Can use either relational algebra or hypergraph-based descriptions

  4. Components • Data Acquisition • Ground Truth Acquisition • Cross-Validation Technique • Query Type • Scoring Metric • Significance Test

  5. Existing Approaches • Machine Learning • Relational Machine Learning • TREC • Collaborative Filtering • ISMIR • Social Network Analysis

  6. Machine Learning • Predetermined flat data, no sampling • Predetermined ground truth • Typically simple queries • Sophisticated cross-validation • Basic set-based metrics • No significance tests

  7. Relational Machine Learning • Predetermined relational data • Predetermined ground truth • Predefined simple query • Sophisticated cross-validation • Basic set-based metrics • No significance tests

  8. TREC • Predetermined flat data • Sophisticated ground truth sampling • Sophisticated queries • Machine-learning cross-validation • Ranked set-of-sets scoring • Simple significance tests

  9. Collaborative Filtering • Predetermined flat/relational data • Predetermined ground truth • Simple, predefined query • No cross-validation • Sophisticated scoring metrics • No significance tests

  10. ISMIR • Sampled flat data • Predetermined ground truth • Sophisticated queries • Machine-learning cross-validation • Simple set-based scoring metrics • Sophisticated significance tests

  11. Social Network Analysis • Sophisticated data sampling • Sophisticated statistical techniques

  12. Sequences of Choices • Plug ‘n play an experiment • Different aspects are evaluated • Some algorithms simply don’t work • Extensive algorithm rewrites sometimes needed

  13. Data Acquisition • Data structure • Where is it? • What sampling technique to use • Random Access • Snowball • Hypergraph Snowball • How much data is needed?
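
The slide names snowball sampling without describing it. As a rough illustration only, the sketch below assumes the usual breadth-first reading: start from seed actors and follow links outward until a size budget is reached. The function name `snowball_sample`, the adjacency-dictionary input, and the `max_actors` budget are hypothetical choices for this example, not anything defined in the talk.

```python
from collections import deque

def snowball_sample(neighbors, seeds, max_actors):
    """Breadth-first snowball sample (illustrative assumption, not the talk's code).

    neighbors: dict mapping each actor to a list of linked actors.
    seeds: actors to start from.
    max_actors: stop once this many actors have been collected.
    """
    sampled = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_actors:
        actor = queue.popleft()
        sampled.append(actor)
        # Follow every link of the current actor to reach the next wave.
        for other in neighbors.get(actor, []):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sampled

# Toy graph, invented for this example.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b", "e"], "e": ["d"]}
sample = snowball_sample(graph, ["a"], max_actors=4)
print(sample)
```

The budget answers the slide's "how much data is needed?" question only mechanically; choosing its value remains an experimental decision.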

  14. Ground Truth Acquisition • What is being tested? • TREC extended ground truth sampling • Structure of the output

  15. Cross-Validation • Actor Based • Link Based • Graph Based • No Cross Validation

  16. Graph Notation • Actor definition • Link definition • Graph definition • Database table / instance vector equivalence • Foreign key / link equivalence

  17. Actor Cross-Validation • Traditional Machine Learning approach • Divisions by database table • Folds usually random assignment • Works well on flat data • Trouble with relational data
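
A minimal sketch of actor-based folds, assuming plain random assignment of actors (table rows) to k folds. The helper names `actor_folds`, `split`, and `crossing_links` are hypothetical. The last helper counts links that join a training actor to a test actor, which is one plausible reading of the "trouble with relational data" bullet.

```python
import random

def actor_folds(actors, k, seed=0):
    """Randomly assign actors to k folds of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(actors)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def split(folds, test_index):
    """Return (train, test) actor sets for one fold."""
    test = set(folds[test_index])
    train = {a for i, fold in enumerate(folds) if i != test_index for a in fold}
    return train, test

def crossing_links(links, train, test):
    """Links with one endpoint in the training set and the other in the test set."""
    return [(a, b) for a, b in links
            if (a in train and b in test) or (a in test and b in train)]

# Toy data, invented for this example.
actors = ["u1", "u2", "u3", "u4", "u5", "u6"]
links = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u4", "u5"), ("u5", "u6")]
folds = actor_folds(actors, k=3)
for i in range(len(folds)):
    train, test = split(folds, i)
    print(i, sorted(test), len(crossing_links(links, train, test)))
```

On flat data there are no links, so the crossing count is always zero and the folds are independent. With relational data, any nonzero count means test information is reachable from the training set.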

  18. Link Cross Validation • Rare machine learning approach • Divisions by foreign key reference • Less statistical independence than actor • Works for collaborative filtering • Usually random assignment
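
For contrast, a sketch of link-based folds, assuming the links are (user, item) pairs as in collaborative filtering and that assignment is random. The name `link_folds` and the toy ratings are hypothetical. All actors stay visible in every split; only individual links are held out.

```python
import random

def link_folds(links, k, seed=0):
    """Randomly assign links (foreign-key references) to k folds."""
    rng = random.Random(seed)
    shuffled = list(links)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

# Toy (user, item) links, invented for this example.
ratings = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"),
           ("u2", "i3"), ("u3", "i2"), ("u3", "i3")]
rating_folds = link_folds(ratings, k=3)
for i, held_out in enumerate(rating_folds):
    # Training links are every link outside the held-out fold.
    train_links = [l for j, fold in enumerate(rating_folds) if j != i for l in fold]
    print(i, held_out, len(train_links))
```

Because a held-out link usually shares both endpoints with training links, the folds are less statistically independent than actor folds, as the slide notes.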

  19. Graph Cross Validation • Relational Machine Learning • Divisions by predetermined discrete graphs • Statistical independence • Non-learning based approaches • Clustering based fold generation
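
The slide does not say how the discrete graphs are obtained. The sketch below substitutes the simplest possible stand-in, connected components, for the "clustering based fold generation" bullet; this is an assumption made for illustration, and a real clustering algorithm could take its place. The names `connected_components` and `graph_folds` are hypothetical.

```python
def connected_components(actors, links):
    """Group actors into components; every link endpoint must appear in actors."""
    adjacency = {a: set() for a in actors}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in actors:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for other in adjacency[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(sorted(component))
    return components

def graph_folds(components, k):
    """Place whole components into folds, largest first, into the smallest fold."""
    folds = [[] for _ in range(k)]
    for component in sorted(components, key=len, reverse=True):
        min(folds, key=len).extend(component)
    return folds

# Toy data, invented for this example.
nodes = list("abcdefg")
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("f", "g")]
subgraphs = connected_components(nodes, edges)
fold_list = graph_folds(subgraphs, k=2)
print(fold_list)
```

Since a component is never divided, no link crosses a fold boundary, which is the statistical independence the slide refers to.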

  20. No Cross-Validation • Standard overfitting problems • Useful after implied cross-validation

  21. Query Type • Information Need definition • Actor based query • Set or List based query • Conditional queries

  22. Scoring Metrics • Comparisons against ground truth • Set-based metrics • Ranked-list metrics • Ordered-list metrics

  23. Set-Based Metrics • Recall and Precision • F-Measure • Mean Average Precision
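
The first two bullets can be made concrete with the standard definitions: precision is the fraction of retrieved items that are relevant, recall is the fraction of relevant items that were retrieved, and the F-measure is their weighted harmonic mean. The function name `precision_recall_f` and the toy sets are hypothetical.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F-beta) for a result set against ground truth."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure

# Toy result set and ground truth, invented for this example.
p, r, f = precision_recall_f({"a", "b", "c", "d"}, {"a", "c", "e"})
print(p, r, f)
```

Here two of four retrieved items are relevant (precision 0.5) and two of three relevant items are found (recall 2/3).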

  24. Ranked List Metrics • Pearson Correlation • Spearman's Correlation • Mean Absolute Error • Linear Algebra Distance Metrics • Serendipity
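
Three of these metrics have compact textbook definitions, sketched below for predicted versus ground-truth scores over the same items. Spearman's correlation is computed as the Pearson correlation of the ranks, with tied values sharing their mean rank. All function names and the toy scores are hypothetical, and the inputs are assumed to be non-constant so the denominator is nonzero.

```python
from math import sqrt

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and true scores."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy scores, invented for this example.
predicted = [4.0, 3.0, 5.0, 1.0]
actual = [5.0, 3.0, 4.0, 2.0]
print(mean_absolute_error(predicted, actual), pearson(predicted, actual),
      spearman(predicted, actual))
```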

  25. Ordered List Metrics • Half Life • Kendall Tau • NDPM • Sequence Alignment Algorithms • Hamming Distance
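
As one example from this list, Kendall tau compares two orderings by counting item pairs that appear in the same relative order (concordant) against pairs that are swapped (discordant). The sketch below is the tie-free tau-a variant and assumes both lists contain the same items, at least two of them. The function name `kendall_tau` is hypothetical.

```python
from itertools import combinations

def kendall_tau(ranking_a, ranking_b):
    """Kendall tau-a between two orderings of the same items (no ties)."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    concordant = discordant = 0
    for x, y in combinations(ranking_a, 2):
        # A positive product means both rankings order x and y the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Toy orderings, invented for this example.
truth = ["a", "b", "c", "d"]
guess = ["a", "c", "b", "d"]
tau = kendall_tau(truth, guess)
print(tau)
```

Swapping one adjacent pair out of six possible pairs gives (5 - 1) / 6, so the value falls between 1 for identical orderings and -1 for a full reversal.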

  26. Significance Tests • Pairwise Student's t-test • ANOVA • ANOVA/Tukey-Kramer statistical test
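
A minimal sketch of the first test, assuming two algorithms are scored on the same cross-validation folds so their results can be paired. It computes only the t statistic and degrees of freedom; the p-value would then be looked up in a t table or taken from a statistics package. The function name `paired_t_statistic` and the toy scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(scores_a, scores_b):
    """Paired Student's t statistic and degrees of freedom for per-fold scores.

    Assumes at least two folds and non-identical per-fold differences.
    """
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Toy per-fold scores for two algorithms, invented for this example.
algorithm_a = [0.62, 0.70, 0.68, 0.75, 0.66]
algorithm_b = [0.58, 0.66, 0.69, 0.70, 0.61]
t_value, dof = paired_t_statistic(algorithm_a, algorithm_b)
print(t_value, dof)
```

Running this pairwise over more than two algorithms multiplies the number of comparisons, which is presumably why the slide also lists ANOVA with Tukey-Kramer.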

  27. Evaluation Questions • Does the data contain time (a global ordered sequence)? • Actor-, Link-, Graph-, or Set-based queries • List, Set, or Set-of-Lists output • Contextual question or absolute • Statistical purity versus maximum information

  28. Music Recommendation • Example - Personalized Dynamic Tag Radio • LastFM profile data • LastFM tag data • Semantic Web data • Next-week-data ground truth • Conditional query • Graph cross-validation • Kendall Tau scoring metric • ANOVA/Tukey-Kramer statistical analysis

  29. Conclusions • No one-size-fits-all • Data and ground-truth set the framework • Question determines the final structure • Each discipline has a piece of the answer • Graph-RAT 0.5

  30. Future Work • Finish exploring Social Network Analysis significance tests • Fully explore set-of-sets evaluation metrics • Debugging of Graph-RAT cross-validation schedulers • Ease of use improvements to Graph-RAT
