
Discriminative, Unsupervised, Convex Learning



Presentation Transcript


  1. Discriminative, Unsupervised, Convex Learning Dale Schuurmans Department of Computing Science University of Alberta MITACS Workshop, August 26, 2005

  2. Current Research Group
  • PhD Tao Wang: reinforcement learning
  • PhD Ali Ghodsi: dimensionality reduction
  • PhD Dana Wilkinson: action-based embedding
  • PhD Yuhong Guo: ensemble learning
  • PhD Feng Jiao: bioinformatics
  • PhD Jiayuan Huang: transduction on graphs
  • PhD Qin Wang: statistical natural language
  • PhD Adam Milstein: robotics, particle filtering
  • PhD Dan Lizotte: optimization, everything
  • PhD Linli Xu: unsupervised SVMs
  • PDF Li Cheng: computer vision

  3. Current Research Group
  • PhD Tao Wang: reinforcement learning
  • PhD Dana Wilkinson: action-based embedding
  • PhD Feng Jiao: bioinformatics
  • PhD Qin Wang: statistical natural language
  • PhD Dan Lizotte: optimization, everything
  • PDF Li Cheng: computer vision

  4. Today I will talk about: one current research direction, learning sequence classifiers (HMMs) that are
  • Discriminative
  • Unsupervised
  • Convex
  EM?

  5. Outline
  • Unsupervised SVMs
  • Discriminative, unsupervised, convex HMMs
  • Tao, Dana, Feng, Qin, Dan, Li

  6. Unsupervised Support Vector Machines Joint work with Linli Xu

  7. Main Idea
  • Unsupervised SVMs (and semi-supervised SVMs)
  • A harder computational problem than supervised SVMs
  • Convex relaxation: a semidefinite program (polynomial time)

  8. Background: Two-class SVM
  • Supervised classification learning
  • Labeled data → a linear discriminant
  • Classification rule: threshold the linear discriminant
  • Some discriminants are better than others?

  9. Maximum Margin Linear Discriminant
  Choose the linear discriminant that maximizes the margin
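The objective on this slide was an image in the original; the standard maximum-margin formulation it presumably refers to is:

```latex
\max_{\mathbf{w},\,b}\ \min_{i}\ \frac{y_i(\mathbf{w}^\top \mathbf{x}_i + b)}{\|\mathbf{w}\|}
\qquad\Longleftrightarrow\qquad
\min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^2
\ \text{ s.t. }\ y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1\ \ \forall i .
```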

  10. Unsupervised Learning • Given unlabeled data, how to infer classifications? • Organize objects into groups — clustering

  11. Idea: Maximum Margin Clustering
  • Given unlabeled data, find the maximum margin separating hyperplane
  • This clusters the data
  • Constraint: class balance (bound the difference in sizes between the classes)
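With ±1 labels, the class-balance constraint is typically written with a slack parameter (the symbol ε here is an assumed notation, not from the slide):

```latex
-\epsilon \ \le\ \sum_{i=1}^{n} y_i \ \le\ \epsilon, \qquad y_i \in \{-1,+1\}.
```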

  12. Challenge
  • Find the label assignment that results in a large margin
  • Hard
  • Convex relaxation based on semidefinite programming

  13. How to Derive an Unsupervised SVM? Two-class case:
  • Start with the supervised algorithm: given a vector of assignments y, solve the SVM problem; its optimal value is the inverse squared margin
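The supervised subproblem on this slide was an image; it is presumably the standard SVM dual, whose optimal value is (up to constants) the inverse squared margin, with kernel entries K_ij = x_i^T x_j and ∘ the elementwise product:

```latex
\omega(y) \;=\; \max_{\lambda \ge 0}\ \lambda^\top \mathbf{1} \;-\; \tfrac{1}{2}\,\lambda^\top \bigl(K \circ y y^\top\bigr)\lambda .
```

Crucially, y enters this objective only through the matrix yy^T, which is what the re-expression on the following slides exploits.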

  14. How to Derive an Unsupervised SVM?
  • Think of the inverse squared margin as a function of y
  • Goal: choose y to minimize the inverse squared margin
  • If given y, would then solve the same supervised problem
  • Problem: this is not a convex function of y

  15. How to Derive an Unsupervised SVM?
  • Re-express the problem with indicators comparing y labels
  • New variables: an equivalence-relation matrix M
  • If given y, would then solve for the inverse squared margin as before

  16. How to Derive an Unsupervised SVM?
  • Re-express the problem with indicators comparing y labels
  • New variables: an equivalence-relation matrix M
  • If given M, would then solve for the inverse squared margin
  • Note: this is a convex function of M (a maximum of linear functions is convex)
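A small NumPy sketch (toy labels, assumed example) of the equivalence-relation matrix M = yy^T and the properties that make the relaxation possible:

```python
import numpy as np

# Toy +/-1 label assignment (assumed example, not from the talk).
y = np.array([1, 1, -1, -1, 1])

# Equivalence-relation matrix: M[i, j] = y_i * y_j = 1 iff i and j share a class.
M = np.outer(y, y)

print(np.all(np.diag(M) == 1))                 # True: unit diagonal
print(np.min(np.linalg.eigvalsh(M)) >= -1e-9)  # True: M is positive semidefinite
print(np.array_equal(M, np.outer(-y, -y)))     # True: flipping all labels leaves M unchanged
```

The convex relaxation on the later slides replaces the non-convex constraint M = yy^T with the convex conditions M ⪰ 0 and diag(M) = 1.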

  17. How to Derive an Unsupervised SVM?
  • Get a constrained optimization problem: solve for M
  • M encodes an equivalence relation iff its entries satisfy the indicator constraints (not convex!)
  • Plus the class-balance constraint

  18. How to Derive an Unsupervised SVM?
  • Get a constrained optimization problem: solve for M
  • M encodes an equivalence relation iff its entries satisfy the indicator constraints

  19. How to Derive an Unsupervised SVM?
  • Relax the indicator variables to obtain a convex optimization problem: solve for M

  20. How to Derive an Unsupervised SVM?
  • Relax the indicator variables to obtain a convex optimization problem: solve for M
  • A semidefinite program
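Putting the pieces together, the relaxed problem presumably has the form below (following the standard two-class SVM dual, with kernel K, elementwise product ∘, and an assumed class-balance slack ε); the inner maximum is a maximum of functions linear in M, hence convex in M, and the min-max can be converted to a semidefinite program via duality:

```latex
\min_{M}\ \max_{\lambda \ge 0}\ \lambda^\top \mathbf{1} - \tfrac{1}{2}\,\lambda^\top (K \circ M)\lambda
\quad \text{s.t.}\quad M \succeq 0,\ \ \operatorname{diag}(M) = \mathbf{1},\ \ -\epsilon\mathbf{1} \le M\mathbf{1} \le \epsilon\mathbf{1}.
```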

  21. Multi-class Unsupervised SVM?
  • Start with the supervised algorithm: given a vector of assignments y, solve for the margin loss (Crammer & Singer 01)
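The Crammer & Singer margin loss referenced here is, in one common form (symbols assumed: w_k is the weight vector for class k, δ the Kronecker delta, β a regularization constant):

```latex
\min_{W}\ \tfrac{\beta}{2}\,\|W\|^2 \;+\; \sum_{i}\ \max_{k}\ \bigl( \mathbf{w}_k^\top \mathbf{x}_i \;-\; \mathbf{w}_{y_i}^\top \mathbf{x}_i \;+\; 1 - \delta_{k,\,y_i} \bigr).
```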

  22. Multi-class Unsupervised SVM?
  • Think of the margin loss as a function of y
  • Goal: choose y to minimize the margin loss
  • If given y, would then solve the same supervised problem (Crammer & Singer 01)
  • Problem: this is not a convex function of y

  23. Multi-class Unsupervised SVM?
  • Re-express the problem with indicators comparing y labels
  • New variables: M & D
  • If given y, would then solve for the margin loss (Crammer & Singer 01)

  24. Multi-class Unsupervised SVM?
  • Re-express the problem with indicators comparing y labels
  • New variables: M & D
  • If given M and D, would then solve for the margin loss
  • Note: a convex function of M & D

  25. Multi-class Unsupervised SVM?
  • Get a constrained optimization problem: solve for M and D
  • Plus the class-balance constraint

  26. Multi-class Unsupervised SVM?
  • Relax the indicator variables to obtain a convex optimization problem: solve for M and D

  27. Multi-class Unsupervised SVM?
  • Relax the indicator variables to obtain a convex optimization problem: solve for M and D
  • A semidefinite program

  28. Experimental Results [Figure: comparison of the semidefinite relaxation, spectral clustering, and k-means]

  29. Experimental Results

  30. Experimental Results [Table: percentage of misclassification errors, digit dataset]

  31. Extension to Semi-Supervised Algorithm
  Matrix M:

  32. Experimental Results [Table: percentage of misclassification errors, face dataset]

  33. Experimental Results

  34. Discriminative, Unsupervised, Convex HMMs Joint work with Linli Xu With help from Li Cheng and Tao Wang

  35. Hidden Markov Model
  • Joint probability model over "hidden" states and observations
  • Viterbi classifier
  • Must coordinate the local classifiers
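The Viterbi classifier on this slide decodes the most probable state sequence given the observations. A minimal log-space sketch (toy parameters assumed, not from the talk):

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most probable hidden-state sequence of an HMM (log-space).

    log_pi: (S,)   log initial state probabilities
    log_A:  (S, S) log transitions, A[i, j] = log p(s'=j | s=i)
    log_B:  (S, V) log emissions,  B[i, o] = log p(o | s=i)
    obs:    list of observation indices
    """
    T, S = len(obs), len(log_pi)
    delta = np.empty((T, S))            # best log-score ending in each state
    back = np.zeros((T, S), dtype=int)  # argmax predecessor for each state
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A      # (prev, cur) transition scores
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):       # trace the best path backwards
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy 2-state HMM: state 0 mostly emits symbol 0, state 1 mostly emits symbol 1.
pi = np.log([0.5, 0.5])
A = np.log([[0.8, 0.2], [0.2, 0.8]])
B = np.log([[0.9, 0.1], [0.1, 0.9]])
print(viterbi(pi, A, B, [0, 0, 1, 1]))  # -> [0, 0, 1, 1]
```

The "coordination" of local classifiers is visible in the transition term: each state's score depends on the best-scoring predecessor, not on the local emission alone.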

  36. HMM Training: Supervised
  • Given labeled sequences (x, y)
  • Maximum likelihood: models the input distribution
  • Conditional likelihood: discriminative (CRFs)

  37. HMM Training: Unsupervised
  • Given only the observations
  • Now what? EM!
  • EM maximizes the marginal likelihood: exactly the part we don't care about
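Unsupervised maximum likelihood (what EM optimizes) fits the marginal likelihood of the observations:

```latex
\log p(x;\theta) \;=\; \log \sum_{y} p(x, y;\theta),
```

i.e. the input distribution, which is exactly the part a discriminative classifier, needing only p(y | x), doesn't care about.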

  38. HMM Training: Unsupervised
  • Given only the observations
  The problem with EM:
  • Not convex
  • Wrong objective
  • Too popular
  • Doesn't work

  39. HMM Training: Unsupervised
  • Given only the observations
  The dream:
  • Convex training
  • Discriminative training
  When will someone invent unsupervised CRFs?

  40. HMM Training: Unsupervised
  • Given only the observations
  The question:
  • How to learn effectively without seeing any y's?

  41. HMM Training: Unsupervised
  • Given only the observations
  The question:
  • How to learn effectively without seeing any y's?
  The answer:
  • That's what we already did! → Unsupervised SVMs

  42. HMM Training: Unsupervised
  • Given only the observations
  The plan:
  • Supervised: single-example SVM → sequence M3N
  • Unsupervised: unsupervised SVM → ?

  43. M3N: Max Margin Markov Nets
  • Relational SVMs
  • Supervised training: given labeled sequences (x, y), solve a factored QP

  44. Unsupervised M3Ns
  Strategy:
  • Start with the supervised M3N QP
  • y-labels → re-express in local M, D equivalence relations
  • Impose class balance
  • Relax the non-convex constraints
  • Then solve a really big SDP (but still polynomial size)

  45. Unsupervised M3Ns • SDP

  46. Some Initial Results
  • Synthetic HMM
  • Protein secondary structure prediction

  47. Current Research Group
  • PhD Tao Wang: reinforcement learning
  • PhD Dana Wilkinson: action-based embedding
  • PhD Feng Jiao: bioinformatics
  • PhD Qin Wang: statistical natural language
  • PhD Dan Lizotte: optimization, everything
  • PDF Li Cheng: computer vision
