
Self-Paced Learning for Semantic Segmentation


Presentation Transcript


  1. Self-Paced Learning for Semantic Segmentation M. Pawan Kumar

  2. Self-Paced Learning for Latent Structural SVM M. Pawan Kumar Benjamin Packer Daphne Koller

  3. Aim To learn accurate parameters for latent structural SVM. Input x; Output y ∈ Y; Hidden variable h ∈ H. "Deer". Y = {"Bison", "Deer", "Elephant", "Giraffe", "Llama", "Rhino"}

  4. Aim To learn accurate parameters for latent structural SVM. Feature Φ(x,y,h) (HOG, BoW); Parameters w. Prediction: (y*, h*) = argmax_{y∈Y, h∈H} wᵀΦ(x,y,h)
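
A minimal sketch of this prediction rule, assuming the label set Y_set and latent set H_set are small enough to enumerate and that phi(x, y, h) returns a NumPy feature vector (the names and signature are illustrative, not from the slides):

```python
import numpy as np

def predict(w, x, Y_set, H_set, phi):
    """Return (y*, h*) = argmax over y in Y_set, h in H_set of w^T phi(x, y, h)."""
    return max(((y, h) for y in Y_set for h in H_set),
               key=lambda yh: float(w @ phi(x, *yh)))
```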

  5. Motivation "Math is for losers!!" Real Numbers, Imaginary Numbers, e^{iπ} + 1 = 0. FAILURE … BAD LOCAL MINIMUM

  6. Motivation "Euler was a Genius!!" Real Numbers, Imaginary Numbers, e^{iπ} + 1 = 0. SUCCESS … GOOD LOCAL MINIMUM

  7. Motivation Start with "easy" examples, then consider "hard" ones. Simultaneously estimate easiness and parameters. Easiness is a property of data sets, not single instances. Annotating easy vs. hard by hand is expensive, and easy for a human ≠ easy for the machine.

  8. Outline • Latent Structural SVM • Concave-Convex Procedure • Self-Paced Learning • Experiments

  9. Latent Structural SVM Felzenszwalb et al., 2008; Yu and Joachims, 2009. Training samples xi; ground-truth labels yi; loss function Δ(yi, yi(w), hi(w))

  10. Latent Structural SVM
      (yi(w), hi(w)) = argmax_{y∈Y, h∈H} wᵀΦ(xi, y, h)
      min ||w||² + C ∑i Δ(yi, yi(w), hi(w))
      Non-convex objective → minimize an upper bound

  11. Latent Structural SVM
      (yi(w), hi(w)) = argmax_{y∈Y, h∈H} wᵀΦ(xi, y, h)
      min ||w||² + C ∑i ξi
      s.t. max_{hi∈H} wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
      Still non-convex, but a difference of convex functions → CCCP algorithm converges to a local minimum
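
The "difference of convex" claim can be made explicit (this split is the standard CCCP reading of the upper bound; it is not spelled out on the slide). Eliminating the slacks, the bound equals

      [ ||w||² + C ∑i max_{y,h} ( wᵀΦ(xi, y, h) + Δ(yi, y, h) ) ]  −  [ C ∑i max_{h} wᵀΦ(xi, yi, h) ]

Both bracketed terms are convex in w (pointwise maxima of linear functions), so subtracting the second gives a convex-plus-concave objective, exactly the form CCCP can handle.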

  12. Outline • Latent Structural SVM • Concave-Convex Procedure • Self-Paced Learning • Experiments

  13. Concave-Convex Procedure
      Start with an initial estimate w0.
      Update hi = argmax_{h∈H} wtᵀΦ(xi, yi, h)
      Update wt+1 by solving the convex problem:
      min ||w||² + C ∑i ξi
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
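
A minimal sketch of this CCCP loop, assuming enumerable Y_set and H_set, a generic joint feature map phi(x, y, h) of length dim, and a subgradient approximation to the inner convex problem in place of a full cutting-plane structural-SVM solver (all names and the solver choice are assumptions, not from the slides):

```python
import numpy as np

def cccp_latent_ssvm(X, Y_true, phi, Y_set, H_set, dim,
                     C=1.0, outer_iters=10, inner_iters=50, lr=1e-2):
    def delta(y_true, y):                     # 0/1 loss, as in the experiments
        return 0.0 if y == y_true else 1.0

    w = np.zeros(dim)                         # initial estimate w0
    for t in range(outer_iters):
        # Step 1: impute latent variables, h_i = argmax_h w^T phi(x_i, y_i, h)
        H_imp = [max(H_set, key=lambda h: w @ phi(x, y, h))
                 for x, y in zip(X, Y_true)]

        # Step 2: approximately solve the convex problem
        #   min ||w||^2 + C sum_i xi_i subject to the margin constraints
        for _ in range(inner_iters):
            grad = 2.0 * w
            for x, y_i, h_i in zip(X, Y_true, H_imp):
                # loss-augmented inference over (y, h)
                y_hat, h_hat = max(((y, h) for y in Y_set for h in H_set),
                                   key=lambda yh: w @ phi(x, *yh) + delta(y_i, yh[0]))
                slack = (w @ phi(x, y_hat, h_hat) + delta(y_i, y_hat)
                         - w @ phi(x, y_i, h_i))
                if slack > 0:
                    grad += C * (phi(x, y_hat, h_hat) - phi(x, y_i, h_i))
            w -= lr * grad
    return w
```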

  14. Concave-Convex Procedure Looks at all samples simultaneously. "Hard" samples will cause confusion. Start with "easy" samples, then consider "hard" ones.

  15. Outline • Latent Structural SVM • Concave-Convex Procedure • Self-Paced Learning • Experiments

  16. Self-Paced Learning REMINDER: Simultaneously estimate easiness and parameters. Easiness is a property of data sets, not single instances.

  17. Self-Paced Learning
      Start with an initial estimate w0.
      Update hi = argmax_{h∈H} wtᵀΦ(xi, yi, h)
      Update wt+1 by solving the convex problem:
      min ||w||² + C ∑i ξi
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h

  18. Self-Paced Learning
      min ||w||² + C ∑i ξi
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h

  19. Self-Paced Learning
      vi ∈ {0, 1}
      min ||w||² + C ∑i vi ξi
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
      Trivial solution: set all vi = 0

  20. Self-Paced Learning
      vi ∈ {0, 1}
      min ||w||² + C ∑i vi ξi − ∑i vi / K
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
      (figures: samples selected for large K, medium K, and small K)

  21. Self-Paced Learning
      Biconvex problem → solved by Alternating Convex Search
      vi ∈ [0, 1]
      min ||w||² + C ∑i vi ξi − ∑i vi / K
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
      (figures: samples selected for large K, medium K, and small K)
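
For a fixed w the relaxed objective is linear in each vi, so the v-step of Alternating Convex Search has a closed form (the threshold below follows directly from the objective; it is not written on the slide):

      vi = 1 if C ξi < 1/K, and vi = 0 otherwise,

i.e. a sample is selected as "easy" exactly when its slack is small compared with 1/(CK); decreasing K raises the threshold and lets harder samples enter.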

  22. Self-Paced Learning
      Start with an initial estimate w0.
      Update hi = argmax_{h∈H} wtᵀΦ(xi, yi, h)
      Update wt+1 by solving the convex problem:
      min ||w||² + C ∑i vi ξi − ∑i vi / K
      s.t. wᵀΦ(xi, yi, hi) − wᵀΦ(xi, y, h) ≥ Δ(yi, y, h) − ξi  for all y, h
      Decrease K ← K/μ (μ > 1)
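
A minimal sketch of the full self-paced loop, under the same assumptions as the CCCP sketch above (enumerable Y_set/H_set, a hypothetical feature map phi, and subgradient descent standing in for the convex w-step); the v-step uses the closed-form threshold C·ξi < 1/K noted above:

```python
import numpy as np

def self_paced_latent_ssvm(X, Y_true, phi, Y_set, H_set, dim,
                           C=1.0, K0=1.0, mu=1.3, outer_iters=10,
                           acs_iters=5, inner_iters=50, lr=1e-2):
    def delta(y_true, y):
        return 0.0 if y == y_true else 1.0

    def slack(w, x, y_i, h_i):
        # xi_i = max_{y,h} [w.phi(x,y,h) + delta(y_i,y)] - w.phi(x,y_i,h_i)
        best = max(w @ phi(x, y, h) + delta(y_i, y)
                   for y in Y_set for h in H_set)
        return max(0.0, float(best - w @ phi(x, y_i, h_i)))

    w, K = np.zeros(dim), K0
    for t in range(outer_iters):
        # impute latent variables with the current w (same step as CCCP)
        H_imp = [max(H_set, key=lambda h: w @ phi(x, y, h))
                 for x, y in zip(X, Y_true)]

        # alternating convex search over (w, v)
        for _ in range(acs_iters):
            # v-step: v_i = 1 iff C * xi_i < 1/K (sample is "easy" enough)
            v = np.array([1.0 if C * slack(w, x, y, h) < 1.0 / K else 0.0
                          for x, y, h in zip(X, Y_true, H_imp)])
            # w-step: subgradient descent using only the selected samples
            for _ in range(inner_iters):
                grad = 2.0 * w
                for v_i, x, y_i, h_i in zip(v, X, Y_true, H_imp):
                    if v_i == 0.0:
                        continue
                    y_hat, h_hat = max(((y, h) for y in Y_set for h in H_set),
                                       key=lambda yh: w @ phi(x, *yh) + delta(y_i, yh[0]))
                    if w @ phi(x, y_hat, h_hat) + delta(y_i, y_hat) > w @ phi(x, y_i, h_i):
                        grad += C * (phi(x, y_hat, h_hat) - phi(x, y_i, h_i))
                w -= lr * grad

        K /= mu   # decrease K so that harder samples are admitted next round
    return w
```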

  23. Outline • Latent Structural SVM • Concave-Convex Procedure • Self-Paced Learning • Experiments

  24. Object Detection Input x: image. Output y ∈ Y. Latent h: bounding box. Δ: 0/1 loss. Y = {"Bison", "Deer", "Elephant", "Giraffe", "Llama", "Rhino"}. Feature Φ(x,y,h): HOG
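
One plausible joint feature map for this setup, shown only as an illustration: the HOG descriptor of the latent box h is copied into the block of the feature vector belonging to class y, so wᵀΦ(x, y, h) scores class y's template against box h. The per-class block layout and the hog_of_box extractor are assumptions, not taken from the slides:

```python
import numpy as np

def phi_detection(x, y, h, classes, hog_of_box, hog_dim):
    # hog_of_box(x, h) is a stand-in for a real HOG extractor applied to box h
    feat = np.zeros(len(classes) * hog_dim)
    k = classes.index(y)
    feat[k * hog_dim:(k + 1) * hog_dim] = hog_of_box(x, h)
    return feat
```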

  25. Object Detection Mammals Dataset 271 images, 6 classes 90/10 train/test split 4 folds

  26.–29. Object Detection (image slides comparing Self-Paced learning and CCCP results)

  30. Object Detection (plots: objective value and test error)

  31. Handwritten Digit Recognition Input x: image. Output y ∈ Y. Latent h: rotation. Δ: 0/1 loss. MNIST dataset, Y = {0, 1, …, 9}. Feature Φ(x,y,h): PCA + projection
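
A hedged sketch of how the latent rotation might be handled in this setup: rotate the image by the candidate angle h, project onto a PCA basis, place the projection in digit y's block, and pick the best angle by enumeration. The angle grid, PCA basis P, and block layout are illustrative assumptions, not details from the slides:

```python
import numpy as np
from scipy.ndimage import rotate

def phi_digits(x, y, h, P, n_classes=10):
    # x: 2-D image array; P: PCA basis with principal components as columns
    z = rotate(x, angle=h, reshape=False).ravel() @ P
    feat = np.zeros(n_classes * P.shape[1])
    feat[y * P.shape[1]:(y + 1) * P.shape[1]] = z
    return feat

def best_rotation(w, x, y, P, angles=np.arange(-60, 61, 10)):
    # latent inference: h* = argmax_h w^T phi(x, y, h) over a discrete grid
    return max(angles, key=lambda h: float(w @ phi_digits(x, y, h, P)))
```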

  32.–35. Handwritten Digit Recognition (result plots comparing SPL and CCCP; statistically significant differences are marked)

  36. Motif Finding Input x: DNA sequence. Output y ∈ Y, Y = {0, 1}. Latent h: motif location. Δ: 0/1 loss. Feature Φ(x,y,h): Ng and Cardie, ACL 2002

  37. Motif Finding UniPROBE Dataset 40,000 sequences 50/50 train/test split 5 folds

  38.–40. Motif Finding (plots: average Hamming distance of the inferred motifs, SPL objective value, and SPL test error)

  41. Noun Phrase Coreference Input x: nouns. Output y: clustering. Latent h: spanning forest over nouns. Feature Φ(x,y,h): Yu and Joachims, ICML 2009

  42. Noun Phrase Coreference MUC6 Dataset 60 documents 1 predefined fold 50/50 train/test split

  43. Noun Phrase Coreference (results table: MITRE loss and pairwise loss; significant improvements and decrements are marked)

  44.–45. Noun Phrase Coreference (plots: SPL MITRE loss and SPL pairwise loss)

  46. Summary • Automatic Self-Paced Learning • Concave-Biconvex Procedure • Generalization to other latent variable models, e.g. Expectation-Maximization: the E-step remains the same, and the M-step includes the indicator variables vi. Kumar, Packer and Koller, NIPS 2010
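
A hedged sketch of the EM generalization mentioned here, using a 1-D Gaussian mixture as a stand-in model: the E-step is the usual one, and the M-step keeps only samples whose negative log-likelihood is below 1/K, mirroring the vi selection used for the latent structural SVM. All modelling details below are illustrative assumptions, not from the slides:

```python
import numpy as np

def self_paced_em_step(X, means, variances, weights, K):
    X = np.asarray(X, dtype=float)
    # E-step: posterior responsibilities (unchanged by self-paced learning)
    dens = np.stack([w * np.exp(-(X - m) ** 2 / (2 * s)) / np.sqrt(2 * np.pi * s)
                     for m, s, w in zip(means, variances, weights)], axis=1)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # self-paced selection: v_i = 1 iff -log p(x_i) < 1/K
    v = (-np.log(dens.sum(axis=1)) < 1.0 / K).astype(float)
    # M-step restricted to the selected ("easy") samples
    nk = (v[:, None] * resp).sum(axis=0) + 1e-12
    new_means = (v[:, None] * resp * X[:, None]).sum(axis=0) / nk
    new_vars = (v[:, None] * resp * (X[:, None] - new_means) ** 2).sum(axis=0) / nk
    new_weights = nk / max(v.sum(), 1.0)
    return new_means, new_vars, new_weights, v
```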
