
Visual Recognition With Humans in the Loop


Presentation Transcript


  1. Visual Recognition With Humans in the Loop. ECCV 2010, Crete, Greece. Steve Branson, Catherine Wah, Florian Schroff, Boris Babenko, Serge Belongie, Peter Welinder, Pietro Perona

  2. What type of bird is this?

  3. Field Guide …? What type of bird is this?

  4. Computer Vision ? What type of bird is this?

  5. Computer Vision Bird? What type of bird is this?

  6. Computer Vision Chair? Bottle? What type of bird is this?

  7. Field guides are difficult for average users • Computer vision doesn’t work perfectly (yet) • Research is mostly on basic-level categories (image: Parakeet Auklet)

  8. Visual Recognition With Humans in the Loop What kind of bird is this? Parakeet Auklet

  9. Levels of Categorization Basic-Level Categories • Airplane? Chair? • Bottle? … [Griffin et al. ‘07, Lazebnik et al. ‘06, Grauman et al. ‘06, Everingham et al. ‘06, Felzenszwalb et al. ‘08, Viola et al. ‘01, …]

  10. Levels of Categorization Subordinate Categories • American Goldfinch? • Indigo Bunting? … [Belhumeur et al. ‘08 , Nilsback et al. ’08, …]

  11. Levels of Categorization Parts and Attributes • Yellow Belly? • Blue Belly?… [Farhadi et al. ‘09, Lampert et al. ’09, Kumar et al. ‘09]

  12. Visual 20 Questions Game • Blue Belly? no • Cone-shaped Beak? yes • Striped Wing? yes • American Goldfinch? yes Hard classification problems can be turned into a sequence of easy ones

  13. Recognition With Humans in the Loop • Computers: reduce the number of required questions • Humans: drive up the accuracy of vision algorithms • Example exchange: computer vision → Cone-shaped Beak? yes → computer vision → American Goldfinch? yes

  14. Research Agenda • 2010 (heavy reliance on human assistance): Blue belly? no, Cone-shaped beak? yes, Striped Wing? yes, American Goldfinch? yes • 2015 (more automated, as computer vision improves): Striped Wing? yes, American Goldfinch? yes • 2025 (fully automatic): American Goldfinch? yes

  15–16. Field Guides www.whatbird.com

  17–22. Example Questions

  23. Basic Algorithm • Input image → computer vision → max expected information gain → Question 1: Is the belly black? A: NO → max expected information gain → Question 2: Is the bill hooked? A: YES → …

  24. Without Computer Vision • Input image → class prior (instead of computer vision) → max expected information gain → Question 1: Is the belly black? A: NO → max expected information gain → Question 2: Is the bill hooked? A: YES → …
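
Slides 23–24 describe the same interactive loop; the only difference is whether the class distribution starts from the computer-vision estimate p(c | x) or from a plain class prior. Below is a minimal Python sketch of that loop, under assumptions not stated on the slides: the stopping rule (confidence threshold and question budget) is a guess, and select_question, ask_user, and update_posterior are hypothetical callables (possible versions of the first and last appear after slides 25 and 27).

    import numpy as np

    def identify(class_prior, answer_models, ask_user, select_question, update_posterior,
                 confidence=0.95, max_questions=20):
        """Interactive identification loop (slides 23-24).
        class_prior:      shape (C,) starting class distribution: the computer-vision
                          estimate p(c | x), or a plain prior if vision is not used.
        answer_models:    dict question -> (C, A) matrix of p(answer | class).
        ask_user:         callable question -> answer index in 0..A-1.
        select_question:  callable (posterior, remaining answer_models) -> question.
        update_posterior: callable (posterior, p(answer | class) column) -> new posterior."""
        posterior = np.asarray(class_prior, dtype=float)
        remaining = dict(answer_models)                     # questions not yet asked
        asked = []
        while remaining and len(asked) < max_questions and posterior.max() < confidence:
            q = select_question(posterior, remaining)       # most informative next question
            a = ask_user(q)                                 # user's answer for question q
            posterior = update_posterior(posterior, remaining[q][:, a])  # Bayes update
            asked.append((q, a))
            del remaining[q]
        return int(np.argmax(posterior)), posterior, asked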

  25. Basic Algorithm • Select the next question that maximizes expected information gain • Easy to compute if we can estimate probabilities of the form p(c | x, u_1, …, u_t), where c is the object class, x is the image, and u_1, …, u_t is the sequence of user responses
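
One way to implement the selection step on slide 25 is to score each remaining question by the expected drop in entropy of the class posterior and pick the largest. The sketch below assumes each question is summarized by a (classes × answers) matrix of user-response probabilities, and can serve as the select_question callable in the loop sketch above.

    import numpy as np

    def entropy(p):
        """Shannon entropy (base 2) of a discrete distribution; 0*log 0 treated as 0."""
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    def expected_information_gain(posterior, answer_model):
        """Expected entropy reduction from asking one question.
        posterior:    shape (C,), current p(c | x, u_1..t).
        answer_model: shape (C, A), p(answer a | class c) for this question."""
        p_answer = posterior @ answer_model                   # p(a) = sum_c p(a | c) p(c)
        gain = entropy(posterior)
        for a, pa in enumerate(p_answer):
            if pa > 0:
                post_a = posterior * answer_model[:, a] / pa  # p(c | a) by Bayes' rule
                gain -= pa * entropy(post_a)
        return gain

    def select_question(posterior, answer_models):
        """Pick the remaining question (dict question -> (C, A) matrix) with the
        largest expected information gain."""
        return max(answer_models,
                   key=lambda q: expected_information_gain(posterior, answer_models[q]))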

  26–27. Basic Algorithm • By Bayes’ rule, p(c | x, u_1, …, u_t) = p(u_1, …, u_t | c) · p(c | x) / Z, where Z is a normalization factor, p(u_1, …, u_t | c) is the model of user responses, and p(c | x) is the computer vision estimate
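
Because user responses are modeled as independent of the image (and of each other) given the class, the posterior on slides 26–27 can be maintained incrementally: each new answer multiplies the current belief by p(answer | c) and renormalizes. A small sketch, usable as the update_posterior callable in the loop sketch above:

    import numpy as np

    def update_posterior(posterior, answer_probs):
        """One Bayes-rule step:  p(c | x, u_1..t)  ∝  p(u_t | c) * p(c | x, u_1..t-1).
        posterior:    shape (C,), current class probabilities (initialised to p(c | x)).
        answer_probs: shape (C,), p(observed answer | c), i.e. the model of user
                      responses for the question just asked.
        Dividing by the sum is the normalization factor on the slide."""
        unnormalized = np.asarray(posterior, dtype=float) * np.asarray(answer_probs, dtype=float)
        z = unnormalized.sum()
        if z == 0:                 # answer ruled out every class; keep the previous belief
            return np.asarray(posterior, dtype=float)
        return unnormalized / z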

  28. Modeling User Responses • Assume user responses depend only on the true class (independent of the image and of each other given the class), so we model p(u_i | c) • Estimate these distributions using Mechanical Turk • Example: Pine Grosbeak, “What is the color of the belly?” — answer histograms over grey, red, black, white, brown, blue at three confidence levels (Definitely, Probably, Guessing)
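
A sketch of how the per-class answer distributions for one question might be estimated from Mechanical Turk responses. The Laplace smoothing constant and the flat treatment of the Definitely/Probably/Guessing confidence levels are assumptions here, not the paper’s exact procedure.

    import numpy as np

    def estimate_answer_model(mturk_responses, classes, answers, alpha=1.0):
        """Estimate p(answer | class) for one question from MTurk labels.
        mturk_responses: iterable of (true_class, answer) pairs for this question,
                         e.g. ("Pine Grosbeak", "grey").
        classes, answers: full lists of class names and possible answers.
        alpha: Laplace smoothing count so unseen answers keep non-zero probability.
        Returns a (C, A) matrix usable as an answer_model in the sketches above."""
        c_index = {c: i for i, c in enumerate(classes)}
        a_index = {a: i for i, a in enumerate(answers)}
        counts = np.full((len(classes), len(answers)), float(alpha))
        for true_class, answer in mturk_responses:
            counts[c_index[true_class], a_index[answer]] += 1.0
        return counts / counts.sum(axis=1, keepdims=True)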

  29. Incorporating Computer Vision • Use any recognition algorithm that can estimate p(c | x) • We experimented with two simple methods: 1-vs-all SVM and attribute-based classification [Lampert et al. ’09, Farhadi et al. ‘09]

  30. Incorporating Computer Vision • Used VLFeat and MKL code plus color features • Features: color histograms, color layout, geometric blur, self-similarity, color SIFT, SIFT • Combined with multiple kernels, bag of words, spatial pyramid [Vedaldi et al. ’08, Vedaldi et al. ’09]
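
For the p(c | x) term, the sketch below is only a rough stand-in for the 1-vs-all SVM option on slide 29: it uses scikit-learn’s LinearSVC over precomputed feature vectors and a softmax over the decision scores as one simple probability calibration. The paper’s actual pipeline used VLFeat features with multiple-kernel learning, and its score-to-probability mapping may differ.

    import numpy as np
    from sklearn.svm import LinearSVC   # stand-in classifier, not the paper's MKL pipeline

    def train_vision_model(features, labels):
        """Fit a 1-vs-all linear SVM over precomputed image feature vectors."""
        clf = LinearSVC(C=1.0)
        clf.fit(features, labels)
        return clf

    def class_probabilities(clf, feature_vector):
        """Map the per-class decision scores for one image to an estimate of p(c | x)
        using a softmax (a simple calibration choice, not necessarily the paper's)."""
        scores = clf.decision_function(np.asarray(feature_vector).reshape(1, -1)).ravel()
        scores -= scores.max()              # subtract max for numerical stability
        probs = np.exp(scores)
        return probs / probs.sum()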

  31–33. Birds 200 Dataset • 200 classes, 6,000+ images, 288 binary attributes • Why birds? Example species: Parakeet Auklet, Field Sparrow, Vesper Sparrow, Black-footed Albatross, Groove-Billed Ani, Baird’s Sparrow, Henslow’s Sparrow, Arctic Tern, Forster’s Tern, Common Tern

  34. Results: Without Computer Vision Comparing Different User Models

  35. Results: Without Computer Vision • Perfect users: 100% accuracy in about log2(200) ≈ 8 questions, if users’ answers agree with field guides…

  36–37. Results: Without Computer Vision • Real users answer the questions • MTurkers don’t always agree with field guides…

  38. Results: Without Computer Vision Probabilistic User Model: tolerate imperfect user responses

  39. Results: With Computer Vision

  40. Results: With Computer Vision • Users drive up performance: 19% → 68% (computer vision alone: 19%)

  41. Results: With Computer Vision • Computer vision reduces manual labor: 11.1 → 6.5 questions

  42. Examples: Different Questions Asked With and Without Computer Vision • Western Grebe • Without computer vision, Q #1: Is the shape perching-like? no (Def.) • With computer vision, Q #1: Is the throat white? yes (Def.)

  43. Examples: User Input Helps Correct Computer Vision • Computer vision alone predicts Common Yellowthroat • Is the breast pattern solid? no (definitely) → corrected to Magnolia Warbler

  44. Recognition Is Not Always Successful • Even with unlimited questions, visually similar species are confused: Acadian Flycatcher vs. Least Flycatcher, Parakeet Auklet vs. Least Auklet • Example response: Is the belly multi-colored? yes (Def.)

  45–48. Summary • Recognition of fine-grained categories • Users drive up performance (19% → 68%) • Computer vision reduces manual labor (11.1 → 6.5 questions) • More reliable than field guides

  49. Future Work • Extend to domains other than birds • Methodologies for generating questions • Improve computer vision

  50. Questions? Project page and datasets available at: http://vision.caltech.edu/visipedia/ http://vision.ucsd.edu/project/visipedia/
