
3 Small Comments Alex Berg Stony Brook University


Presentation Transcript


  1. 3 Small Comments. Alex Berg, Stony Brook University. I work on recognition: features – action recognition – alignment – detection – attributes – hierarchical image classification & retrieval + machine learning for large scale recognition. Collaborations: words & pictures – ImageNet – human visual search – neural coding of visual memory.

  2. 3 Small Comments
     0. Good computer vision tells us something about the structure of the visual world, or about our descriptions of the visual world.
     1. We should know what we are recognizing, and be able to prove it!
     2. Large datasets can be gallant but doomed attempts to avoid hard representation problems. (Small datasets are worse.)
     3. We should try to understand uncertainty in computer vision.

  3. 0. Good computer vision tells us something about the structure of the visual world. The structure of the visual world is complex. So are the ways we describe it. Recognition experiments probe mappings from samples of the visual world to descriptions.

  4. 1. We should know what we are recognizing, and be able to prove it! This is a PR problem! Advertise our successes and abilities, but avoid over-selling. Example: given training and evaluation of an X detector on my own dataset:
     • Ideal: If your dataset has Xs, then my algorithm can detect them.
     • Great: If people can detect Xs in your dataset, then my algorithm will detect them.
     • Good: I can predict performance on your dataset based on a statistical characterization.
     • Okay: I can very roughly predict performance on your dataset based on a characterization.
     • Now: If I were to run my algorithm on your dataset, we could determine performance.
     • Bad: I don’t know how it works on my dataset!
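One way to picture the "Good" level above is a minimal sketch that measures accuracy separately within each stratum of some characterization, then reweights those accuracies by how often each stratum occurs in the other dataset. Everything here is an assumption made for illustration and does not come from the talk: the strata names (`large`, `small`), the toy records, the helper names, and above all the premise that accuracy within a stratum carries over from one dataset to another.

```python
from collections import Counter


def per_stratum_accuracy(records):
    """Accuracy per stratum, from (stratum, correct) pairs on my own dataset."""
    totals, hits = Counter(), Counter()
    for stratum, correct in records:
        totals[stratum] += 1
        hits[stratum] += int(correct)
    return {s: hits[s] / totals[s] for s in totals}


def predict_accuracy(stratum_accuracy, target_strata):
    """Predict accuracy on another dataset from its stratum frequencies.

    Assumes (hypothetically) that accuracy within a stratum transfers
    unchanged between datasets.
    """
    counts = Counter(target_strata)
    unknown = set(counts) - set(stratum_accuracy)
    if unknown:
        raise ValueError(f"no accuracy estimate for strata: {sorted(unknown)}")
    n = sum(counts.values())
    return sum(stratum_accuracy[s] * c / n for s, c in counts.items())


# Made-up evaluation records: large objects are detected more reliably.
mine = (
    [("large", True)] * 9 + [("large", False)] * 1
    + [("small", True)] * 4 + [("small", False)] * 6
)
acc = per_stratum_accuracy(mine)

# Made-up characterization of "your" dataset: mostly small objects.
yours = ["large"] * 2 + ["small"] * 8
pred = predict_accuracy(acc, yours)
print(f"predicted accuracy on your dataset: {pred:.2f}")
```

The prediction is only as good as the characterization: if the strata miss whatever actually drives errors, the estimate falls back toward the "Okay" or "Now" levels.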

  5. Describable Visual Attributes for Face Verification and Image Search. N. Kumar, A.C. Berg, P.N. Belhumeur, S.K. Nayar (T.PAMI 2011). [Figure: verification classifier]

  6. LFW Results (in 2009)

  7. Describable Visual Attributes for Face Verification and Image Search. N. Kumar, A.C. Berg, P.N. Belhumeur, S.K. Nayar. [Figure: verification classifier]

  8. 2. Large datasets can be gallant but doomed attempts to avoid hard representation problems. (Small datasets are worse!)
     • Efficient Additive Models for Detection & Classification, w/ Subhransu Maji @ UCB -> TTI-C
     • [Figure labels: Large Dataset, Test Image]
     • w/ Li Fei-Fei & Jia Deng @ Stanford: Large Scale Recognition Challenge going on now (part of PASCAL VOC). Image Classification + Object Detection!
     Might hope that matching/classifying a whole image / pattern / patch @ large scale “just works” for recognition. There comes a point when it is necessary to go into more detail – this is the regime of mid-level vision. More data provides better joint statistics, but is enough only sometimes. There is some boost from looking at large data, but to do well we still need to address hard (mid-level) representation problems.

  9. 3. We should try to understand uncertainty in computer vision. Need an explicit idea of what is possible/likely given observations. We need this for low, mid, and high level vision. This is a difficult representational challenge – mainly because of complex structure.

  10. What does classifying more than 10,000 image categories tell us? J. Deng, A.C. Berg, K. Li, L. Fei-Fei (ECCV 2010). Correlation between CV classifier confusions and WordNet!
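A toy sketch of how one might check for this kind of relationship: compare off-diagonal confusion rates with the tree distance between classes in a semantic hierarchy. This is not the procedure from the ECCV 2010 paper. The four-class hierarchy standing in for WordNet, the confusion counts, and the choice of Pearson correlation are all hypothetical, picked only to keep the example small.

```python
import numpy as np

# Hypothetical toy hierarchy (child -> parent), a stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}
LEAVES = ["dog", "cat", "car", "truck"]


def path_to_root(node):
    """List of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges between a and b via their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for j, n in enumerate(path_to_root(b)):
        if n in steps_from_a:
            return steps_from_a[n] + j
    raise ValueError("nodes share no ancestor")


# Made-up confusion counts: rows are true classes, columns are predictions.
CONFUSION = np.array([
    [80, 15, 3, 2],
    [12, 82, 4, 2],
    [2, 3, 75, 20],
    [1, 4, 18, 77],
])


def confusion_vs_distance(confusion, leaves):
    """Pearson correlation between confusion rate and hierarchy distance."""
    rates = confusion / confusion.sum(axis=1, keepdims=True)
    conf, dist = [], []
    for i, a in enumerate(leaves):
        for j, b in enumerate(leaves):
            if i != j:  # only misclassifications count as confusions
                conf.append(rates[i, j])
                dist.append(tree_distance(a, b))
    return np.corrcoef(conf, dist)[0, 1]


r = confusion_vs_distance(CONFUSION, LEAVES)
print(f"Pearson r between confusion rate and tree distance: {r:.3f}")
```

With these invented counts the coefficient comes out negative, meaning classes that sit closer together in the hierarchy are confused more often; real data need not behave so cleanly.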

  11. 3 Small Comments
     0. Good computer vision tells us something about the structure of the visual world, or about our descriptions of the visual world.
     1. We should know what we are recognizing, and be able to prove it!
     2. Large datasets can be gallant but doomed attempts to avoid hard representation problems. (Small datasets are worse.)
     3. We should try to understand uncertainty in computer vision.
