
Graphical Models in Vision

Graphical Models in Vision . Alan L. Yuille. UCLA. Dept. Statistics. The Purpose of Vision. “To Know What is Where by Looking”. Aristotle. (384-322 BC). Information Processing: receive a signal by light rays and decode its information.


Presentation Transcript


  1. Graphical Models in Vision. Alan L. Yuille. UCLA. Dept. Statistics

  2. The Purpose of Vision. • “To Know What is Where by Looking”. Aristotle. (384-322 BC). • Information Processing: receive a signal by light rays and decode its information. • Vision appears deceptively simple, but there is more to Vision than meets the Eye.

  3. Ames Room

  4. Perspective.

  5. What are Humans Ideal for? • Clearly humans are not good at determining the size of objects in images – at least for these types of stimuli. • But they are good at determining context and taking contextual cues into account – i.e. use perspective cues to estimate depth and make adjustments. • What reasoning/statistical tasks are humans ideal for?

  6. Brightness of Patterns: Adelson (MIT)

  7. Visual Illusions • The perception of brightness of a surface, • or the length of a line, • depends on context. • Not on basic measurements like: • the number of photons that reach the eye • or the length of the line in the image.

  8. Vision is ill-posed. • Vision is ill-posed – the data in the retina is not sufficient to unambiguously determine the visual scene. • Vision is possible because we have prior knowledge about visual scenes. • Even simple perception is an act of creation.

  9. Perception as Inference • Helmholtz. 1821-1894. • “Perception as Unconscious Inference”.

  10. Ball in a Box. (D. Kersten)

  11. How Hard is Vision? • The Human Brain devotes an enormous amount of resources to vision. • (I) Optic nerve is the biggest nerve in the body. • (II) Roughly half of the neurons in the cortex are involved in vision (van Essen). • If intelligence is proportional to neural activity, then vision requires more intelligence than mathematics or chess.

  12. Vision and the Brain

  13. Half the Cortex does Vision

  14. Vision and Artificial Intelligence • The hardness of vision became clearer in the ’60s, when the Artificial Intelligence community tried to design computer programs to do vision. • AI workers thought that vision was “low-level” and easy. • Prof. Marvin Minsky (pioneer of AI) asked a student to solve vision as a summer project.

  15. Chess and Face Detection • Artificial Intelligence Community preferred Chess to Vision. • By the mid-90’s Chess programs could beat the world champion Kasparov. • But computers could not find faces in images.

  16. Man and Machine. • David Marr (1945-1980) • Three Levels of explanation: 1. Computation Level/Information Processing 2. Algorithmic Level 3. Hardware: Neurons versus silicon chips. Claim: Man and Machine are similar at Level 1.

  17. Vision: Decoding Images

  18. Vision as Probabilistic Inference • Represent the World by S. • Represent the Image by I. • Goal: decode I and infer S. • Model image formation by likelihood function, generative model, P(I|S) • Model our knowledge of the world by a prior P(S).

  19. Bayes’ Theorem • Then Bayes’ Theorem states we should infer the world S from I by • P(S|I) = P(I|S)P(S)/P(I). • Rev. T. Bayes. 1702-1761

  20. Bayes to Infer S from I • P(I|S): likelihood function. • P(S): prior.

  21. Ambiguity and Complexity of Images. • Similar objects give rise to very different images. Different objects can cause similar images.

  22. Ideal Observers • The image of a cylinder is consistent with multiple objects and viewpoints. • The likelihood is ambiguous (concave or convex). • The prior resolves the ambiguity by biasing towards convex objects viewed from above.
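
A rough numerical illustration of how a prior can resolve an ambiguous likelihood is sketched below. It applies P(S|I) = P(I|S)P(S)/P(I) to two candidate interpretations of an image. The probabilities are invented for illustration and do not come from the talk: the likelihood is taken to be equal for both interpretations (the image is ambiguous), and the prior is assumed to favour convex shapes.

```python
def posterior(likelihood, prior):
    """Return P(S|I) for each scene S, given P(I|S) and P(S) as dicts."""
    # Unnormalised posterior: P(I|S) * P(S)
    joint = {s: likelihood[s] * prior[s] for s in prior}
    # Evidence: P(I) = sum over S of P(I|S) * P(S)
    evidence = sum(joint.values())
    return {s: p / evidence for s, p in joint.items()}


# Hypothetical numbers, chosen only for illustration.
likelihood = {"convex": 0.5, "concave": 0.5}  # P(I|S): image fits both equally well
prior = {"convex": 0.8, "concave": 0.2}       # P(S): assumed bias towards convex shapes

post = posterior(likelihood, prior)
best = max(post, key=post.get)  # maximum a posteriori (MAP) interpretation
print(post, best)
```

Because the likelihood is flat in this toy setting, the posterior simply follows the prior, and the MAP interpretation is the convex one.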

  23. Influence Graphs and Visual Tasks • Influence Graphs and the Visual Task

  24. A Simple Taxonomy of Graphs • A Taxonomy of Graphs: B. C. D.

  25. Examples of Vision Tasks • Visual Inference: (1) Estimating Shape. (2) Segmenting Images. (3) Detecting Faces. (4) Detecting and Reading Text. (5) Parsing the full image – detect and recognize all objects in the image, understand the viewed scene.

  26. Segmentation (Level Sets)

  27. Segmentation (Level Sets)

  28. Analysis by Synthesis • Invert generation process to parse the image. • Probabilistic Grammars for image generation (week 2).
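
The idea of inverting a generation process can be sketched on a toy problem. The example below is not the method used in the talk; it is a minimal sketch under stated assumptions: the "scene" is a bright bar in a 1-D image described by a hypothetical pair (start, width), the generative model is the function `render`, and parsing is an exhaustive search for the scene whose synthesized image best matches the observed one. Minimising squared error here corresponds to maximising a Gaussian likelihood with a flat prior over scenes.

```python
N = 16  # number of pixels in the toy 1-D image


def render(scene):
    """Generative model: synthesize an image from a scene (start, width)."""
    start, width = scene
    return [1.0 if start <= i < start + width else 0.0 for i in range(N)]


def squared_error(image_a, image_b):
    """Sum of squared pixel differences between two images."""
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b))


def parse(image):
    """Analysis by synthesis: pick the scene whose rendering best matches the image."""
    candidates = [(start, width)
                  for start in range(N)
                  for width in range(1, N - start + 1)]
    return min(candidates, key=lambda s: squared_error(render(s), image))


true_scene = (5, 4)
# Fixed small perturbation standing in for sensor noise (illustrative only).
perturbation = [0.1 if i % 2 == 0 else -0.1 for i in range(N)]
observed = [p + e for p, e in zip(render(true_scene), perturbation)]

estimate = parse(observed)
print(estimate)
```

Exhaustive search is only feasible because the toy scene space is tiny; for realistic images the space of scenes is far too large to enumerate.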

  29. Probabilistic Grammars for Images • (I) Images are generated by composing visual patterns. • (II) Parse an image by decomposing it into patterns.
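
Step (I), generating by composition, can be illustrated with a toy probabilistic grammar. The sketch below is an assumption-laden example, not the grammar from the talk: the symbols (`Scene`, `Object`, `Face`, `Text`), the terminal patterns, and all rule probabilities are hypothetical. Each symbol is expanded by sampling one of its rules according to its probability, until only terminal patterns remain. Step (II), parsing, is not shown.

```python
import random

# Hypothetical toy grammar: each symbol maps to a list of (probability, expansion).
GRAMMAR = {
    "Scene":  [(0.6, ["Object"]), (0.4, ["Object", "Scene"])],
    "Object": [(0.5, ["Face"]), (0.5, ["Text"])],
    "Face":   [(1.0, ["eye", "eye", "nose", "mouth"])],
    "Text":   [(0.7, ["letter"]), (0.3, ["letter", "Text"])],
}


def generate(symbol, rng):
    """Expand a symbol recursively; symbols without a rule are terminal patterns."""
    if symbol not in GRAMMAR:
        return [symbol]
    rules = GRAMMAR[symbol]
    weights = [p for p, _ in rules]
    # Sample one rule in proportion to its probability.
    _, expansion = rng.choices(rules, weights=weights, k=1)[0]
    patterns = []
    for child in expansion:
        patterns.extend(generate(child, rng))
    return patterns


rng = random.Random(0)  # seeded so that repeated runs give the same sample
sample = generate("Scene", rng)
print(sample)
```

Each call returns a flat list of terminal patterns; keeping the tree of rule choices instead would give the parse structure that generated them.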

  30. Generative Models for Patterns • Examples of images synthesized from generative models (MCMC).

  31. Shape Inference

  32. Face and Text Detection.

  33. Text Detection

  34. Towards Full Image Parsing • The image genome project (Zhu). • Attempt to determine the grammar for images by interactive parsing of images. • Thereby learn the statistical regularities of images – the priors and the representations.

  35. Parse graph with horizontal relations

  36. Example: street scene

  37. Database

  38. Back to the Brain • Top-Level: compare human performance to Ideal Observers. • Explain human perceptual biases (visual illusions) as strategies that are “statistically effective”.

  39. Brain Architecture • The Bayesian models have interesting analogies to the brain. • Generative models and analysis by synthesis. • This is consistent with top-down processing? (Kersten’s talk next week).

  40. Conclusion • Vision is unconscious inference. • The Bayesian Approach leads to vision as analysis by synthesis -- inverting the image generation process. • This requires “sophisticated” priors about the statistics of natural images. • This can be formulated mathematically in terms of Probabilistic Grammars for image formation. • These grammars can be learnt by analysing the “sophisticated” statistics of natural images.
