1 / 35

Object and Face Recognition: Lecture Notes

Object and Face Recognition: Lecture Notes . Introduction These notes are in 2 parts, roughly corresponding to (1) object recognition; and (2) face recognition.

mimis
Download Presentation

Object and Face Recognition: Lecture Notes

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Object and Face Recognition: Lecture Notes Introduction • These notes are in 2 parts, roughly corresponding to (1) object recognition; and (2) face recognition. • We'll actually make a smooth transition by discussing whether face recognition is a special kind of recognition, or whether it can be described in terms of more general abilities.

  2. 1. Early Models of Recognition • Template-matching. Early models of recognition were based on matching the stimulus pattern to a set of stored, pre-defined 'templates'. These models can be expanded by incorporating some pre-processing • Feature analysis. Instead of looking for a match to a standard shape, feature analysis emphasizes that what makes an 'A' an A is the letter's unique set of defining features

  3. 2. Failures of Matching-Models There are several key weaknesses of these early approaches to recognition: • 3 dimensions. The template-matching and feature analysis models were designed mostly to recognize alphanumeric characters • However, most real-world human recognition occurs on 3D objects, not 2D symbols • Superficial differences. It's hard for these models to 'extract' the basic similarity of repeat instances of an item while also representing superficial differences (consider handwriting as an example)

  4. Where to start? • These models don't have a clear means for breaking up complex input into a series of objects-to-be-identified; instead, they seem to rely on being 'fed' items (e.g., letters) to recognize • That seems utterly unlike human perception

  5. 3. Building Blocks of Recognition? • In order to simulate human recognition in a complex 3D world, more recent models of object recognition have relied on defining constraints (or "primitives") and basic strategies that the visual system might use

  6. Marr and Nishihara • Marr and Nishihara present a model of recognition, restricted to the set of objects that can be described as generalized cones-- objects with a clear main axis and a constant-shape cross section

  7. Recognition occurs in the following levels: • Single-model axis. The first step in the model is identification of the main axis of the object • Component axes. Then, the axes of each of the smaller sub-portions of the object are identified.[see steps 1 and 2] • 3D model match. Finally, a match between the arrangement of components and a stored 3D model description is performed to identify the object. [see step 3-- the catalog]

  8. Steps 1 & 2 of Marr and Nishihara’s theory:

  9. Step 3: The catalog (Marr & Nishihara)

  10. Problem • Although object comparisons are fastest if the main axis of an object is the same as the object it is being judged against, there is not strong data supporting the 'psychological reality' of the Marr & Nishihara model (although it might be a very good way to get computers to identify objects)

  11. Geons (Biederman) • According to geon theory, complex objects are made up of arrangements of basic, component parts ('geons') • Some evidence for the psychological reality of geons comes from experiments that show that recognition is impaired when we hide the intersections of geons • However, occluding these intersections also changes the picture in many other ways

  12. Object with geon intersections occluded

  13. Object with geon intersections revealed

  14. 4. View-Dependent Recognition? • An alternate theory of recognition is simply that you store a set of characteristic views (mental images) of each object • You recognize each object by finding the closest match • Note that this type of model does not require you to infer/recover 3D shape in order to identify objects

  15. Experimental Evidence • Bulthoff & Edelman trained people to learn the shapes of weird objects that they made up • They tested them on identification of the objects shown from (1) the same views as during training; and (2) different views • Performance was much better for the training (familiar) views

  16. 5. Perceptual Organization • We’ve switched from talking about how the visual system extracts important features of the environment-- edges, colors, motion, depth, etc-- to how the visual system identifies objects (which are made up of these key features) • The goal of understanding "perceptual organization" is to understand how we put together the basic features to see a coherent, organized world of things and surfaces

  17. Gestalt • The basic motto of the Gestaltists-- the first real "school" of perceptual organization theory-- was that the whole is greater than the sum of its parts • Examples: a melody made a various notes; or see an example from the pointillist painter Paul Signac

  18. Figure & Ground • The first step in organizing the perceptual world is to divide figure and ground; ambiguous figures demonstrate this • Notice that you can perceive either the white or dark portion of the picture as figure or ground, but that it's very difficult to perceive both as figure or both as ground simultaneously

  19. Figure & Ground

  20. Gestalt laws of organization: • Proximity. Elements that are close are often grouped together • Similarity. Similar elements tend to be grouped together • Common fate. Things that move together tend to group • Good continuation. Perceptual mechanisms tend to preserve smooth continuity in favor of abrupt edges • Closure. Of several possible perceptual organizations, ones yielding "closed" figures are more likely than those yielding "open" ones. • Orientation & symmetry. Objects oriented with horizontal and vetical axes, or ones that are symmetric, are more often perceived as figures

  21. Demonstration of proximity & similarity

  22. Demonstration of good continuation (your perception of the objects on the left) and not good continuation (right)

  23. Figure/ground changes with orientation and closure

  24. The law of Pragnanz: • "Of several geometrically possible organizations that one will actually occur that possesses the best, simplest, and most stable shape" (Koffka, 1935) • But what's the "best, simplest, most stable"? • Do you think that the Gestalt laws of organization provide a sufficient answer?

  25. Illusions • Sometimes, your visual system must group/segment an object, not based on edges in the retinal image, but by inferring parts of the missing boundaries • Consider illusory contours; and also note that such edgeless contours can occur in the real world, too

  26. Illusory contour. Note that there is no physical contour for most of the perceived "edge"

  27. Illusory contour in real life

  28. Impossible Figures • Certain figures can be drawn so that there is no consistent interpretation, due to the fact that there is no obvious real-world physical object that could correspond to the image • However, "impossible objects" can be built so that they look like an impossible figure from a certain vantage point

  29. Impossible figure

  30. Impossible object

  31. Impossible object, revealed

  32. Failures of Grouping • Some figures do not yield a consistent grouping, and so the visual system struggles to arrive at a solution, and ends up switching back and forth between several possible groupings

  33. Multiple possible groupings

More Related