
Rigid Structure from Video

Rigid Structure from Video, by Pedro M. Q. Aguiar. Outline: other methods and their limitations, proposed approach, problem formulation, algorithms, experiments. Motivation: segmentation of 2D rigid moving objects, inference of 3D rigid structure, content-based video representation.


Presentation Transcript


  1. Rigid Structure from Video Pedro M. Q. Aguiar

  2. Outline • Other methods - limitations • Proposed approach • Problem formulation • Algorithms • Experiments • Motivation • Segmentation of 2D rigid moving objects • Inference of 3D rigid structure

  3. Motivation • Content-based video representation (applications: compression, non-linear editing, virtual reality, etc.) • Video • Generative Video (GV) [Jasinschi & Moura, 95] • flat scenario • flat moving objects • PROBLEM: Segmentation of 2D rigid moving objects • 3D content-based representation • 3D rigid shape • 3D motion • PROBLEM: Inference of 3D rigid structure (shape and motion)

  4. Motion segmentation in low texture (with low texture, segmentation fails!) • Two-frame motion-based segmentation • No prior knowledge about shape, texture • [Diehl, 91] (time-consuming algorithms!) • Possible solution - smoothing • Statistical regularization [Dubuisson & Jain, 95] • Combine motion with other attributes [Bouthemy & François, 93] • Proposed approach - exploit rigidity over a set of frames • Explicit modeling of occlusion • Feasible implementation of MLE

  5. Observation model • Equation labels (the equation itself is missing from the transcript): background, camera window, camera position, camera position, object template (modeling of occlusion), object position, object texture, noise

  6. Maximum Likelihood estimation • Given • set of F frames • Estimate • background texture • object texture • object template • camera motion • object motion • ML cost function over all frames and pixels • ML estimate

  7. Minimization procedure • ML estimation: quadratic in O and B • Object and background estimates: • average of the observations, after registration (linear in T) • average of the observations, in the regions not occluded by the object (nonlinear in T) • Decouple the estimation of the position vectors • Motion is estimated on a frame-by-frame basis [Bergen et al, 92]
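The background estimate mentioned on this slide (an average of the observations over the regions not occluded by the object) can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the author's implementation: the camera is static (frames are already registered), object positions are integer pixel offsets, the binary template `T` is known, and the helper names `place_template` and `estimate_background` are hypothetical.

```python
import numpy as np

def place_template(T, pos, shape):
    """Return a full-frame binary mask with template T placed at top-left corner pos."""
    mask = np.zeros(shape)
    r, c = pos
    h, w = T.shape
    mask[r:r + h, c:c + w] = T
    return mask

def estimate_background(frames, T, positions):
    """Average each pixel over the frames in which the object does not occlude it."""
    num = np.zeros(frames[0].shape)
    den = np.zeros(frames[0].shape)
    for I, pos in zip(frames, positions):
        visible = 1.0 - place_template(T, pos, I.shape)  # 1 where background is seen
        num += visible * I
        den += visible
    # Pixels occluded in every frame have no observation; mark them as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

# Synthetic example (assumed data): a 2x2 object moving over a random background.
rng = np.random.default_rng(0)
B_true = rng.uniform(0, 1, size=(6, 8))
T = np.ones((2, 2))
positions = [(1, 1), (2, 3), (3, 5)]
frames = []
for pos in positions:
    m = place_template(T, pos, B_true.shape)
    frames.append((1 - m) * B_true + m * 0.5)  # object texture is a flat 0.5
B_hat = estimate_background(frames, T, positions)
```

Because the denominator counts how many frames see each background pixel, the estimate depends nonlinearly on the template, which is consistent with the "nonlinear in T" remark on the slide.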

  8. Minimization procedure - two-step iterative method • Replacing the object and background estimates in the ML cost function: nonlinear minimization! • Replacing only the background estimate in the ML cost function: • minimize using a two-step iterative method: • solve for O with T fixed (quadratic, closed-form solution) • solve for T with O fixed (linear, closed-form solution)

  9. Minimization procedure - segmentation matrix • Replacing only the background estimate in the ML cost function: linear in T! • Segmentation matrix, built from: • accumulated differences between each pair of co-registered frames • accumulated differences between each frame and the background • Template estimate • regions where the test is inconclusive with the available F frames

  10. Experiment • Figure labels: three frames from the image sequence; moving object; background

  11. Experiment • Figure labels: two-step method; background estimate; template estimate

  12. Experiment • Figure labels: four frames from a video sequence; moving objects; background estimate

  13. 3D structure from 2D video • Motivation: 3D content-based video representation (application areas go well beyond digital video) • Key step: recovery of 3D shape and 3D motion from an image sequence • Strongest cue: motion of the brightness pattern • Structure From Motion: • Step 1. Compute the 2D motion on the image plane • Step 2. Recover the 3D motion and the depth

  14. Two-frame SFM - common problem • Two-frame SFM fails when the object is far from the camera • Solution: exploit rigidity - multi-frame SFM • Multi-frame Structure From Motion: • step 1. track feature points across a set of frames • step 2. recover relative depth and set of 3D positions

  15. Factorization method • Multi-frame SFM - hard problem: • non-linear • large set of unknowns (due to the entire set of 3D positions) • Factorization [Tomasi & Kanade, 92] (expedite method): • uses linear subspace constraints • 3D structure is estimated by factorizing a measurement matrix R whose entries are the trajectories of feature point projections • without noise, R is rank 3. An SVD is used to factorize matrix R • Problems: • track a large set of features: computationally very heavy, if possible • cost of SVD: high for large number of features or frames
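The rank 3 property and the SVD step can be illustrated with a small synthetic sketch. It assumes orthographic projection and noise-free trajectories; the sizes `N` and `F`, the random data, and the helper `random_rotation` are illustrative assumptions. The factors are recovered only up to an invertible 3x3 ambiguity, which the sketch does not resolve.

```python
import numpy as np

rng = np.random.default_rng(1)
N, F = 20, 8                      # number of feature points and frames (arbitrary)
S = rng.normal(size=(3, N))       # 3D shape: one column per feature point

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]        # flip one axis so that det(Q) = +1
    return Q

rows = []
for _ in range(F):
    Rot = random_rotation(rng)
    t = rng.normal(size=(2, 1))
    rows.append(Rot[:2] @ S + t)  # orthographic projection: keep x and y
R = np.vstack(rows)               # 2F x N measurement matrix of trajectories

# Remove the translation by subtracting each row's mean over the features.
R_centered = R - R.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(R_centered, full_matrices=False)
M_hat = U[:, :3] * np.sqrt(s[:3])           # motion factor, up to a 3x3 ambiguity
S_hat = np.sqrt(s[:3])[:, None] * Vt[:3]    # shape factor, up to the same ambiguity
```

Without noise, all singular values after the third are numerically zero, so `M_hat @ S_hat` reproduces the centered measurement matrix.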

  16. Proposed approach: surface-based factorization • Describe the 3D shape by a local parameterization • Induces a parametric description for the 2D motion in the image plane • Recover the 3D shape and 3D motion parameters from the 2D motion parameters by further exploiting linear subspace constraints: • surface-based factorization • rank 1 factorization: uses a fast algorithm to compute only the largest singular value • weighted factorization: computes the weighted estimate without additional computational cost
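The slide does not say which fast algorithm computes only the largest singular value. One standard choice is the power method, sketched below as an assumption rather than as the method used in the presentation; the function name `largest_singular_triplet` and the test matrix are hypothetical.

```python
import numpy as np

def largest_singular_triplet(R, iters=100, seed=0):
    """Power iteration on R^T R: returns (u, sigma, v) for the largest singular value."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=R.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = R.T @ (R @ v)         # one step of power iteration on R^T R
        v /= np.linalg.norm(v)
    u = R @ v
    sigma = np.linalg.norm(u)
    return u / sigma, sigma, v

# A noisy rank 1 matrix (hypothetical sizes and noise level).
rng = np.random.default_rng(2)
a, b = rng.normal(size=30), rng.normal(size=40)
R = np.outer(a, b) + 1e-3 * rng.normal(size=(30, 40))
u, sigma, v = largest_singular_triplet(R)
rank1 = sigma * np.outer(u, v)    # best rank 1 approximation of R
```

Each iteration costs two matrix-vector products, which is much cheaper than a full SVD when the matrix has many rows (frames) or columns (features).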

  17. Maximum Likelihood formulation • Local 2D motion estimation is ill-posed - aperture problem. Direct methods (rather than the two components of the motion, local depth is a single unknown): • Infer 3D structure by using the brightness change constraint between two frames [Horn & Weldon, 88] • Kalman filter to update estimates over time [J. Heel, 90] • Observations: the images in the sequence. Unknowns: object texture, 3D shape, 3D motion • Through ML, 3D structure is recovered: • Exploiting object rigidity over a set of frames • Directly from the image intensity values • So, where do SFM and factorization come from? • Minimization procedure: • Minimize with respect to the texture in terms of 3D shape and 3D motion • After replacing the texture estimate, the ML cost function depends on the 3D structure only through the 2D motion in the image plane • Estimate 3D motion by inferring SFM (factorization). Plug in the 3D motion estimates • Minimize the ML cost function with respect to the relative depth

  18. Observation model • Equation labels (the equation itself is missing from the transcript): texture, shape, 3D position • Unknowns:

  19. Texture estimate • Texture estimate - weighted average • ML estimate

  20. SFM as an approximation to MLE • The ML cost function depends on the 3D structure only through the 2D motion induced in the image plane (no approximations involved) • Insert the texture estimate into the cost function • 3D structure estimation: • 3D motion estimation: • Compute 2D motion • SFM: rank 1 surface-based factorization • 3D shape estimation: • Plug the 3D motion estimate into the ML cost function • Then, minimize with respect to the shape • (The estimates can be refined by minimizing the ML cost function in two alternate steps, but initialization is the key problem)

  21. Feature-based SFM Translation estimate: Define:

  22. Rank 1 factorization • Decomposition (minimize without constraints) Define: • Normalization (computes by approximating the constraints) Define:

  23. Rank 1 factorization - experiment • Figure: the three largest singular values of R • The matrix is well described by its largest singular value

  24. Rank 1 factorization - experiment • All trajectories have equal shape: it depends only on the 3D motion. The scaling factor depends on the 3D shape (relative depth) • 3D shape and 3D motion are observed in a coupled way through the feature trajectories
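One way to see the coupling described on this slide is sketched below. It assumes orthographic projection, translation already removed, and feature coordinates (x, y) known from a reference frame, so that their contribution can be projected out of the trajectory matrix. What remains is the outer product of a motion-dependent trajectory shape and a depth-dependent scaling. The construction and all names are illustrative assumptions; the slide's own equations are not in the transcript.

```python
import numpy as np

rng = np.random.default_rng(3)
N, F = 25, 10                                 # features and frames (arbitrary)
x, y, z = rng.normal(size=(3, N))             # feature coordinates; z is the relative depth
S = np.vstack([x, y, z])                      # 3 x N shape matrix

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

# Stack the first two rows of each rotation: 2F x 3 motion matrix.
M = np.vstack([random_rotation(rng)[:2] for _ in range(F)])
R = M @ S                                     # trajectories with translation removed

# P projects onto the orthogonal complement of span{x, y}, removing the known part.
S0 = np.vstack([x, y]).T                      # N x 2
P = np.eye(N) - S0 @ np.linalg.solve(S0.T @ S0, S0.T)
R_tilde = R @ P                               # equals outer(M[:, 2], P @ z): rank 1

s = np.linalg.svd(R_tilde, compute_uv=False)
```

Every column of `R_tilde` is the same vector `M[:, 2]` (set by the 3D motion) scaled by one entry of `P @ z` (set by the relative depth), which matches the "equal shape, different scaling" observation.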

  25. Surface-based factorization • Piecewise planar 3D shapes • Orthographic projection (easily extended to scaled-orthographic and para-perspective projections) • 2D motion in the image plane is affine • Relation between the parameters: • Multi-frame SFM: rank 1 factorization

  26. Surface-based factorization - experiment • Figure labels: image sequence (smooth texture); image motion parameters

  27. Surface-based factorization - experiment • Figure labels: motion; shape

  28. Weighted factorization observation noise • rank 1 factorization

  29. Weighted factorization - experiment • Figure labels: feature trajectories; two components of translation; six entries of the rotation matrix; non-weighted estimates; weighted estimates

  30. Feature trajectories

  31. Non-weighted factorization - reconstruction

  32. Weighted factorization - reconstruction

  33. ML estimate of the 3D shape • Image motion: known motion parameters; affine mapping that depends only on the 3D motion • Define a sequence: • Motion of the affine mapped sequence: unknown relative depth; shape of the trajectory of s (known from 3D motion); magnitude of the trajectory of s (unknown relative depth) • Motivation for the minimization procedure: • Plug the 3D motion estimate into the ML cost function • Estimating the relative depth after plugging in the 3D motion is more constrained than estimating the image motion

  34. Minimization procedure - multiresolution • Multiresolution continuation-type method • coarse-to-fine as more images are being taken into account • each stage minimizes the ML cost function by using a Gauss-Newton method (equation label: components of the image gradient) • Region R - constant relative depth z
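A single Gauss-Newton stage for one region with constant relative depth can be sketched as follows. This is a simplified illustration, not the algorithm from the presentation: it uses a 1D analytic brightness pattern instead of images, only two frames, no multiresolution, and it assumes the image displacement in the region is `z * d`, where `d` stands for the known displacement per unit depth given by the 3D motion. All function and variable names are hypothetical.

```python
import numpy as np

def I0(x):
    """Reference frame: a smooth synthetic 1D brightness pattern."""
    return np.sin(0.3 * x) + 0.5 * np.cos(0.11 * x)

def dI0(x):
    """Spatial derivative of I0 (the image gradient)."""
    return 0.3 * np.cos(0.3 * x) - 0.055 * np.sin(0.11 * x)

def gauss_newton_depth(I1, x, d, z0=0.0, iters=10):
    """Estimate the constant relative depth z of a region from two frames.

    Model (an assumption of this sketch): I1(x) = I0(x - z * d), where d is
    the known displacement per unit depth given by the 3D motion.
    """
    z = z0
    for _ in range(iters):
        r = I1 - I0(x - z * d)            # residual at the current estimate
        J = dI0(x - z * d) * d            # derivative of the residual w.r.t. z
        z -= (J @ r) / (J @ J)            # Gauss-Newton update for a scalar unknown
    return z

x = np.arange(0.0, 64.0)                  # pixels of the region
d, z_true = 0.8, 1.5
I1 = I0(x - z_true * d)                   # second frame, noise-free
z_hat = gauss_newton_depth(I1, x, d)
```

Because the whole region shares one unknown, every pixel with a non-zero gradient along the motion direction contributes to the update, which is why this problem is better constrained than estimating a free 2D displacement per pixel.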

  35. Experiment • Image sequence: • and motion: • Shape

  36. Experiment Affine mapped image sequence: • Shape:

  37. Experiment • Multiresolution continuation-type method (without smoothing). Shape estimate:

  38. Experiment

  39. Experiment • Synthesizing different views:

  40. Application - video compression • Figure labels: original; compressed 317:1; compressed 575:1 • Texture patches JPEG compressed

  41. Major contributions and extensions • Explicit modeling of occlusion • Multiframe motion segmentation algorithm (two-step) • extension: contour model • Surface-based factorization • Rank 1 factorization • Weighted factorization • extensions: • other projection models • multibody • occlusion • 3D deformable shape from a set of cameras • subspace constraints for image motion estimation • Multiresolution algorithm for direct inference of 3D shape • extension: parameterized surface model

  42. Experiment Multiresolution continuation-type method. Shape estimate:

  43. Experiment

  44. Experiment

  45. Experiment

  46. Rank 1 factorization - computational cost
