Structure from motion (Tomasi and Kanade) • Input: • a set of point tracks • Output: • 3D location of each point (shape) • camera parameters (motion)
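Orthographic SFM: Setup • $I_1, \dots, I_m$: a collection of $m$ images (video frames) depicting a rigid scene • Orthographic projection (no scale) • $n$ point tracks in those frames • Unknown 3D locations: $\mathbf{p}_j \in \mathbb{R}^3$, $j = 1, \dots, n$ • Projected locations: denote by $\mathbf{x}_{ij} \in \mathbb{R}^2$ the location of $\mathbf{p}_j$ at frame $i$, then $\mathbf{x}_{ij} = R_i \mathbf{p}_j + \mathbf{t}_i$, where the rows of the $2 \times 3$ matrix $R_i$ are the two top rows of a rotation matrix and $\mathbf{t}_i \in \mathbb{R}^2$ is the image translation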
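Orthographic SFM: Objective Find $\{R_i, \mathbf{t}_i\}$ and $\{\mathbf{p}_j\}$ that minimize $\sum_{i=1}^{m} \sum_{j=1}^{n} \| \mathbf{x}_{ij} - (R_i \mathbf{p}_j + \mathbf{t}_i) \|^2$ Subject to $R_i R_i^T = I_{2 \times 2}$ for every frame $i$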
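Eliminate translation • We can eliminate translation by representing the location of each point relative to the centroid of all points: • Assume without loss of generality that the centroid of $\{\mathbf{p}_j\}$ coincides with the origin • Translate each image point by setting $\tilde{\mathbf{x}}_{ij} = \mathbf{x}_{ij} - \bar{\mathbf{x}}_i$, where $\bar{\mathbf{x}}_i = \frac{1}{n} \sum_{j} \mathbf{x}_{ij}$ denotes the centroid of the image points in frame $i$ (the optimal $\mathbf{t}_i$ is then $\bar{\mathbf{x}}_i$)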
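The centering step can be sketched in a few lines of NumPy. The array layout (`x[i, j]` holding the 2D location of point `j` in frame `i`) and the function name `center_tracks` are assumptions made for this illustration, not part of the original slides.

```python
import numpy as np

def center_tracks(x):
    """Subtract the per-frame centroid from every image point.

    x: array of shape (m, n, 2); x[i, j] is the image location of
       point j in frame i (hypothetical layout chosen for this sketch).
    Returns the centered tracks and the per-frame centroids, shape (m, 2).
    """
    centroids = x.mean(axis=1, keepdims=True)  # (m, 1, 2)
    return x - centroids, centroids[:, 0, :]
```

If the 3D points have zero centroid, the returned centroids play the role of the per-frame translations.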
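Objective (w/o translation) Find $\{R_i\}$ and $\{\mathbf{p}_j\}$ that minimize $\sum_{i=1}^{m} \sum_{j=1}^{n} \| \tilde{\mathbf{x}}_{ij} - R_i \mathbf{p}_j \|^2$ Subject to $R_i R_i^T = I_{2 \times 2}$ for every frame $i$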
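Objective: matrix notation Find $M$ and $S$ that minimize $\| W - M S \|_F^2$ Subject to $M_i M_i^T = I_{2 \times 2}$ for every frame $i$ • $W$ is $2m \times n$ (row block $i$, column $j$ holds $\tilde{\mathbf{x}}_{ij}$), $M$ is $2m \times 3$ (row block $M_i$ is $R_i$), $S$ is $3 \times n$ (column $j$ is $\mathbf{p}_j$)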
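TK-Factorization Step 1: find rank 3 approximation to $W$ using SVD • $W = U \Sigma V^T$ where • $U$ is $2m \times r$, $U^T U = I$, • $\Sigma = \mathrm{diag}(\sigma_1, \dots, \sigma_r)$, size $r \times r$, with $\sigma_1 \ge \dots \ge \sigma_r \ge 0$, and • $V$ is $n \times r$, $V^T V = I$ (thin SVD, $r = \min(2m, n)$)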
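TK-Factorization $\hat{W} = U_3 \Sigma_3 V_3^T$, where $U_3$ and $V_3$ hold the first three columns of $U$ and $V$ and $\Sigma_3 = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$ Note: this is a relaxation, only noise components outside the 3D space are annihilated Step 2: factorization $\hat{W} = \hat{M} \hat{S}$ with $\hat{M} = U_3 \Sigma_3^{1/2}$, $\hat{S} = \Sigma_3^{1/2} V_3^T$ Ambiguity: $\hat{W} = (\hat{M} A)(A^{-1} \hat{S})$ for any non-singular $3 \times 3$ matrix $A$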
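TK-Factorization Step 3: resolve ambiguity Let $M = \hat{M} A$, note that $M_i M_i^T = I_{2 \times 2}$. Let $\hat{M}_i$ be the corresponding rows in $\hat{M}$, then $\hat{M}_i A A^T \hat{M}_i^T = I_{2 \times 2}$. Find a symmetric matrix $Q = A A^T$ that satisfies these equations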
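TK-Factorization • Equation $\hat{M}_i Q \hat{M}_i^T = I_{2 \times 2}$ is linear in $Q$ • There are $3m$ equations in 6 unknowns • Find $A$ by eigen-decomposition so that $Q = A A^T$ • Solution is obtained up to a rotation ambiguity: any orthogonal $O$ gives $(A O)(A O)^T = A A^T = Q$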
TK-Factorization: Summary • Eliminate translation, construct $W$ • SVD to get the rank 3 approximation $\hat{W}$ and factorize $\hat{W} = \hat{M} \hat{S}$ (ambiguity remains) • Resolve ambiguity: estimate $Q = A A^T$ from orthonormality and factorize to obtain $A$ • Solution up to rotation and reflection
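The three steps summarized above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes a complete, already centered $2m \times n$ measurement matrix whose rows $2i$ and $2i+1$ hold the x and y coordinates of frame $i$, and the names `tk_factorize`, `M_hat`, `S_hat`, `Q` and `A` are chosen here for readability.

```python
import numpy as np

def tk_factorize(W):
    """Sketch of the factorization for a centered (2m, n) matrix W.

    Returns M (2m, 3) and S (3, n) with W ~ M @ S, defined up to a
    global rotation/reflection.
    """
    m = W.shape[0] // 2

    # Step 1: rank 3 approximation via SVD.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    sqrt_s = np.sqrt(s[:3])

    # Step 2: split the singular values evenly between the two factors.
    M_hat = U[:, :3] * sqrt_s
    S_hat = sqrt_s[:, None] * Vt[:3]

    # Step 3: solve the linear equations M_hat_i Q M_hat_i^T = I for the
    # 6 independent entries of the symmetric matrix Q.
    idx = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

    def coeffs(a, b):
        # Coefficients of a^T Q b in the 6 unknowns of Q.
        return [a[p] * b[q] if p == q else a[p] * b[q] + a[q] * b[p]
                for p, q in idx]

    rows, rhs = [], []
    for i in range(m):
        a, b = M_hat[2 * i], M_hat[2 * i + 1]
        rows += [coeffs(a, a), coeffs(b, b), coeffs(a, b)]
        rhs += [1.0, 1.0, 0.0]
    q, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)

    Q = np.zeros((3, 3))
    for (p, r), v in zip(idx, q):
        Q[p, r] = Q[r, p] = v

    # Factor Q = A A^T by eigen-decomposition; eigenvalues are clipped
    # because noise can make the estimated Q slightly indefinite.
    evals, evecs = np.linalg.eigh(Q)
    A = evecs * np.sqrt(np.clip(evals, 1e-12, None))

    return M_hat @ A, np.linalg.solve(A, S_hat)
```

At least three frames with distinct orientations are needed for the linear system in `Q` to be well determined; with noisy data the clipping step is only a crude safeguard.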
Incomplete tracks • Tracks are often incomplete: some entries of $W$ are unobserved • Factorization with missing data • Rank is difficult to enforce • Surrogate: minimize the nuclear norm, the sum of singular values, $\| W \|_* = \sum_k \sigma_k$ • Nuclear norm is convex, minimization often achieves low rank • Accurate reconstruction usually requires accounting for perspective distortion
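One way to make the nuclear norm surrogate concrete is an iterative singular value soft-thresholding scheme in the style of Soft-Impute. This is a swapped-in illustration rather than the method from the slides: it minimizes $\tfrac{1}{2} \sum_{\text{observed}} (W - Z)^2 + \lambda \| Z \|_*$, and the function names, the boolean `mask` of observed entries, and the values of `lam` and `n_iter` are all assumptions.

```python
import numpy as np

def nuclear_norm(Z):
    """Sum of the singular values of Z."""
    return np.linalg.svd(Z, compute_uv=False).sum()

def soft_impute(W, mask, lam=0.1, n_iter=200):
    """Fill in missing entries of W with a low nuclear norm estimate.

    W:    (2m, n) measurement matrix; values where mask is False are ignored.
    mask: boolean array, True where a track is observed.
    """
    Z = np.where(mask, W, 0.0)
    for _ in range(n_iter):
        # Keep observed entries, use the current estimate elsewhere.
        filled = np.where(mask, W, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values (shrinks toward low rank).
        s = np.maximum(s - lam, 0.0)
        Z = (U * s) @ Vt
    return Z
```

Larger values of `lam` push the estimate toward lower rank at the cost of a looser fit to the observed entries.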
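Perspective projection • A point $\mathbf{P} = (X, Y, Z)^T$ is projected to $(x, y) = (f X / Z, f Y / Z)$, with $f$ the focal length • A point rotated by $R$ and translated by $\mathbf{t}$ projects to $x = \frac{\mathbf{m}_1^T \tilde{\mathbf{P}}}{\mathbf{m}_3^T \tilde{\mathbf{P}}}$, $y = \frac{\mathbf{m}_2^T \tilde{\mathbf{P}}}{\mathbf{m}_3^T \tilde{\mathbf{P}}}$, where $\tilde{\mathbf{P}} = (X, Y, Z, 1)^T$ and $\mathbf{m}_k^T$ denotes the rows of $M = K [R \mid \mathbf{t}]$ • We call $M$ a camera matrix • $K$: calibration matrix, • $R$: camera orientation, • $\mathbf{t}$: camera location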
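Bundle adjustment • Given $n$ points in $m$ frames, $\mathbf{x}_{ij} = (x_{ij}, y_{ij})$, $i = 1, \dots, m$, $j = 1, \dots, n$, find camera matrices $M_i$ and positions $\mathbf{P}_j$ ($\tilde{\mathbf{P}}_j$ in homogeneous coordinates) that minimize the reprojection error $\sum_{i} \sum_{j} \left[ \left( x_{ij} - \frac{\mathbf{m}_{i1}^T \tilde{\mathbf{P}}_j}{\mathbf{m}_{i3}^T \tilde{\mathbf{P}}_j} \right)^2 + \left( y_{ij} - \frac{\mathbf{m}_{i2}^T \tilde{\mathbf{P}}_j}{\mathbf{m}_{i3}^T \tilde{\mathbf{P}}_j} \right)^2 \right]$, where $\mathbf{m}_{ik}^T$ are the rows of $M_i$ • Alternate optimization • Given $R_i$ and $\mathbf{t}_i$, solve for $\mathbf{P}_j$ • Given $\mathbf{P}_j$, solve for $R_i$ and $\mathbf{t}_i$ • Very good initial guess is required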
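The sketch below evaluates the reprojection error and illustrates the "given the cameras, solve for the points" half of the alternation. The triangulation uses the standard linear (DLT) solution, which minimizes an algebraic error rather than the reprojection error itself, so it is best read as an initializer for that step and not as the full bundle adjustment update. The function names and array layouts are assumptions for illustration.

```python
import numpy as np

def project(M, P):
    """Project points P (n, 3) with a 3x4 camera matrix M; returns (n, 2)."""
    P_h = np.hstack([P, np.ones((len(P), 1))])  # homogeneous coordinates
    q = P_h @ M.T
    return q[:, :2] / q[:, 2:3]

def reprojection_error(Ms, P, x):
    """Sum of squared reprojection errors.

    Ms: (m, 3, 4) camera matrices, P: (n, 3) points, x: (m, n, 2) tracks.
    """
    return sum(np.sum((x[i] - project(Ms[i], P)) ** 2) for i in range(len(Ms)))

def triangulate(Ms, x_j):
    """Linear triangulation of one point from its m observations x_j (m, 2)."""
    rows = []
    for M, (u, v) in zip(Ms, x_j):
        # Each view gives two equations that are linear in homogeneous P.
        rows.append(u * M[2] - M[0])
        rows.append(v * M[2] - M[1])
    _, _, Vt = np.linalg.svd(np.array(rows))
    P_h = Vt[-1]  # right singular vector of the smallest singular value
    return P_h[:3] / P_h[3]
```

In practice the result of `triangulate` would be refined by a nonlinear least squares solver applied to `reprojection_error`.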
Bundler (photo-tourism) (Snavely et al.)
Bundler (photo-tourism) • Given images, identify feature points and describe them with SIFT descriptors • Match SIFT descriptors, accepting a match only if its score is at least twice that of any other candidate match • For every pair of images with sufficiently many matches, use RANSAC to recover essential matrices • Starting with two images and adding one image at a time: use the essential matrix to recover depth and apply bundle adjustment
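The matching criterion can be sketched as a ratio test on descriptor distances. Since a smaller distance means a better match, "score at least twice that of any other match" is interpreted here as "best distance at most half of the second best"; that interpretation, the Euclidean distance, and the name `ratio_test_matches` are assumptions for this illustration, and the descriptors are plain NumPy arrays rather than the output of a real SIFT implementation.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.5):
    """Match rows of desc_a to rows of desc_b with a ratio test.

    desc_a: (na, d) descriptors, desc_b: (nb, d) descriptors, nb >= 2.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    # Pairwise Euclidean distances, shape (na, nb).
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_a))
    # Accept only if the best match is clearly better than the runner-up.
    keep = d[rows, best] <= ratio * d[rows, second]
    return [(int(i), int(best[i])) for i in rows[keep]]
```

Ambiguous features, such as those on repeated structures, tend to have two similar candidates and are rejected by this test.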
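Simultaneous solutions • $E_{ij}$: essential matrix between images $I_i$ and $I_j$ (on a subset of image pairs) • Objective: recover camera orientation $R_i$ and location $\mathbf{t}_i$ relative to a global coordinate system • This can be solved in various ways, for example $\min_{\{R_i\}} \sum_{(i,j)} \| R_j - R_i R_{ij} \|_F^2$, where $R_{ij}$ is the relative rotation extracted from $E_{ij}$: a least squares solution if we ignore the orthonormality constraints for $R_i$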
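A minimal sketch of such a relaxed least squares solution is given below. It assumes relative rotations $R_{ij}$ have already been extracted from the essential matrices and follow the convention $R_j = R_i R_{ij}$; it fixes the first camera to the identity to remove the global rotation, solves the linear system without orthonormality constraints, and then projects each block to the nearest rotation. The convention, the gauge choice, and the name `average_rotations` are assumptions for this illustration.

```python
import numpy as np

def average_rotations(m, rel):
    """Recover m global rotations from relative ones.

    rel: dict {(i, j): R_ij} with the assumed convention R_j = R_i @ R_ij.
    Returns a list of m rotation matrices with the first equal to identity.
    """
    # Unknowns Y_i = R_i^T stacked into a (3m, 3) matrix; each pair gives
    # the linear block equation Y_j - R_ij^T Y_i = 0.
    A = np.zeros((3 * len(rel), 3 * m))
    for k, ((i, j), R_ij) in enumerate(rel.items()):
        A[3 * k:3 * k + 3, 3 * j:3 * j + 3] = np.eye(3)
        A[3 * k:3 * k + 3, 3 * i:3 * i + 3] = -R_ij.T

    # Fix Y_0 = I and move its contribution to the right-hand side.
    Y_rest, *_ = np.linalg.lstsq(A[:, 3:], -A[:, :3], rcond=None)
    Y = np.vstack([np.eye(3), Y_rest])

    rotations = []
    for i in range(m):
        # Project the unconstrained 3x3 block onto the nearest rotation.
        U, _, Vt = np.linalg.svd(Y[3 * i:3 * i + 3].T)
        D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
        rotations.append(U @ D @ Vt)
    return rotations
```

The set of image pairs must connect all cameras, otherwise some orientations are undetermined.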
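Essential in global coordinates • Corresponding points, $\mathbf{x}_i$ and $\mathbf{x}_j$, satisfy the following relation: $\mathbf{x}_i^T R_i^T ([\mathbf{t}_i]_\times - [\mathbf{t}_j]_\times) R_j \mathbf{x}_j = 0$, where $R_i$ maps camera $i$ coordinates to the global frame, $\mathbf{t}_i$ is the camera location, and $[\cdot]_\times$ is the cross-product matrix • This generalizes the formula for the essential matrix (plug in $R_i = I$, $\mathbf{t}_i = 0$) • Once camera orientations are known we can solve for camera locations • Solution suffers from shrinkage problems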
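The relation can be checked numerically with a short sketch. It assumes the convention that a world point $\mathbf{P}$ has camera coordinates $R_i^T (\mathbf{P} - \mathbf{t}_i)$, that is, $R_i$ maps camera coordinates to the global frame and $\mathbf{t}_i$ is the camera location; the function names are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def global_essential(R_i, t_i, R_j, t_j):
    """Essential matrix between cameras i and j in global coordinates."""
    return R_i.T @ (skew(t_i) - skew(t_j)) @ R_j

def to_camera(R, t, P):
    """Normalized image point of world point P seen by camera (R, t)."""
    X = R.T @ (P - t)
    return X / X[2]
```

For corresponding points `x_i = to_camera(R_i, t_i, P)` and `x_j = to_camera(R_j, t_j, P)`, the product `x_i @ global_essential(R_i, t_i, R_j, t_j) @ x_j` vanishes up to rounding. Because the relation is linear in the locations once the orientations are fixed, the camera locations can then be estimated by a linear solve.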