Roadmap • Introduction to Block Matching Algorithm • Fast BMA • Three classes of speed-up strategies • Generalized BMA • From integer-pel to fractional-pel • From fixed block size to variable block size • Deformable BMA (DBMA) or mesh-based BMA • Experimental results • How do block size and motion accuracy affect the MCP efficiency? EE591f Digital Video Processing
An Intuitive Way of Understanding Block Matching Algorithm (BMA) [Figure: a template "?" is compared against the entries a, b, c, d, e, f of a database]
Block Matching in Motion Estimation [Figure: the inquiry block in the current frame and candidate blocks a, b, c, d in the reference frame] • Candidate displacements: a: (-3,-2), b: (-3,-1), c: (0,0), d: (1,2)
Motion Estimation and Compensation • Motion compensation: with the estimated motion vector (d1,d2), the block in the reference frame is displaced to generate a prediction of the inquiry block in the current frame. This procedure is called "motion compensation". • Motion-Compensated Prediction (MCP) residues: the difference between the current frame and the displaced reference frame
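The prediction and residue described above can be sketched in a few lines of Python. This is a minimal illustration rather than the course's own code: the helper names `motion_compensate` and `mcp_residue`, the toy frames, and the convention that d1 is a row offset and d2 a column offset are all assumptions made for the example.

```python
def motion_compensate(ref, y, x, size, mv):
    """Predict the size x size block at (y, x) by displacing the reference."""
    d1, d2 = mv  # assumed convention: d1 = row offset, d2 = column offset
    return [[ref[y + d1 + i][x + d2 + j] for j in range(size)]
            for i in range(size)]


def mcp_residue(cur, y, x, size, prediction):
    """Residue = current block minus its motion-compensated prediction."""
    return [[cur[y + i][x + j] - prediction[i][j] for j in range(size)]
            for i in range(size)]


# Toy 4x4 frames: each reference pixel encodes its position as 10*row + col.
ref = [[10 * i + j for j in range(4)] for i in range(4)]
cur = [[0, 0, 0, 0],
       [0, 21, 21, 0],
       [0, 30, 33, 0],
       [0, 0, 0, 0]]

pred = motion_compensate(ref, 1, 1, 2, (1, -1))
res = mcp_residue(cur, 1, 1, 2, pred)
print(pred)  # [[20, 21], [30, 31]]
print(res)   # [[1, 0], [0, 2]]
```

A good motion vector leaves a residue with small variance, which is what makes the residue cheaper to code than the block itself.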
Two Key Elements in BMA • Matching criterion: How do I measure the similarity between two blocks? • Mean Square Error (MSE): L2 norm • Mean Absolute Difference (MAD): L1 norm • Search strategy: How do I find the best match of the given block? • Exhaustive search: global minimum • Non-exhaustive search: close to global minimum
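The two matching criteria can be written directly from their definitions. The sketch below assumes blocks are stored as equal-sized lists of rows; the function names `mse` and `mad` and the sample values are illustrative only.

```python
def mse(block_a, block_b):
    """Mean square error between two equal-sized blocks (lists of rows)."""
    n = len(block_a) * len(block_a[0])
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / n


def mad(block_a, block_b):
    """Mean absolute difference between two equal-sized blocks."""
    n = len(block_a) * len(block_a[0])
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / n


current = [[10, 12], [14, 16]]
candidate = [[11, 12], [10, 16]]
print(mse(current, candidate))  # (1 + 0 + 16 + 0) / 4 = 4.25
print(mad(current, candidate))  # (1 + 0 + 4 + 0) / 4 = 1.25
```

MSE penalizes the single large difference much more heavily than MAD does, while MAD avoids multiplications and is therefore cheaper per pixel.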
Goal: Find the Best Tradeoff [Figure: variance of MCP residues versus computational cost]
Roadmap • Introduction to Block Matching Algorithm • Fast BMA • Three classes of speed-up strategies • Generalized BMA • From fixed block size to variable block size • From integer-pel to fractional-pel • Experimental results • How do block size and motion accuracy affect the MCP efficiency?
Benchmark: Exhaustive Search • Example with window size T=7: it searches (2T+1)^2 = 225 points in total
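A minimal sketch of the exhaustive-search benchmark follows, assuming MAD as the matching criterion and frames stored as lists of rows. The function names, the synthetic "ramp" frame, and the (row, column) ordering of the motion vector are assumptions made for the example.

```python
def mad(cur, y, x, ref, ry, rx, size):
    """MAD between the block at (y, x) in cur and the block at (ry, rx) in ref."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(cur[y + i][x + j] - ref[ry + i][rx + j])
    return total / (size * size)


def exhaustive_search(cur, ref, y, x, size, T):
    """Return (best_mv, best_cost, points_checked) for the block at (y, x)."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost, checked = (0, 0), float("inf"), 0
    for dy in range(-T, T + 1):
        for dx in range(-T, T + 1):
            ry, rx = y + dy, x + dx
            if ry < 0 or rx < 0 or ry + size > h or rx + size > w:
                continue  # candidate block falls outside the reference frame
            checked += 1
            cost = mad(cur, y, x, ref, ry, rx, size)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost, checked


# Synthetic test: every reference pixel has a unique value, and the current
# frame is the reference displaced by (2, -3), with clamping at the borders.
W = 24
ref = [[i * W + j for j in range(W)] for i in range(W)]


def clamp(v):
    return max(0, min(W - 1, v))


cur = [[ref[clamp(i + 2)][clamp(j - 3)] for j in range(W)] for i in range(W)]

mv, cost, checked = exhaustive_search(cur, ref, 8, 8, 4, 7)
print(mv, cost, checked)  # (2, -3) 0.0 225
```

With T=7 and a block far enough from the frame border, all (2T+1)^2 = 225 candidates are evaluated, which is the cost the fast algorithms try to reduce.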
Fast Block Matching Algorithms • Class-A (I-IV): ad-hoc speed-up strategies • Class-B (V-VII): advanced speed-up strategies (wise use of computational resource to account for probabilities) • Class-C (VIII): hierarchical strategy General Principle – trade complexity with performance
Fast BMA (I): 3-Step Search • Searches 9+8+8 = 25 points
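The 9+8+8 point count can be reproduced with a short sketch of the three-step search. To keep the search strategy separate from the matching criterion, the function below takes a hypothetical callable `cost(dy, dx)`; in a real encoder this would be the MAD or MSE of the displaced block. The toy cost surface is an assumption chosen only because it has a single minimum.

```python
def three_step_search(cost, first_step=4):
    """Three-step search over a displacement cost function.

    cost(dy, dx) returns the matching error for displacement (dy, dx).
    Returns (best_mv, best_cost, points_checked).
    """
    best_mv = (0, 0)
    best_cost = cost(0, 0)
    checked = 1
    step = first_step
    while step >= 1:
        cy, cx = best_mv  # centre of this step's 3x3 pattern
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dy == 0 and dx == 0:
                    continue  # the centre has already been evaluated
                c = cost(cy + dy, cx + dx)
                checked += 1
                if c < best_cost:
                    best_mv, best_cost = (cy + dy, cx + dx), c
        step //= 2  # step sizes 4, 2, 1 cover a +/-7 search window
    return best_mv, best_cost, checked


# Toy unimodal error surface with its minimum at the "true" motion (2, -3).
true_mv = (2, -3)


def toy_cost(dy, dx):
    return abs(dy - true_mv[0]) + abs(dx - true_mv[1])


mv, cost, checked = three_step_search(toy_cost)
print(mv, cost, checked)  # (2, -3) 0 25
```

The search reaches the minimum here because the toy surface is unimodal. On real video the error surface can have several local minima, so the result is only close to the global minimum in general.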
Fast BMA (II): Logarithmic Search • Searches at most 5+4+2+3+2+8 = 24 points
Fast BMA (III): Orthogonal Search • Searches at most 2(3+2+2+2+2+2) = 26 points
Fast BMA (IV): Cross Search • Searches at most 5+4+4+4 = 17 points
Why does probabilistic modeling of MV help? [Figure: empirical pdf of motion vectors]
Fast BMA (V): New 3-Step Search
New 3-Step Search: Examples
Fast BMA (VI): 4-Step Search
• Search the 9 checking points located on a 5-by-5 window: is the point reaching the minimum distortion found at the center?
• Yes: go to the final 3-by-3 search
• No: is it at a corner or not? If yes, search 5 additional checking points; if no, search 3 additional checking points
• Repeat the procedure in the dashed box
• Final 3-by-3 search
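The flowchart above can be sketched as follows. This is an illustrative reading of the four-step search, not the original authors' code: the `cost(dy, dx)` callable and the toy error surface are assumptions, and a cache of already-evaluated points is used so that the "5 additional" and "3 additional" checking points fall out of the overlap between consecutive 5-by-5 patterns.

```python
def four_step_search(cost):
    """Four-step search; returns (best_mv, best_cost, points_checked)."""
    cache = {}

    def evaluate(mv):
        if mv not in cache:  # each checking point is computed only once
            cache[mv] = cost(*mv)
        return cache[mv]

    def best_around(center, step):
        """Best of the 9 points of a 3x3 pattern with the given spacing."""
        best = center
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cand = (center[0] + dy, center[1] + dx)
                if evaluate(cand) < evaluate(best):
                    best = cand
        return best

    center = (0, 0)
    for _ in range(3):  # up to three passes on a 5-by-5 window
        best = best_around(center, 2)
        if best == center:
            break  # minimum at the centre: jump to the final search
        center = best  # corner move adds 5 new points, edge move adds 3
    center = best_around(center, 1)  # final 3-by-3 search
    return center, cache[center], len(cache)


# Toy unimodal error surface with its minimum at (2, -3).
def toy_cost(dy, dx):
    return abs(dy - 2) + abs(dx + 3)


mv, cost, checked = four_step_search(toy_cost)
print(mv, cost, checked)  # (2, -3) 0 22
```

In this run the first pass checks 9 points and moves to a corner, the second pass adds 5 points and stays at the center, and the final 3-by-3 search adds 8 points, for 22 in total.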
4-Step Search: Examples
The Idea of Successive Refinement • Note that in all previous approaches to fast BMA, we only consider the possibility of reducing the number of search points • For each search point, we still need to calculate the matching criterion for a B-by-B block • To further reduce the complexity, we might consider reducing the cost of each matching as well
Multi-resolution Representation of Images [Figure: image pyramid with levels of size M-by-N, M/2-by-N/2 and M/4-by-N/4] • Multi-resolution representation by pyramid
Why does Hierarchical Strategy Help? [Figure: level-2 ME result, level-1 ME result, level 0]
Hierarchical Block Matching Algorithm (HBMA)
Example: Three-level HBMA
Fast BMA (VIII): Hierarchical Search
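A coarse-to-fine search over an image pyramid might look like the sketch below. It is one possible reading of HBMA under stated assumptions: 2x2 averaging builds the pyramid, MAD is the criterion, the coarsest level is searched within a small radius, and each finer level refines the doubled motion vector by +/-1. All function names and the synthetic frames are hypothetical.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [[(frame[2 * i][2 * j] + frame[2 * i][2 * j + 1]
              + frame[2 * i + 1][2 * j] + frame[2 * i + 1][2 * j + 1]) / 4
             for j in range(len(frame[0]) // 2)]
            for i in range(len(frame) // 2)]


def mad(cur, y, x, ref, ry, rx, size):
    return sum(abs(cur[y + i][x + j] - ref[ry + i][rx + j])
               for i in range(size) for j in range(size)) / (size * size)


def local_search(cur, ref, y, x, size, center, radius):
    """Exhaustive search of +/- radius around the displacement `center`."""
    h, w = len(ref), len(ref[0])
    best_mv, best_cost = center, float("inf")
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry <= h - size and 0 <= rx <= w - size:
                cost = mad(cur, y, x, ref, ry, rx, size)
                if cost < best_cost:
                    best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


def hbma(cur, ref, y, x, size, levels=2, coarse_radius=2):
    """Estimate the motion of the block at (y, x), coarse to fine."""
    cur_pyr, ref_pyr = [cur], [ref]
    for _ in range(levels - 1):
        cur_pyr.append(downsample(cur_pyr[-1]))
        ref_pyr.append(downsample(ref_pyr[-1]))
    mv, cost = (0, 0), None
    for level in range(levels - 1, -1, -1):
        scale = 2 ** level
        radius = coarse_radius if level == levels - 1 else 1
        mv, cost = local_search(cur_pyr[level], ref_pyr[level],
                                y // scale, x // scale, size // scale,
                                mv, radius)
        if level > 0:
            mv = (2 * mv[0], 2 * mv[1])  # propagate to the next finer level
    return mv, cost


# Synthetic frames: unique pixel values, current frame displaced by (2, -3).
W = 32
ref = [[i * W + j for j in range(W)] for i in range(W)]


def clamp(v):
    return max(0, min(W - 1, v))


cur = [[ref[clamp(i + 2)][clamp(j - 3)] for j in range(W)] for i in range(W)]

mv, cost = hbma(cur, ref, 16, 16, 8)
print(mv, cost)  # (2, -3) 0.0
```

Here the coarse level evaluates 25 candidates on a 4-by-4 block and the fine level only 9 candidates on the 8-by-8 block, so both the number of search points and the cost of each match are reduced compared with a full-resolution exhaustive search.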
Summary • Why do we care about fast BMA? • Driven by the application demands of video coding • Can we go beyond BMA? • The block-based constraint is simple but not appropriate for accounting for the arbitrary shapes of moving objects • Integer-pel accuracy is not sufficient to account for the continuous nature of motion
Roadmap • Introduction to Block Matching Algorithm • Fast BMA • Three classes of speed-up strategies • Generalized BMA • From integer-pel to fractional-pel • From fixed block size to variable block size • Deformable BMA (DBMA) or mesh-based BMA* • Experimental results • How do block size and motion accuracy affect the MCP efficiency?
Why Do We Need Fractional-pel?
Fractional-pel BMA [Figure: the M-by-N original reference frame is linearly interpolated into a 2M-by-2N interpolated reference frame]
Half-pel BMA [Figure: current frame and reference frame; digits indicate physical distances]
Bilinear Interpolation
[Figure: input pixels (x,y), (x+1,y), (x,y+1), (x+1,y+1) and output pixels (2x,2y), (2x+1,2y), (2x,2y+1), (2x+1,2y+1)]
O[2x,2y] = I[x,y]
O[2x+1,2y] = (I[x,y] + I[x+1,y])/2
O[2x,2y+1] = (I[x,y] + I[x,y+1])/2
O[2x+1,2y+1] = (I[x,y] + I[x+1,y] + I[x,y+1] + I[x+1,y+1])/4
• Generalize to 1/K pixel where K > 2
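The four interpolation rules translate into a short routine. The sketch below is illustrative: the name `upsample2x_bilinear` is hypothetical, images are indexed as `img[row][col]`, and the output is (2h-1)-by-(2w-1) so that no sample beyond the last row or column has to be extrapolated.

```python
def upsample2x_bilinear(img):
    """Interpolate an h x w image onto a half-pel grid of (2h-1) x (2w-1)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            # Odd coordinates lie halfway between two input samples.
            y1, x1 = y0 + y % 2, x0 + x % 2
            out[y][x] = (img[y0][x0] + img[y0][x1]
                         + img[y1][x0] + img[y1][x1]) / 4
    return out


img = [[0, 4],
       [8, 12]]
half_pel = upsample2x_bilinear(img)
for row in half_pel:
    print(row)
# [0.0, 2.0, 4.0]
# [4.0, 6.0, 8.0]
# [8.0, 10.0, 12.0]
```

Averaging the four corner samples covers all four rules at once: when a coordinate is even, the two samples along that axis coincide, so the formula reduces to a copy or to a two-sample average.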
Hierarchical Strategy for Half-pel BMA [Figure: integer-pel search followed by half-pel search]
Beyond Half-pel Accuracy • There exist results supporting a further prediction efficiency gain from half-pel to quarter-pel; sometimes it is even worthwhile to reach 1/8-pel accuracy • The improved prediction efficiency comes at the cost of modestly increased computational complexity and overhead • Question: for what kind of video does finer accuracy improve the MCP efficiency most?
Generalizations of BMA • Variable block-size matching algorithms • Widely used by various video coding standards • H.264 supports block sizes ranging from 16-by-16 down to 4-by-4, including 8-by-8 and rectangular partitions such as 16-by-8 • Fractional-pel accuracy BMA • Half-pel: MPEG-1/2/4, H.263/H.263+ • Quarter-pel: H.264 (even 1/8-pel) • Tradeoff between overhead on motion and MCP efficiency
Variable Block-size BMA [Figure: a frame partitioned into 16-by-16, 8-by-8 and 4-by-4 blocks]
BMA Strategy Adopted by H.263 • Macroblock level: 16-by-16 • Block level: 8-by-8
BMA Strategy Adopted by H.264 • Macroblock partitions: 16-by-16, 16-by-8, 8-by-16, 8-by-8 • Sub-partitions of an 8-by-8 block: 8-by-8, 8-by-4, 4-by-8, 4-by-4 • Note: overhead is required to signal which partition is adopted by the encoder
Deformable Block Matching Algorithm
Overview of DBMA • Three steps: • Partition the anchor frame into regular blocks • Model the motion in each block by a more complex motion model • The 2-D motion caused by a flat surface patch undergoing rigid 3-D motion can be approximated well by a projective mapping • The projective mapping can be approximated by affine and bilinear mappings • Estimate the motion parameters block by block independently • The discontinuity problem across block boundaries still remains
Affine and Bilinear Model • Affine (6 parameters): • Good for mapping triangles to triangles • Bilinear (8 parameters): • Good for mapping blocks to quadrangles
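The equations for the two models did not survive in this transcript. As a hedged illustration, the sketch below uses the commonly cited parameterizations: an affine displacement field d_x = a0 + a1*x + a2*y, d_y = b0 + b1*x + b2*y (6 parameters), and a bilinear field that adds an x*y term to each component (8 parameters). The function names and parameter layout are assumptions, not taken from the slides.

```python
def affine_displacement(x, y, a, b):
    """Affine motion: a = (a0, a1, a2), b = (b0, b1, b2), 6 parameters."""
    dx = a[0] + a[1] * x + a[2] * y
    dy = b[0] + b[1] * x + b[2] * y
    return dx, dy


def bilinear_displacement(x, y, a, b):
    """Bilinear motion: a and b each hold 4 coefficients, 8 parameters."""
    dx = a[0] + a[1] * x + a[2] * y + a[3] * x * y
    dy = b[0] + b[1] * x + b[2] * y + b[3] * x * y
    return dx, dy


# Pure translation is the special case where only a0 and b0 are non-zero,
# which is exactly the motion model of ordinary block matching.
print(affine_displacement(10, 20, (3, 0, 0), (-2, 0, 0)))  # (3, -2)

# A zoom about the origin: displacement grows with distance from the origin.
print(affine_displacement(10, 20, (0, 0.5, 0), (0, 0, 0.5)))  # (5.0, 10.0)

# The extra x*y term lets the four corners of a block move independently.
print(bilinear_displacement(2, 4, (0, 0, 0, 0.25), (0, 0, 0, 0)))  # (2.0, 0)
```

Three point correspondences determine the six affine parameters, which matches the triangle-to-triangle remark, and four correspondences determine the eight bilinear parameters, which matches the block-to-quadrangle remark.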
Mesh-Based Motion Estimation • A control grid is used to partition a frame into non-overlapping polygon elements. • The nodal motion is constrained so that a feasible mesh is still formed with the motion. [Figure: (a) using a triangular mesh, (b) using a quadrilateral mesh]
Mesh-based vs Block-based [Figure: (a) block-based ME, (b) mesh-based ME, (c) mesh-based motion tracking]
Example: BMA vs Mesh-based [Figure: target frame, anchor frame and predicted frames] • EBMA (half-pel): 29.86 dB • Mesh-based method: 29.72 dB
Roadmap • Introduction to Block Matching Algorithm • Fast BMA • Three classes of speed-up strategies • Generalized BMA • From fixed block size to variable block size • From integer-pel to fractional-pel • Experimental results • How do block size and motion accuracy affect the MCP efficiency?
Experimental Results [Figure: Frame #1 and Frame #2]
Motion-Compensated Prediction Residues • 16-by-16 block, integer-pel, var(e)=271.8
Motion-Compensated Prediction Residues • 8-by-8 block, integer-pel, var(e)=220.8
Motion-Compensated Prediction Residues • 16-by-16 block, half-pel, var(e)=164.2
Motion-Compensated Prediction Residues • 8-by-8 block, half-pel, var(e)=123.8
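To compare the four settings, the reported residue variances can be normalized against the 16-by-16, integer-pel case. The short script below uses only the var(e) values from the slides; the relative reduction and the gain in dB are derived quantities introduced here for illustration, not figures from the original experiment.

```python
import math

baseline = ("16-by-16", "integer-pel")
variance = {
    ("16-by-16", "integer-pel"): 271.8,
    ("8-by-8", "integer-pel"): 220.8,
    ("16-by-16", "half-pel"): 164.2,
    ("8-by-8", "half-pel"): 123.8,
}

summary = {}
for setting, var_e in variance.items():
    reduction = 1 - var_e / variance[baseline]  # fraction of variance removed
    gain_db = 10 * math.log10(variance[baseline] / var_e)
    summary[setting] = (reduction, gain_db)
    print(f"{setting[0]:>8}, {setting[1]:<11}: var(e)={var_e:6.1f}, "
          f"reduction={reduction:6.1%}, gain={gain_db:5.2f} dB")
```

For this frame pair, moving from integer-pel to half-pel accuracy lowers the residue variance more than halving the block size does, and combining both gives the lowest variance of the four settings.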