Distinctive Image Features from Scale-Invariant Keypoints. David G. Lowe, International Journal of Computer Vision (IJCV), 2004. Extracting distinctive invariant features. Points are individually ambiguous. More unique matches are possible with small regions of images.
Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe International Journal of Computer Vision (IJCV), 2004
Extracting distinctive invariant features • Points are individually ambiguous • More unique matches are possible with small regions of images http://www.csie.ntu.edu.tw/~cyy/courses/vfx/05spring/lectures/handouts/lec04_feature.pdf
Desired properties for features • Invariant: robust to changes in scale, rotation, affine distortion, viewpoint, illumination and noise, so that features can be matched reliably across a substantial range of such changes • Distinctive: a single feature can be correctly matched with high probability
Moravec corner detector (1980) • We should easily recognize the point by looking through a small window • Shifting a window in any direction should give a large change in intensity
Moravec corner detector [Figure: example windows over a flat region, an edge, a corner and an isolated point]
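Moravec corner detector • Change of intensity for the shift $[u,v]$: $E(u,v) = \sum_{x,y} w(x,y)\,[I(x+u,\,y+v) - I(x,y)]^2$, where $w(x,y)$ is the window function, $I(x+u,\,y+v)$ is the shifted intensity and $I(x,y)$ is the intensity • Four shifts: $(u,v) = (1,0), (1,1), (0,1), (-1,1)$ • Problem: responds too strongly to edges, because only the minimum of $E$ over these shifts is taken into account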
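The windowed sum of squared differences above can be sketched in a few lines. This is a minimal illustration, not the original detector's code: it assumes a binary window (w = 1 inside a square of half-width `half`, 0 outside), an image stored as a list of rows indexed `img[y][x]`, and a pixel far enough from the border. The function name `moravec_response` is hypothetical.

```python
def moravec_response(img, x, y, half=1):
    """Minimum of E(u, v) over the four Moravec shifts at pixel (x, y).

    Assumes a binary window of size (2*half+1)^2 and img[y][x] indexing.
    """
    shifts = [(1, 0), (1, 1), (0, 1), (-1, 1)]
    energies = []
    for u, v in shifts:
        e = 0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                original = img[y + dy][x + dx]
                shifted = img[y + dy + v][x + dx + u]
                e += (shifted - original) ** 2
        energies.append(e)
    return min(energies)


# Synthetic 9x9 image: a bright square occupying x >= 4, y >= 4.
img = [[1 if (x >= 4 and y >= 4) else 0 for x in range(9)] for y in range(9)]
flat = moravec_response(img, 2, 2)    # window inside the dark region
edge = moravec_response(img, 6, 4)    # window on the top edge of the square
corner = moravec_response(img, 4, 4)  # window on the square's corner
```

On this axis-aligned example only the corner gives a nonzero minimum. An edge that is not aligned with one of the four shift directions would also give a nonzero minimum, which is the weakness noted above.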
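Harris corner detector [1988] • Consider all small shifts by Taylor expansion: $E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}$, with $M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$ • $w(x,y)$: Gaussian function • $M$: 2×2 matrix of gradient products (the Hessian of $E$ at zero shift, up to a factor of 2) • $\lambda_1, \lambda_2$: eigenvalues of $M$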
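Harris corner detector • Classification of image points using eigenvalues of $M$: flat: $\lambda_1$ and $\lambda_2$ are both small • edge: $\lambda_1 \gg \lambda_2$ or $\lambda_2 \gg \lambda_1$ • corner: $\lambda_1$ and $\lambda_2$ are large, $\lambda_1 \sim \lambda_2$; $E$ increases in all directions • Measure of corner response: $R = \det M - k\,(\operatorname{trace} M)^2$, with $\det M = \lambda_1 \lambda_2$, $\operatorname{trace} M = \lambda_1 + \lambda_2$ and $k$ an empirical constant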
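A minimal sketch of a Harris-style corner response, to make the eigenvalue classification concrete. Several choices here are assumptions rather than anything stated on the slides: a uniform window replaces the Gaussian w(x, y), gradients are central differences, the response is the commonly used R = det(M) - k·trace(M)² with k = 0.04, and the name `harris_response` is hypothetical.

```python
def harris_response(img, x, y, half=1, k=0.04):
    """Harris-style response R = det(M) - k * trace(M)^2 at pixel (x, y).

    Assumptions: uniform (not Gaussian) window, central-difference
    gradients, img[y][x] indexing, pixel at least half+1 from the border.
    """
    sxx = syy = sxy = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            ix = (img[j][i + 1] - img[j][i - 1]) / 2.0
            iy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            sxx += ix * ix
            syy += iy * iy
            sxy += ix * iy
    det = sxx * syy - sxy * sxy   # product of the eigenvalues of M
    trace = sxx + syy             # sum of the eigenvalues of M
    return det - k * trace * trace


# Same synthetic image as a bright square on a dark background.
img = [[1 if (x >= 4 and y >= 4) else 0 for x in range(9)] for y in range(9)]
r_flat = harris_response(img, 2, 2)    # zero: both eigenvalues vanish
r_edge = harris_response(img, 6, 4)    # negative: one eigenvalue dominates
r_corner = harris_response(img, 4, 4)  # positive: both eigenvalues are large
```

The sign of R separates the three cases without computing the eigenvalues explicitly, which is why the determinant and trace are used.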
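Harris detector: problem • Not invariant to image scale: when a corner is magnified, a fixed-size window sees only a gently curving contour, so all points are classified as edges, although the same structure is detected as a corner at the coarser scale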
Scale-invariant feature transform (SIFT) • Scale-invariant feature transform (or SIFT) is an algorithm to detect and describe local features in images • Distinctive features • Invariant to image scale and rotation, and robust to a substantial range of affine distortion • Applied locally on keypoints • Based upon the image gradients in a local neighborhood
SIFT stages (a detector followed by a local descriptor): • Scale-space extrema detection • Keypoint localization • Orientation assignment • Keypoint descriptor
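1. Detection of scale-space extrema • Convolution with a variable-scale Gaussian: $L(x,y,\sigma) = G(x,y,\sigma) * I(x,y)$, with $G(x,y,\sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/(2\sigma^2)}$ • Difference-of-Gaussian (DoG) filter: $D(x,y,\sigma) = (G(x,y,k\sigma) - G(x,y,\sigma)) * I(x,y) = L(x,y,k\sigma) - L(x,y,\sigma)$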
Scale space • $\sigma$ doubles for the next octave • $k = 2^{1/s}$; $s+3$ images for each octave, at scales $\sigma, k\sigma, k^2\sigma, \dots$
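The scale schedule within one octave follows directly from k = 2^(1/s). The sketch below only lists the blur levels; it does no image filtering. The function name `octave_sigmas` is hypothetical, and the defaults (σ = 1.6, s = 3) are the values reported in Lowe's paper rather than something this slide states.

```python
def octave_sigmas(sigma0=1.6, s=3):
    """Blur levels of the s + 3 Gaussian images in one octave."""
    k = 2.0 ** (1.0 / s)
    return [sigma0 * k ** i for i in range(s + 3)]


sigmas = octave_sigmas()
# Adjacent images differ by the factor k, so s + 3 Gaussian images give
# s + 2 DoG images, and extrema can be searched in the s middle ones.
# The image at index s has sigma = 2 * sigma0; downsampling it by 2
# starts the next octave.
next_octave_base = sigmas[3]
```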
DoG • Efficient function to compute • A close approximation to the scale-normalized Laplacian of Gaussian
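Why the DoG approximates the scale-normalized Laplacian of Gaussian can be seen from the heat diffusion equation, following the argument in Lowe's paper. The only approximation is replacing the derivative with respect to σ by a finite difference between the nearby scales kσ and σ.

```latex
\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G
\quad\Longrightarrow\quad
\sigma \nabla^2 G \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}
\quad\Longrightarrow\quad
G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^2 \nabla^2 G
```

The factor (k - 1) is constant over all scales, so it does not affect the location of the extrema.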
2. Keypoint localization • X is selected if it is larger or smaller than all 26 neighbors (8 in its own DoG image and 9 in each of the two adjacent scales)
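The 26-neighbor test can be sketched as follows. It assumes the caller has already cut out the 3×3 neighborhoods from the DoG images at the scale below, the same scale and the scale above; the function name `is_extremum` is hypothetical.

```python
def is_extremum(below, same, above):
    """True if same[1][1] is strictly larger or strictly smaller than all
    26 neighbors. Each argument is a 3x3 list of DoG values."""
    center = same[1][1]
    neighbors = []
    for layer_index, layer in enumerate((below, same, above)):
        for r in range(3):
            for c in range(3):
                if layer_index == 1 and r == 1 and c == 1:
                    continue  # skip the candidate itself
                neighbors.append(layer[r][c])
    is_max = all(center > n for n in neighbors)
    is_min = all(center < n for n in neighbors)
    return is_max or is_min


flat = [[0.0] * 3 for _ in range(3)]
peak = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
selected = is_extremum(flat, peak, flat)
```

Comparing against the two adjacent scales as well as the 8 spatial neighbors is what makes the selected point an extremum in scale space rather than only in the image plane.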
Pre-smoothing $\sigma = 1.6$, plus a doubling of the input image size
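2. Accurate keypoint localization • Reject points with low contrast and points poorly localized along an edge • Fit a 3D quadratic to the DoG around the sample point: $D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^{T} \mathbf{x} + \frac{1}{2}\, \mathbf{x}^{T} \frac{\partial^2 D}{\partial \mathbf{x}^2}\, \mathbf{x}$, with extremum at $\hat{\mathbf{x}} = -\left(\frac{\partial^2 D}{\partial \mathbf{x}^2}\right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$ and value $D(\hat{\mathbf{x}}) = D + \frac{1}{2} \frac{\partial D}{\partial \mathbf{x}}^{T} \hat{\mathbf{x}}$ • If $\hat{\mathbf{x}}$ has an offset larger than 0.5 in any dimension, the sample point is changed • If $|D(\hat{\mathbf{x}})|$ is less than 0.03 (low contrast), the point is discarded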
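A minimal sketch of the sub-pixel refinement described in Lowe's paper: solve a 3×3 linear system for the offset of the extremum, then apply the 0.5 offset test and the 0.03 contrast test. The derivatives are taken as inputs here (in practice they come from finite differences of neighboring DoG samples), the 0.03 threshold assumes pixel values in [0, 1] as in the paper, and the names `solve3` and `refine_keypoint` are hypothetical.

```python
def solve3(A, b):
    """Solve the 3x3 system A x = b by Gaussian elimination with partial
    pivoting. Raises ZeroDivisionError if A is singular."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def refine_keypoint(d, grad, hess, contrast_thresh=0.03):
    """d: DoG value at the sample; grad: [Dx, Dy, Ds]; hess: 3x3 Hessian.

    Returns (offset, refined value, needs_resample, keep)."""
    offset = solve3(hess, [-g for g in grad])          # x_hat = -H^-1 g
    d_hat = d + 0.5 * sum(g * o for g, o in zip(grad, offset))
    needs_resample = any(abs(o) > 0.5 for o in offset)  # move to a neighbor
    keep = abs(d_hat) >= contrast_thresh                # low-contrast test
    return offset, d_hat, needs_resample, keep


# Samples of D = 1 - (x - 0.2)^2 - (y + 0.1)^2 - (s - 0.3)^2 at the origin.
result = refine_keypoint(
    0.86,
    [0.4, -0.2, 0.6],
    [[-2.0, 0.0, 0.0], [0.0, -2.0, 0.0], [0.0, 0.0, -2.0]],
)
```

When `needs_resample` is true, the extremum lies closer to a neighboring sample, and the fit would be repeated around that sample instead.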
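Eliminating edge responses • $H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix}$ • Let $\alpha = r\beta$, where $\alpha \ge \beta$ are the eigenvalues of $H$; then $\frac{\operatorname{Tr}(H)^2}{\operatorname{Det}(H)} = \frac{(\alpha+\beta)^2}{\alpha\beta} = \frac{(r+1)^2}{r}$ • Keep the points with $\frac{\operatorname{Tr}(H)^2}{\operatorname{Det}(H)} < \frac{(r+1)^2}{r}$, using $r = 10$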
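The edge test from Lowe's paper compares the ratio of principal curvatures of the DoG without computing eigenvalues. The sketch below assumes the second derivatives Dxx, Dyy and Dxy have already been estimated at the keypoint; the function name `passes_edge_test` is hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """True if the ratio of principal curvatures is below r.

    Uses Tr(H)^2 / Det(H) < (r + 1)^2 / r on the 2x2 Hessian of the DoG.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures with different signs (or a zero curvature):
        # the point is not a well-defined extremum, so reject it.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r


blob_like = passes_edge_test(-1.0, -1.0, 0.0)   # equal curvatures
edge_like = passes_edge_test(-20.0, -1.0, 0.0)  # one curvature dominates
```

With r = 10 the threshold is 12.1; the blob-like point gives a ratio of 4 and is kept, while the edge-like point gives 22.05 and is rejected.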
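3. Orientation assignment • By assigning a consistent orientation, the keypoint descriptor can be made orientation invariant • For a keypoint, $L$ is the Gaussian-smoothed image with the closest scale • Gradient magnitude: $m(x,y) = \sqrt{(L(x+1,y) - L(x-1,y))^2 + (L(x,y+1) - L(x,y-1))^2}$ • Gradient orientation: $\theta(x,y) = \tan^{-1}\!\left(\frac{L(x,y+1) - L(x,y-1)}{L(x+1,y) - L(x-1,y)}\right)$ • 36-bin orientation histogram over 360° • Each sample is weighted by its gradient magnitude $m$ • The highest peak is the dominant orientation • Any other local peak within 80% of the highest peak creates an additional keypoint with that orientation • About 15% of keypoints have multiple orientations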
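A minimal sketch of the orientation histogram and the 80% peak rule. Assumptions beyond the slide: the Gaussian weighting window (Lowe's paper uses one with σ equal to 1.5 times the keypoint scale, passed here as `sigma_w`), a square rather than circular sampling region, bin centers as the reported orientations (the paper additionally interpolates the peak with a parabola, omitted here), and `L[y][x]` indexing with y pointing down. Both function names are hypothetical.

```python
import math


def orientation_histogram(L, x, y, radius, sigma_w, bins=36):
    """Magnitude- and Gaussian-weighted gradient orientation histogram
    around (x, y) in the smoothed image L (list of rows)."""
    hist = [0.0] * bins
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            dx = L[j][i + 1] - L[j][i - 1]
            dy = L[j + 1][i] - L[j - 1][i]
            m = math.hypot(dx, dy)
            theta = math.degrees(math.atan2(dy, dx)) % 360.0
            w = math.exp(-((i - x) ** 2 + (j - y) ** 2) / (2.0 * sigma_w ** 2))
            hist[int(theta // (360.0 / bins)) % bins] += w * m
    return hist


def dominant_orientations(hist, ratio=0.8):
    """Bin centers (degrees) of local peaks within `ratio` of the highest."""
    n = len(hist)
    top = max(hist)
    peaks = []
    for b in range(n):
        is_local_peak = hist[b] > hist[b - 1] and hist[b] > hist[(b + 1) % n]
        if is_local_peak and hist[b] >= ratio * top:
            peaks.append((b + 0.5) * 360.0 / n)
    return peaks


# Horizontal intensity ramp: every gradient points along +x.
ramp = [[float(i) for i in range(9)] for _ in range(9)]
hist = orientation_histogram(ramp, 4, 4, radius=2, sigma_w=1.5)
orientations = dominant_orientations(hist)
```

Each returned orientation would produce its own keypoint at the same location and scale, which is how a single location ends up with multiple orientations.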
4. Local image descriptor • Image gradients are sampled over a 16×16 array of locations in scale space • Create a 4×4 array of orientation histograms, each with 8 orientation bins • 8 orientations × 4×4 histogram array = 128 dimensions
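The 128-dimensional layout can be illustrated with a simplified sketch. It deliberately omits steps that Lowe's paper includes (rotating the patch to the keypoint orientation, Gaussian weighting, trilinear interpolation between bins, and clamping before renormalization), so it shows only how 16×16 gradient samples map onto a 4×4 grid of 8-bin histograms. The function name `sift_like_descriptor` is hypothetical.

```python
import math


def sift_like_descriptor(mag, ori_deg):
    """Simplified 128-D descriptor from a 16x16 patch.

    mag, ori_deg: 16x16 lists of gradient magnitude and orientation in
    degrees. Each 4x4 block of samples feeds one 8-bin histogram.
    """
    desc = [0.0] * 128
    for r in range(16):
        for c in range(16):
            cell = (r // 4) * 4 + (c // 4)                 # which of 16 cells
            b = int((ori_deg[r][c] % 360.0) // 45.0) % 8   # which of 8 bins
            desc[cell * 8 + b] += mag[r][c]
    norm = math.sqrt(sum(v * v for v in desc))
    # Unit-length normalization reduces the effect of contrast changes.
    return [v / norm for v in desc] if norm > 0 else desc


mag = [[1.0] * 16 for _ in range(16)]
ori = [[0.0] * 16 for _ in range(16)]
descriptor = sift_like_descriptor(mag, ori)
```

The vector length is 4 × 4 × 8 = 128 regardless of the patch content; only the distribution of mass across bins changes.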