A review on: Visual Object Tracking
Alireza Asvadi • Video-Vigilance and Biometrics, University of Coimbra, May 2014
Outline: Definition and application. “Object tracking” from different points of view: Yilmaz et al. 06, Yang et al. 11, David Forsyth & Jean Ponce 12.
What is object tracking? Estimating the trajectory of an object over time by locating its position in every frame.
Applications: motion-based recognition, automated surveillance, video indexing, human-computer interaction, traffic monitoring, vehicle navigation, …
Object tracking from different points of view
How to categorize? There are different points of view: Yilmaz et al. 06, Yang et al. 11, David Forsyth & Jean Ponce 12.
Object Tracking: A Survey (Yilmaz et al. 06)
Object representation
Feature selection for tracking
Object detection: point detectors, background subtraction, segmentation, supervised learning
Object tracking: point tracking, kernel tracking, silhouette tracking
Object representation: Points, primitive geometric shapes, object silhouette and contour, articulated shape models and skeletal models. Object representations are chosen according to the application domain.
Feature selection for tracking: The most desirable property of a visual feature is its uniqueness, so that objects can be easily distinguished in the feature space.
Color: RGB, L*u*v*, L*a*b*, HSV
Edges: less sensitive to illumination changes than color
Optical flow: a field of displacement vectors, computed under the brightness constraint [Horn & Schunck 81]
Texture: a measure of the intensity variation of a surface
Many tracking algorithms use a combination of these features.
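The brightness constraint used for optical flow can be written out explicitly. The following is a sketch in standard notation; the symbols are not in the slides and are assumed here: I(x, y, t) is the image intensity, (u, v) is the displacement (flow) vector per unit time, and subscripts denote partial derivatives. Assuming a pixel keeps its brightness as it moves, a first-order Taylor expansion gives one linear equation per pixel:

```latex
\[
I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t) = I(x, y, t)
\quad\Longrightarrow\quad
I_x u + I_y v + I_t = 0
\]
```

This is one equation in two unknowns, so an additional assumption (such as smoothness of the flow field) is needed to recover (u, v) at every pixel.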
Object detection: Every tracking method requires an object detection mechanism, either in every frame or when the object first appears in the video.
Point detectors: find interest points in images that have an expressive texture in their respective localities (e.g., Harris and SIFT). A desirable interest point is invariant to changes in illumination and camera viewpoint.
Background subtraction: build a representation of the scene, called the background model, and then find deviations from the model in each incoming frame.
Segmentation: partition the image into perceptually similar regions, which can then be used for object detection (e.g., mean shift clustering and graph cut).
Supervised learning: learn different object views automatically from a set of examples by means of a supervised learning mechanism (e.g., SVM).
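The background subtraction idea can be illustrated with a minimal sketch. It assumes a running-average background model and a fixed intensity threshold; the function names, the `alpha` learning rate, and the `threshold` value are hypothetical choices for illustration, not the method of any particular paper.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose intensity deviates from the model by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 3x3 grayscale scene: a static background and one bright "object" pixel.
background = [[10.0] * 3 for _ in range(3)]
frame = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]

mask = foreground_mask(background, frame)          # deviations from the model
background = update_background(background, frame)  # slowly adapt the model
```

A small `alpha` makes the model adapt slowly, so a briefly visible object is not absorbed into the background right away.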
Point tracking: Objects detected in consecutive frames are represented by points.
Deterministic methods: constrain the point correspondence using proximity, maximum velocity (r denotes the radius of the search region), small velocity change, common motion, and rigidity.
Statistical methods: Kalman filter, particle filter
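As an illustration of the deterministic constraints, the sketch below combines proximity and maximum velocity in a greedy nearest-neighbor matcher. It is a simplification under stated assumptions: the function name `match_points` and the parameter `max_radius` are hypothetical, and the greedy pairing is a stand-in for the optimal assignment methods discussed in the survey.

```python
import math


def match_points(prev_points, curr_points, max_radius):
    """Greedily pair previous and current points, closest pairs first.

    A pair is a candidate only if its displacement is within max_radius
    (the maximum-velocity constraint). Returns {prev_index: curr_index}.
    """
    candidates = []
    for i, p in enumerate(prev_points):
        for j, c in enumerate(curr_points):
            d = math.dist(p, c)
            if d <= max_radius:
                candidates.append((d, i, j))
    candidates.sort()  # proximity: shortest displacements are accepted first

    matches, used_prev, used_curr = {}, set(), set()
    for _, i, j in candidates:
        if i not in used_prev and j not in used_curr:
            matches[i] = j
            used_prev.add(i)
            used_curr.add(j)
    return matches


prev_points = [(0, 0), (10, 10)]
curr_points = [(11, 10), (1, 0), (50, 50)]  # (50, 50) is a new entry
matches = match_points(prev_points, curr_points, max_radius=3)
```

Points left unmatched in the current frame can be treated as new entries, and unmatched previous points as exits or misdetections.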
Kernel tracking: The kernel refers to the object shape and appearance. For example, the kernel can be a rectangular template or an elliptical shape with an associated histogram.
Template and density-based appearance models: e.g., template matching and mean shift tracking. An advantage of mean shift is that it eliminates the brute-force search.
Multiview appearance models: An object may appear different from different views. Different views of the object can be learned offline and used for tracking.
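To make the brute-force search concrete, here is a minimal template matching sketch that scores every candidate position with the sum of squared differences (SSD). The function names and the choice of SSD as the similarity measure are assumptions for illustration. Mean shift avoids this exhaustive scan by iteratively moving toward the best-matching location.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and an image window."""
    total = 0
    for r, t_row in enumerate(template):
        for c, t in enumerate(t_row):
            diff = image[top + r][left + c] - t
            total += diff * diff
    return total


def match_template(image, template):
    """Exhaustively scan all windows and return (top, left) of the best match."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            score = ssd(image, template, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best[1], best[2]


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
position = match_template(image, template)
```

In a tracker, the scan is usually restricted to a neighborhood of the object's previous position to reduce the cost.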
Silhouette tracking: Silhouette-based object trackers try to find the object region in each frame by means of an object model generated from the previous frames.
Shape matching: The object model is in the form of an edge map. Shape matching is performed in a way similar to tracking based on template matching.
Contour tracking: Iteratively evolve an initial contour from its position in the previous frame to its new position in the current frame. This contour evolution requires that some part of the object in the current frame overlaps with the object region in the previous frame.
Summary:
Point tracking: Good for finding the geometrical and 3D structure of an object. Point correspondence is a complicated problem, especially in the presence of occlusions, misdetections, entries, and exits of objects.
Kernel tracking: Applicable in real time. One limitation of primitive geometric shapes for object representation is that parts of the object may be left outside the defined shape while parts of the background may reside inside it.
Silhouette tracking: Good for modeling objects with complex shapes. Sensitive to noise. Not capable of dealing with object split and merge.
Recent advances and trends in visual tracking: A review (Yang et al. 11)
Feature descriptors for visual tracking
Online learning based tracking methods: generative methods, discriminative methods
Feature descriptors for visual tracking:
Gradient features (statistical summarization of the gradients): HOG, multi-resolution HOG, SIFT, SURF, …
Color features: CSIFT (concatenation of the hue histogram with the SIFT descriptor; invariant to light intensity)
Texture features: LBP (Local Binary Patterns, a grayscale-invariant texture measure), MB-LBP (multi-scale block LBP), …
Spatio-temporal features: HOG/HOF (histograms of oriented gradients and optic flow), HOG3D (histograms of 3D gradient orientations), ESURF (extends the image SURF descriptor to video), DLBP (dynamical Local Binary Patterns), …
Multiple-feature fusion: HOG-LBP, …
Online learning based tracking methods:
Generative online learning methods: A generative method builds a model to describe the appearance of an object and then finds the object by searching, in each frame, for the region most similar to the reference model. The object model is often updated online to adapt to appearance changes. Generative methods can easily fail against a cluttered background.
Discriminative online learning methods: Discriminative methods pose object tracking as a binary classification problem, in which the task is to determine a decision boundary that distinguishes the object from the background without the need for a complex model characterizing the object. To handle appearance changes, the classifier is updated incrementally over time. A major shortcoming of discriminative methods is their sensitivity to noise.
Some methods combine the two approaches.
Computer Vision: A Modern Approach, 2nd Edition, Ch. 11, Tracking (Forsyth et al. 12)
Object tracking: tracking by detection, tracking with dynamics, applying data association
Tracking by detection: If we have a strong model of the object, we can detect the object independently in each frame and record its position over time.
Problems: occlusions, similar objects.
[Figure: detections of the object at time t, t+1, …, t+n]
Tracking with dynamics: observation (detected object) + dynamics
Key idea: Given a model of expected motion, predict where objects will occur in the next frame.
Filtering problem: estimate c based on prediction a and measurement b.
The Kalman filter: The information from the prediction and the measurement is combined to provide the best possible estimate of the object's location (the location of a train, in Faragher's example). The combination stays tractable because the product of two Gaussian functions is another Gaussian function.
R. Faragher, “Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation,” IEEE Signal Processing Magazine, September 2012.
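The fusion of a prediction and a measurement can be sketched in one dimension. The code below assumes both are Gaussians described by a mean and a variance, and that the motion model is constant velocity; the function names `predict` and `fuse` and the `process_var` parameter are hypothetical. The fused mean and variance are exactly those of the (normalized) product of the two Gaussians.

```python
def predict(mean, var, velocity, dt, process_var):
    """Propagate the state with a constant-velocity model; uncertainty grows."""
    return mean + velocity * dt, var + process_var


def fuse(pred_mean, pred_var, meas_mean, meas_var):
    """Combine prediction and measurement (product of two 1D Gaussians)."""
    gain = pred_var / (pred_var + meas_var)  # Kalman gain in one dimension
    mean = pred_mean + gain * (meas_mean - pred_mean)
    var = (1 - gain) * pred_var
    return mean, var


# Previous estimate: position 8.0 with variance 3.0, moving at 2.0 units/frame.
pred_mean, pred_var = predict(8.0, 3.0, velocity=2.0, dt=1.0, process_var=1.0)

# A detector reports position 12.0 with variance 4.0.
est_mean, est_var = fuse(pred_mean, pred_var, meas_mean=12.0, meas_var=4.0)
```

The fused variance is never larger than either input variance, which is why combining the two sources gives a better estimate than using the prediction or the measurement alone.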
Problems: This is still not good enough. So far, we have assumed that the entire measurement is relevant to determining the state. In reality, some measurements may be uninformative, and some may belong to different tracked objects.
Data association: the task of determining which measurements go with which tracks.
Tracking matching: the match should be close to the predicted position.
Gating: omit measurements outside the gate.
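Gating and matching can be sketched as follows. This is a simplified illustration that assumes a circular gate with a fixed Euclidean radius around the predicted position; the function names and `gate_radius` are hypothetical. Practical trackers often size the gate from the filter's predicted uncertainty instead of using a fixed radius.

```python
import math


def gate(predicted, measurements, gate_radius):
    """Keep only the measurements that fall inside the gate."""
    return [m for m in measurements if math.dist(predicted, m) <= gate_radius]


def associate(predicted, measurements, gate_radius):
    """Return the gated measurement closest to the prediction, or None."""
    inside = gate(predicted, measurements, gate_radius)
    if not inside:
        return None  # no plausible measurement: rely on the prediction alone
    return min(inside, key=lambda m: math.dist(predicted, m))


predicted = (5.0, 5.0)
measurements = [(5.5, 5.0), (9.0, 9.0), (4.0, 6.0)]

kept = gate(predicted, measurements, gate_radius=2.0)
match = associate(predicted, measurements, gate_radius=2.0)
```

When no measurement survives the gate, for example during an occlusion, the track can be continued from the predicted position until a plausible measurement reappears.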
Tracking = detection (observation) + dynamics + data association
[Figure: applying data association (gating); values predicted by the Kalman filter are shown in green]
Summary:
Tracking by detection: detect the object independently in each frame. tracking = detection. Detection methods: point detectors, template matching, density-based appearance models, background subtraction, …
Tracking with dynamics: incorporate object dynamics into tracking. tracking = detection (observation) + dynamics. Methods: filtering methods, Kalman filter, …
Applying data association: eliminate highly unlikely measurements. tracking = detection (observation) + dynamics + data association. Methods: tracking matching, gating, …
References:
A. Yilmaz, O. Javed, and M. Shah, “Object tracking: A survey,” ACM Computing Surveys, Vol. 38, No. 4, pp. 1–45, December 2006.
H. Yang, L. Shao, F. Zheng, L. Wang, and Z. Song, “Recent advances and trends in visual tracking: A review,” Neurocomputing, Vol. 74, No. 18, pp. 3823–3831, 2011.
D. A. Forsyth and J. Ponce, “Computer Vision: A Modern Approach,” 2nd Edition, Prentice Hall, 2012.
R. Faragher, “Understanding the Basis of the Kalman Filter Via a Simple and Intuitive Derivation,” IEEE Signal Processing Magazine, September 2012.