Improved Initialisation and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference

Vibhav Vineet, Jonathan Warrell, Paul Sturgess, Philip H.S. Torr

Presentation Transcript


  1. Improved Initialisation and Gaussian Mixture Pairwise Terms for Dense Random Fields with Mean-field Inference Vibhav Vineet, Jonathan Warrell, Paul Sturgess, Philip H.S. Torr http://cms.brookes.ac.uk/research/visiongroup/

  2. Labelling Problem Assign a label to each image pixel, e.g. for object segmentation, stereo, or object detection.

  3. Problem Formulation Find a labelling that maximizes the conditional probability, or equivalently minimizes the energy function (see the standard form below).
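For reference, the standard dense-CRF formulation this slide alludes to (the specific potentials used are defined on later slides):

$$P(\mathbf{x} \mid \mathbf{D}) = \frac{1}{Z} \exp(-E(\mathbf{x})), \qquad E(\mathbf{x}) = \sum_i \psi_u(x_i) + \sum_{i<j} \psi_p(x_i, x_j), \qquad \mathbf{x}^* = \arg\max_{\mathbf{x}} P(\mathbf{x} \mid \mathbf{D}).$$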

  4. Problem Formulation Grid CRF leads to over-smoothing around boundaries. [Figure: grid CRF construction and the resulting inference]

  5. Problem Formulation Grid CRF leads to over-smoothing around boundaries, while a dense CRF is able to recover fine boundaries. [Figure: grid CRF vs. dense CRF construction and inference]

  6. Inference in Dense CRF Inference has very high time complexity, so graph-cuts based methods are not feasible: alpha-expansion takes almost 1200 secs per image with a neighbourhood size of 15 on the PascalVOC segmentation dataset.

  7. Inference in Dense CRF • Filter-based mean-field inference takes 0.2 secs per image* • Efficient inference rests on two assumptions • Mean-field approximation to the CRF • Pairwise potentials take the form of Gaussian kernels *Krähenbühl et al., Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 2011

  8. Efficient inference in dense CRF • Mean-field methods (Jordan et al., 1999) • Inference with the true distribution P is intractable • Approximate P with a distribution Q from a tractable family

  9. Naïve mean field • Assume all variables are independent, so Q fully factorises (see the form below)
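Under the naïve mean-field assumption the approximating distribution factorises over pixels, and Q is chosen from this family to minimise the KL divergence to P, as in the cited NIPS 2011 paper:

$$Q(\mathbf{x}) = \prod_i Q_i(x_i), \qquad Q^* = \arg\min_{Q} \mathrm{KL}(Q \,\|\, P).$$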

  10. Efficient inference in dense CRF • Assume the pairwise weights are a mixture of Gaussian kernels, with a spatial kernel and a bilateral kernel (see below)
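In the cited NIPS 2011 model, the pairwise potential is a label-compatibility term times a mixture of Gaussian kernels over features $\mathbf{f}_i$ (pixel positions $p_i$ and colours $I_i$):

$$\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_m w^{(m)} k^{(m)}(\mathbf{f}_i, \mathbf{f}_j),$$

with a spatial (smoothness) kernel $\exp\!\big(-\|p_i - p_j\|^2 / 2\theta_\gamma^2\big)$ and a bilateral (appearance) kernel $\exp\!\big(-\|p_i - p_j\|^2 / 2\theta_\alpha^2 - \|I_i - I_j\|^2 / 2\theta_\beta^2\big)$.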

  11. Marginal update • The marginal update involves an expectation of the cost over the distribution Q, given that x_i takes label l (see the update below) • The expensive message-passing step is solved using a highly efficient permutohedral-lattice-based filtering approach • The final labelling is the maximum posterior marginal (MPM) under the approximate distribution Q
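The filter-based update from the cited paper, which each iteration on the following slides applies, has the form

$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i = l) - \sum_{l' \in \mathcal{L}} \mu(l, l') \sum_m w^{(m)} \sum_{j \neq i} k^{(m)}(\mathbf{f}_i, \mathbf{f}_j)\, Q_j(l') \Big\},$$

where the inner sum over j is the Gaussian filtering step carried out on the permutohedral lattice, and the MPM labelling is $x_i^* = \arg\max_l Q_i(l)$.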

  12. Q distribution Q distribution for different classes across different iterations on the CamVid dataset. [Figure: per-class Q values at iteration 0; y-axis from 0 to 1]

  13. Q distribution [Figure: per-class Q values at iteration 1; y-axis from 0 to 1]

  14. Q distribution [Figure: per-class Q values at iteration 2; y-axis from 0 to 1]

  15. Q distribution [Figure: per-class Q values at iteration 10; y-axis from 0 to 1]

  16. Q distribution [Figure: per-class Q values side by side at iterations 0, 1, 2 and 10 on the CamVid dataset]

  17. Two issues associated with the method • Sensitive to initialisation • Restrictive Gaussian pairwise weights

  18. Our Contributions Resolve two issues associated with the method • Sensitive to initialisation • Propose SIFT-flow based initialisation method • Restrictive Gaussian pairwise weights • Expectation maximisation (EM) based strategy to learn more general Gaussian mixture model

  19. Sensitivity to initialisation Experiment on the PascalVOC-10 segmentation dataset: we observe an improvement of almost 13% in the intersection/union (I/U) score when initialising the mean-field inference with the ground-truth labelling • A good initialisation can lead to a better solution • We therefore propose a better, SIFT-flow based initialisation method

  20. SIFT-flow based correspondence Given a test image, we first retrieve a set of nearest neighbours from the training set using GIST features (a sketch of this retrieval step follows). [Figure: test image and nearest neighbours retrieved from the training set]
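A minimal sketch of the retrieval step, assuming precomputed GIST descriptors are available as arrays (the descriptor extraction itself, and the value of k, are not specified on the slide):

    import numpy as np

    def retrieve_nearest_neighbours(test_feat, train_feats, k=5):
        """Return indices of the k training images whose GIST features
        are closest (Euclidean distance) to the test image's features."""
        dists = np.linalg.norm(train_feats - test_feat, axis=1)
        return np.argsort(dists)[:k]

    # train_feats: (N, D) array of GIST descriptors for the training set
    # test_feat:   (D,)  GIST descriptor of the test image
    # neighbours = retrieve_nearest_neighbours(test_feat, train_feats)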

  21. SIFT-flow based correspondence The K nearest neighbours are warped to the test image. [Figure: test image with warped nearest neighbours and their corresponding flow energies, e.g. 13.31, 14.31, 18.38, 22, 23.31, 27.2, 30.87]

  22. SIFT-flow based correspondence Pick the best nearest neighbour based on the flow value (the lowest flow energy). [Figure: test image, best nearest neighbour and warped image; flow: 13.31]

  23. Label transfer Warp the ground truth of the best nearest neighbour according to the SIFT-flow correspondence, transferring labels from the top-1 match (a sketch of the warping step follows). [Figure: ground truth of the test image, ground truth of the best nearest neighbour, and the ground truth warped according to the flow]
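A minimal sketch of the label-transfer warp, under the assumption that the SIFT-flow output is a per-pixel displacement field (u, v) mapping test-image pixels to neighbour-image pixels; this is an illustrative reconstruction, not the authors' code:

    import numpy as np

    def warp_labels(neighbour_gt, flow_u, flow_v):
        """Warp the neighbour's ground-truth label map onto the test
        image grid: each test pixel (y, x) takes the label found at the
        flow-displaced position (y + v, x + u) in the neighbour image."""
        h, w = flow_u.shape
        ys, xs = np.mgrid[0:h, 0:w]
        src_y = np.clip(ys + flow_v.round().astype(int), 0, h - 1)
        src_x = np.clip(xs + flow_u.round().astype(int), 0, w - 1)
        return neighbour_gt[src_y, src_x]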

  24. SIFT-flow based initialisation Rescore the unary potentials: a factor s rescores the unary potential of a variable based on the label observed after the label-transfer stage, and is set through cross-validation (a sketch follows). [Figure: test image, ground truth, and outputs with and without rescoring] Qualitative improvement in accuracy after using the rescored unary potentials.
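A minimal sketch of the rescoring step, under the assumption (flagged as such; the paper's exact rule may differ) that the unary cost of the transferred label at each pixel is scaled by the cross-validated factor s:

    import numpy as np

    def rescore_unaries(unary, transferred_labels, s=0.5):
        """unary: (H, W, L) per-pixel label costs; transferred_labels:
        (H, W) labels from SIFT-flow label transfer. The cost of the
        transferred label at each pixel is multiplied by s < 1, making
        that label cheaper (s is an assumed form of the rescoring)."""
        h, w, _ = unary.shape
        ys, xs = np.mgrid[0:h, 0:w]
        out = unary.copy()
        out[ys, xs, transferred_labels] *= s
        return out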

  25. SIFT-flow based initialisation Initialise the mean-field solution with the transferred labelling. [Figure: test image, ground truth, and outputs with and without initialisation] Qualitative improvement in accuracy after initialising the mean-field

  26. Gaussian pairwise weights Experiment on the PascalVOC-10 segmentation dataset: we plot the distribution of class-class interactions by selecting random pairs of points (i, j). [Figure: feature-difference distributions for Aeroplane-Aeroplane, Car-Person and Horse-Person pairs]

  27. Gaussian pairwise weights Experiment on the PascalVOC-10 segmentation dataset: such complex structure in the data cannot be captured by a zero-mean Gaussian; the distributions are spread horizontally, spread vertically, and not centred around zero. We propose an EM-based learning strategy to incorporate a more general class of Gaussian mixture models

  28. Our model Our energy function takes the form sketched below. We use separate weights for label pairs, but the Gaussian components are shared. We follow a piecewise learning strategy to learn the parameters of our energy function
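A hedged reconstruction of the energy this slide describes, based on the surrounding slides (separate mixture weights per label pair, shared Gaussian components with means $\boldsymbol{\mu}_m$ and covariances $\Sigma_m$; the paper's exact notation may differ):

$$E(\mathbf{x}) = \sum_i \psi_u(x_i) + \sum_{i<j} \sum_{m=1}^{M} w^{(m)}(x_i, x_j)\, \mathcal{N}(\mathbf{f}_i - \mathbf{f}_j \mid \boldsymbol{\mu}_m, \Sigma_m).$$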

  29. Learning mixture model • Learn the parameters in a similar way to this model* *Krähenbühl et al., Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 2011

  30. Learning mixture model • Learn the parameters in a similar way to this model* • Learn the parameters of the Gaussian mixture: means, standard deviations and mixing coefficients *Krähenbühl et al., Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 2011

  31. Learning mixture model • Learn the parameters in a similar way to this model* • Learn the parameters of the Gaussian mixture: means, standard deviations and mixing coefficients • λ is set through cross-validation *Krähenbühl et al., Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, NIPS 2011

  32. Our model • We follow a generative training approach • Maximise the joint likelihood of pairs of labels and features, with a latent variable giving the cluster (mixture-component) assignment • We use an expectation-maximisation (EM) based method to maximise this likelihood (a sketch follows)
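A minimal EM sketch for fitting a Gaussian mixture to pairwise feature differences, assuming diagonal covariances for brevity (the paper also shares components across label pairs; that tying is omitted here):

    import numpy as np

    def em_gmm(x, m=3, iters=50, seed=0):
        """Fit an m-component diagonal-covariance GMM to data x of
        shape (n, d) by EM; returns mixing coefficients, means and
        per-dimension variances."""
        rng = np.random.default_rng(seed)
        n, d = x.shape
        pi = np.full(m, 1.0 / m)                 # mixing coefficients
        mu = x[rng.choice(n, m, replace=False)]  # means, init from data
        var = np.ones((m, d)) * x.var(axis=0)    # diagonal variances
        for _ in range(iters):
            # E-step: responsibilities r[i, k] ∝ pi_k N(x_i | mu_k, var_k)
            log_p = -0.5 * (((x[:, None, :] - mu) ** 2) / var
                            + np.log(2 * np.pi * var)).sum(axis=2)
            log_r = np.log(pi) + log_p
            log_r -= log_r.max(axis=1, keepdims=True)  # for stability
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate parameters from responsibilities
            nk = r.sum(axis=0)
            pi = nk / n
            mu = (r.T @ x) / nk[:, None]
            var = (r.T @ (x ** 2)) / nk[:, None] - mu ** 2 + 1e-6
        return pi, mu, var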

  33. Learning mixture model Our model is able to capture the true distribution of class-class interactions. [Figure: learned mixture fits for Aeroplane-Aeroplane, Car-Person and Horse-Person pairs]

  34. Inference with mixture model • Involves evaluating M extra Gaussian terms • Blurring is performed on mean-shifted points (see below) • This increases the time complexity
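Why mean-shifting suffices: a non-zero-mean Gaussian over feature differences equals a zero-mean Gaussian after shifting one side's features, so the same permutohedral-lattice blur applies; writing $\mathbf{f}_i' = \mathbf{f}_i - \boldsymbol{\mu}_m$,

$$\mathcal{N}(\mathbf{f}_i - \mathbf{f}_j \mid \boldsymbol{\mu}_m, \Sigma_m) = \mathcal{N}(\mathbf{f}_i' - \mathbf{f}_j \mid \mathbf{0}, \Sigma_m),$$

so each of the M components costs one additional filtering pass over the shifted points.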

  35. Experiments on CamVid Q distribution for the building class on the CamVid dataset: the confidence of building pixels increases with initialisation. [Figure: iteration 0; ground truth, without initialisation, with initialisation; y-axis from 0 to 1]

  36. Experiments on CamVid [Figure: iteration 1; ground truth, without initialisation, with initialisation]

  37. Experiments on CamVid [Figure: iteration 2; ground truth, without initialisation, with initialisation]

  38. Experiments on CamVid [Figure: iteration 10; ground truth, without initialisation, with initialisation]

  39. Experiments on CamVid [Figure: image 2; ground truth, without initialisation, with initialisation] The building is properly recovered with our initialisation strategy

  40. Experiments on CamVid Quantitative results on the CamVid dataset • Our model with unary and pairwise terms achieves better accuracy than other, more complex models • It generally achieves very high efficiency compared to other methods

  41. Experiments on CamVid Qualitative results on the CamVid dataset. [Figure: image, ground truth, alpha-expansion output, our output] Able to recover building and tree properly

  42. Experiments on PascalVOC-10 Qualitative results of the SIFT-flow method. [Figure: image, ground truth, warped nearest-neighbour ground truth, output without SIFT-flow, output with SIFT-flow] Able to recover missing body parts

  43. Experiments on PascalVOC-10 Quantitative results on the PascalVOC-10 segmentation dataset • Our model with unary and pairwise terms achieves better accuracy than other, more complex models • It generally achieves very high efficiency compared to other methods

  44. Experiments on PascalVOC-10 Qualitative results on the PascalVOC-10 segmentation dataset. [Figure: image, ground truth, dense-CRF alpha-expansion output, our output] Able to recover missing objects and body parts

  45. Conclusion • Filter-based mean-field inference promises high efficiency and accuracy • We proposed methods to robustify the basic mean-field method • A SIFT-flow based method for better initialisation • An EM-based algorithm for learning a more general Gaussian mixture model • More complex higher-order models can be incorporated into the pairwise model

  46. Thank you 
