
Effective measurement selection in truncated Kernel density estimator


Presentation Transcript


  1. Effective measurement selection in truncated Kernel density estimator Yoon, Ji Won School of Computer Science and Statistics, Trinity College Dublin [working with Hyoung-joo Lee (Oxford, UK) and Hyoungshick Kim (Cambridge, UK)] ICUIMC, 23 Feb 2011

  2. Clustering

  3. Clustering

  4. Non-parametric clustering • The Mean Shift algorithm (MS) is • one of the simplest non-parametric clustering algorithms; • based on the Parzen window technique of kernel density estimation in statistics. • Applications of MS • in computer vision and clustering • Image segmentation • De-noising

  5. Probability Density Function (PDF) [Figure: the exact PDF]

  6. Mean Shift Algorithm on a PDF • Simple interpretation of the Mean Shift algorithm • MS clustering is equivalent to a MAP approach that finds a local optimum of the density for each data point.
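To make the MAP reading concrete, the mode-seeking step can be written as below; this is a standard formulation with assumed notation, not copied from the slides:

```latex
% Each data point x_i is shifted uphill until it reaches a nearby local
% maximum (mode) of the density f; points sharing a mode form one cluster.
\hat{x}_i = \arg\max_{x \in B(x_i)} f(x)
% where B(x_i) denotes the basin of attraction that x_i climbs within.
```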

  7. Non-parametric Density Estimation • In many cases the model is not known, so we do not know the PDF. • We cannot do MAP estimation with an unknown PDF. • How can we perform the MAP operation? • Alternatively, we reconstruct an approximated PDF. • Hint: use the data • via Kernel Density Estimation (a non-parametric approach) with a Parzen window

  8. Approximated PDF via KDE • Using only the data, we reconstruct the underlying PDF.
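For reference, the Parzen-window estimator referred to above has the standard form below; the symbols h (bandwidth), K (kernel), d (dimension), and n (sample size) are conventional notation assumed here:

```latex
% Kernel density estimate reconstructed from the samples x_1, ..., x_n:
\hat{f}_h(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)
```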

  9. Mean Shift Algorithm • The mean shift vector is m(x) = [ Σi xi K((x - xi)/h) ] / [ Σi K((x - xi)/h) ] - x, i.e., the shift from the current mean x to the kernel-weighted average of its neighbours.
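As a concrete illustration, here is a minimal Python sketch of this update with a flat (hard-truncated) kernel; the function names and the flat-kernel choice are assumptions for the example, not taken from the paper:

```python
import numpy as np

def mean_shift_step(x, data, h):
    """One mean-shift update with a truncated (flat) kernel of radius h:
    the new mean is the average of the data points inside the window."""
    dist = np.linalg.norm(data - x, axis=1)
    window = data[dist <= h]          # hard cut-off: the truncated kernel
    return window.mean(axis=0) if len(window) else x

def mean_shift(data, h, tol=1e-6, max_iter=200):
    """Run the update from every data point until the mean is stationary;
    the converged positions are the modes (cluster representatives)."""
    modes = np.empty_like(data)
    for k, x in enumerate(data):
        for _ in range(max_iter):
            x_new = mean_shift_step(x, data, h)
            if np.linalg.norm(x_new - x) < tol:
                break
            x = x_new
        modes[k] = x
    return modes
```

Note how a point with no neighbours inside its window never moves; this is exactly the isolated-noise behaviour discussed on slide 11.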

  10. Mean Shift Algorithm [Figure: iterative updates, from the previous mean to the updated mean, ending at the stationary final mean (a local optimum)]

  11. Question about the Mean Shift Algorithm • It is known that MS can be useful for de-noising by clustering the data points. • However, it does not always work, because • noise points can be located in isolation. • Such noise often cannot be merged into the main clusters. • Instead, each noise point forms its own isolated cluster whenever it is farther from its neighbouring points than the bandwidth (or truncation radius). [Figure: desired position vs. unwanted positions of the converged modes]

  12. Bandwidth selection problems in Kernel Density Estimation • Bandwidth selection is critical in MS.
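For illustration only, one widely used heuristic for choosing h is Silverman's rule of thumb; this is a general-purpose default, not the selection method studied here:

```python
import numpy as np

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth for a d-dimensional Gaussian kernel:
    h = sigma * (4 / ((d + 2) * n)) ** (1 / (d + 4)),
    with sigma the average per-dimension sample standard deviation."""
    n, d = data.shape
    sigma = data.std(axis=0, ddof=1).mean()
    return sigma * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
```

Rules of thumb like this assume roughly Gaussian data and tend to over-smooth multimodal densities, which is one reason bandwidth choice is so delicate for MS.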

  13. Bandwidth selection problems in Kernel Density Estimation • Q: Can we remove this separation effect in MS? • = Use the geometric structure to pull such distant data points into one of the major clusters. • → Voronoi Mean Shift

  14. Voronoi Diagram • Subdivision of the plane (space) into cells • S = {S1, S2, …, Sn}: points (sites) in the plane • V(Si) = { x : d(x, Si) < d(x, Sj) for all j ≠ i }, i.e., the set of positions x whose nearest site is Si.
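A small sketch of this definition using SciPy (assuming scipy.spatial is available); the nearest-neighbour query implements the membership test x ∈ V(Si):

```python
import numpy as np
from scipy.spatial import Voronoi, cKDTree

points = np.random.rand(20, 2)     # the sites S = {S1, ..., Sn}
vor = Voronoi(points)              # cell geometry: vor.vertices, vor.regions

# Membership test from the definition: x lies in V(Si) exactly when Si is
# the nearest site to x, so a nearest-neighbour query finds x's cell.
tree = cKDTree(points)
x = np.array([0.5, 0.5])
_, i = tree.query(x)               # index i of the site whose cell contains x
```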

  15. Voronoi Kernel for MS [Figure: conventional truncated kernel vs. Voronoi-based truncated kernel]

  16. Voronoi Mean Shift (VMS) • Finding the relevant points (those whose Voronoi regions overlap the window) • → an effective greedy algorithm (a sketch follows this slide) • Three cases are distinguished: • inner points • outer points (case 1) • outer points (case 2) [Figure: (a) inner points, (b) outer points (case 1), (c) outer points (case 2)]
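The sketch below illustrates one plausible greedy search of this kind, built on SciPy's Voronoi structures. It is an approximation for illustration, not the paper's algorithm: a cell is flagged as relevant when its site or one of its finite Voronoi vertices falls inside the window B(x, h), and the search grows outward over cell adjacencies.

```python
import numpy as np
from collections import deque
from scipy.spatial import Voronoi, cKDTree

def relevant_sites(x, points, h):
    """Greedy search for sites whose Voronoi cell may overlap B(x, h).
    Approximate: a cell counts as overlapping when its site or one of its
    finite Voronoi vertices lies within h of x."""
    vor = Voronoi(points)
    adj = {i: set() for i in range(len(points))}
    for a, b in vor.ridge_points:          # sites sharing a ridge are adjacent
        adj[a].add(b)
        adj[b].add(a)

    def touches_window(i):
        if np.linalg.norm(points[i] - x) <= h:      # site inside the window
            return True
        region = vor.regions[vor.point_region[i]]
        return any(np.linalg.norm(vor.vertices[v] - x) <= h
                   for v in region if v != -1)      # skip vertices at infinity

    _, start = cKDTree(points).query(x)    # x lies in this site's cell
    keep, seen, queue = set(), {start}, deque([start])
    while queue:
        i = queue.popleft()
        if i == start or touches_window(i):  # the start cell contains x itself
            keep.add(i)
            for j in adj[i] - seen:          # expand the frontier greedily
                seen.add(j)
                queue.append(j)
    return keep
```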

  17. Testing Performance • KL divergence • Estimation of the KL divergence via importance sampling (sketched below)
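A minimal sketch of the importance-sampling estimator; the density handles p, q, g_pdf and the sampler g_sample are hypothetical callables assumed for the example:

```python
import numpy as np

def kl_importance(p, q, g_pdf, g_sample, n=100_000, rng=None):
    """Estimate KL(p || q) = E_p[log(p/q)] without sampling from p:
    draw x ~ g and reweight, KL ≈ (1/n) Σ (p(x)/g(x)) log(p(x)/q(x))."""
    rng = rng or np.random.default_rng()
    x = g_sample(n, rng)          # samples from the proposal g
    px, qx = p(x), q(x)
    w = px / g_pdf(x)             # importance weights p/g
    return float(np.mean(w * np.log(px / qx)))
```

Here p would be the exact target density and q the estimated one (or vice versa), with g a convenient proposal covering the support of p.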

  18. Results • Synthetic datasets [Figure: (a) Gaussian, (b) Banana]

  19. Results • Synthetic datasets (Gaussian) [Figure: (a) MS, (b) VMS]

  20. Results • Synthetic datasets (Gaussian) [Figure: (a) MS, (b) VMS; here h = 0.5]

  21. Results • Synthetic datasets (Gaussian) [Figure: (a) sampling area (h = 0.1), (b) KL divergence]

  22. Results • The number of clusters (synthetic datasets) [Figure: (a) Gaussian, (b) Banana]

  23. Results • Real experimental datasets [Figure: (a) original images, (b) noisy images]

  24. Results [Figure: MS vs. VMS; (a) filtered images, (b) filtered images excluding clusters with fewer than 5 pixels]

  25. Conclusion and Discussion • Advantages of VMS: • a greedy algorithm for a nonlinear gating (windowing) scheme; • data points are assigned to one of the major clusters even with a small gating size; • the target density is estimated more accurately, as measured by KL divergence. • Disadvantages of VMS: • time complexity: VMS requires extra processing time for • building the Voronoi diagram before running MS and • finding the relevant points from the Voronoi map; • blurring effect: the Voronoi kernel is always larger than the conventional kernel → an over-smoothing effect.

  26. Thanks • If you have any further questions, please feel free to contact me! • yoonj@tcd.ie or http://www.cs.tcd.ie/~yoonj
