
Clustering II


Presentation Transcript


  1. Clustering II CMPUT 466/551 Nilanjan Ray

  2. Mean-shift Clustering • Will show slides from: http://www.wisdom.weizmann.ac.il/~deniss/vision_spring04/files/mean_shift/mean_shift.ppt
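  For reference, the mean-shift update repeatedly moves a point to the kernel-weighted mean of the data around it; points that converge to the same mode form one cluster. A minimal MATLAB sketch with a Gaussian kernel (the bandwidth h, iteration cap, and tolerance are illustrative assumptions, not taken from the linked slides):

    % X: N-by-d data, h: kernel bandwidth (assumed); returns one mode per point
    function modes = mean_shift(X, h)
        modes = X;
        for i = 1:size(X, 1)
            x = X(i, :);
            for iter = 1:100                              % iteration cap (assumed)
                w = exp(-sum((X - x).^2, 2) / (2*h^2));   % Gaussian weights (R2016b+ implicit expansion)
                x_new = (w' * X) / sum(w);                % kernel-weighted mean of the data
                if norm(x_new - x) < 1e-6, break; end     % converged to a mode
                x = x_new;
            end
            modes(i, :) = x;
        end
    end

  Points whose modes coincide (up to a small distance) are then assigned to the same cluster.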

  3. Spectral Clustering • Let’s visit a serious issue with K-means • K-means tries to find compact, hyper-ellipsoidal structures • What if the clusters are not compact and ellipsoid-like? K-means fails. • What can we do? Spectral clustering can be a remedy here.
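  As a concrete illustration of the failure mode (a sketch with made-up data; kmeans is from the Statistics Toolbox), two concentric rings form two obvious clusters that are not compact blobs, and K-means splits them with a straight line instead:

    t = 2*pi*rand(200, 1);                      % random angles
    ring = [cos(t) sin(t)];
    X = [ring; 3*ring] + 0.05*randn(400, 2);    % inner ring (radius 1) and outer ring (radius 3)
    labels = kmeans(X, 2);                      % mixes the two rings: clusters are not ellipsoidal

  Spectral clustering with a suitable similarity graph (next slides) recovers the two rings.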

  4. Basic Spectral Clustering • Form a similarity matrix with entries wij for all pairs of observations i, j. • This defines a dense graph with the data points as the vertex set; the strength of edge (i, j) is wij, the similarity between the ith and jth observations. • Clustering can then be conceived as partitioning this graph into connected components such that within a component the edge weights are large, whereas across components they are small.
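  One common choice (an assumption; the slide does not fix a kernel) is a Gaussian similarity on squared pairwise distances, built here in base MATLAB:

    % X: N-by-d observations, sigma: scale parameter (assumed)
    sq = sum(X.^2, 2);
    D2 = max(sq + sq' - 2*(X*X'), 0);   % squared pairwise distances, clipped at 0
    W  = exp(-D2 / (2*sigma^2));        % w_ij is large for nearby observations

  Smaller sigma makes the graph effectively sparser, since far-apart pairs receive negligible weight.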

  5. Basic Spectral Clustering… • Form the Laplacian of this graph: L = G − W, where G is a diagonal matrix with entries g_ii = Σ_j w_ij. • L is positive semi-definite and has a constant eigenvector (all 1’s) with zero eigenvalue. • Find the m smallest eigenvectors Z = [z1 z2 … zm] of L, ignoring the constant eigenvector. • Cluster the N observations (say, by K-means) using the rows of the matrix Z as features.
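  Slide 5 as runnable MATLAB (a sketch; m is the desired number of clusters, and kmeans is from the Statistics Toolbox):

    G = diag(sum(W, 2));          % g_ii = sum_j w_ij
    L = G - W;                    % graph Laplacian
    [V, D] = eig(L);
    [~, idx] = sort(diag(D));     % eigenvalues in ascending order
    Z = V(:, idx(2:m+1));         % m smallest eigenvectors, skipping the constant one
    labels = kmeans(Z, m);        % cluster the N rows of Z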

  6. Why Spectral Clustering Works
  Insight 1: The graph cut cost for a label vector f is f^T L f = (1/2) Σ_{i,j} w_ij (f_i − f_j)^2. So a small value of f^T L f is obtained when pairs of points with large adjacencies w_ij receive the same (or similar) labels.
  Insight 2: The constant eigenvector corresponding to the 0 eigenvalue is a trivial solution that puts all N observations into a single cluster. If a graph has K connected components, the nodes of the graph can be reordered so that L is block diagonal with K diagonal blocks; then L has the zero eigenvalue with multiplicity K, one for each connected component, and the corresponding eigenvectors are indicator vectors identifying these connected components. In reality we only have weak and strong edges, so we look for small eigenvalues rather than exact zeros.
  Combining Insights 1 and 2: Choose the eigenvectors corresponding to small eigenvalues and cluster their rows into K classes.
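  The cut-cost identity in Insight 1 is easy to verify numerically (a quick sketch; f is an arbitrary test vector):

    f = randn(size(W, 1), 1);
    lhs = f' * L * f;
    rhs = 0.5 * sum(sum(W .* (f - f').^2));   % (1/2) sum_ij w_ij (f_i - f_j)^2
    % lhs and rhs agree up to rounding error for symmetric W

  In particular, if f is a 0/1 indicator vector of a tentative cluster, f^T L f is exactly the total weight of the edges that the labeling cuts.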

  7. A Tiny Example: A Perfect World
  W = [1.0000  0.5000  0       0     ;
       0.5000  1.0000  0       0     ;
       0       0       1.0000  0.8000;
       0       0       0.8000  1.0000];
  We observe two classes, each with 2 observations; W is a perfect block-diagonal matrix here. The Laplacian is
  L = [ 0.5000 -0.5000  0       0     ;
       -0.5000  0.5000  0       0     ;
        0       0       0.8000 -0.8000;
        0       0      -0.8000  0.8000];
  Eigenvalues of L: 0, 0, 1, 1.6. Eigenvectors corresponding to the two 0 eigenvalues: [-0.7071 -0.7071 0 0] and [0 0 -0.7071 -0.7071], i.e., indicator vectors for the two components.
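  These numbers can be reproduced directly (and swapping in slide 8’s W reproduces the next example):

    W = [1.0  0.5  0    0  ;
         0.5  1.0  0    0  ;
         0    0    1.0  0.8;
         0    0    0.8  1.0];
    L = diag(sum(W, 2)) - W;   % G - W, with g_ii = sum_j w_ij
    [V, D] = eig(L)            % eigenvalues 0, 0, 1, 1.6 on the diagonal of D

  Note that the diagonal self-similarities w_ii = 1 cancel in G − W, so they do not affect L.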

  8. The Real World Tiny Example
  W = [1.0000  0.5000  0.0500  0.1000;
       0.5000  1.0000  0.0800  0.0400;
       0.0500  0.0800  1.0000  0.8000;
       0.1000  0.0400  0.8000  1.0000];
  L = [ 0.6500 -0.5000 -0.0500 -0.1000;
       -0.5000  0.6200 -0.0800 -0.0400;
       -0.0500 -0.0800  0.9300 -0.8000;
       -0.1000 -0.0400 -0.8000  0.9400];
  [V, D] = eig(L)
  V = [0.5000  0.4827 -0.7169  0.0557;
       0.5000  0.5170  0.6930 -0.0498;
       0.5000 -0.5027  0.0648  0.7022;
       0.5000 -0.4970 -0.0409 -0.7081]        % columns are eigenvectors
  D = diag([0.0000, 0.2695, 1.1321, 1.7384])  % eigenvalues
  Notice that eigenvalue 0 has a constant eigenvector. The next eigenvalue, 0.2695, has an eigenvector whose signs clearly indicate the class memberships.

  9. Normalized Graph Cut for Image Segmentation
  (Figure: a cell image; the pairwise similarity is computed from pixel locations.)
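  Normalized cut (Shi and Malik) replaces the plain eigenproblem with the generalized one, (G − W) z = λ G z, which balances the cut cost against the size of the segments. A minimal sketch of the eigen step, given a pixel-similarity matrix W (thresholding by sign is a simplification; slide 10 uses Otsu’s method instead):

    G = diag(sum(W, 2));
    [V, D] = eig(G - W, G);    % generalized eigenproblem (G - W) z = lambda G z
    [~, idx] = sort(diag(D));
    z2 = V(:, idx(2));         % second smallest eigenvector: the bipartition indicator
    segment = z2 > 0;          % split pixels by the sign of z2

  For an image, W would be built from pixel similarities as on this slide, typically restricted to nearby pixel pairs so that W stays sparse.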

  10. NGC Example
  (a) A blood cell image. (b) Eigenvector corresponding to the second smallest eigenvalue. (c) Binary labeling via Otsu’s method. (d) Eigenvector corresponding to the third smallest eigenvalue. (e) Ternary labeling via K-means clustering.
  Demo: NCUT.m
