Random walks and spectral segmentation
Markus Herrgard, UCSD Bioengineering and Bioinformatics
CSE 291, Fall 2001
Marina Meila and Jianbo Shi: Learning Segmentation by Random Walks / A Random Walks View of Spectral Segmentation
Overview
• Introduction: Why random walks?
• Review of the Ncut algorithm
• Finite Markov chains
• Spectral properties of Markov chains
• Conductance of a Markov chain
• Block-stochastic matrices
• Application: Supervised segmentation
Introduction
• Why bother with mapping a segmentation problem to a random walk problem?
• It lets us utilize strong connections between:
  • Graph theory
  • Theory of stochastic processes
  • Matrix algebra
Applications of random walks
• Markov chain Monte Carlo:
  • Approximate high-dimensional integration, e.g. in Bayesian inference
  • How to sample efficiently from a complex distribution?
• Randomized algorithms:
  • Approximate counting in high-dimensional spaces
  • How to sample points efficiently inside a convex polytope?
Segmentation as graph partitioning
• Consider an image I with a similarity function Sij between all pairs of pixels i, j ∈ I
• Represent S as a graph G = (I, S):
  • Pixels are the nodes of the graph
  • Sij is the weight of the edge between nodes i and j
• Degree of node i: di = Σj Sij
• Volume of a set A ⊆ I: vol A = Σi∈A di
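To make the definitions concrete, here is a minimal numpy sketch with a made-up 4-pixel similarity matrix (the matrix and the set A are illustrative assumptions, not data from the talk):

```python
import numpy as np

# Hypothetical similarity matrix for 4 pixels: two tight pairs,
# weakly connected to each other (symmetric, entries in [0, 1]).
S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])

d = S.sum(axis=1)            # degree of each node: d_i = sum_j S_ij
A = np.array([0, 1])         # a candidate segment A (first two pixels)
vol_A = d[A].sum()           # volume of A: sum of degrees over A

print("degrees:", d)
print("vol A:", vol_A, " vol I:", d.sum())
```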
Simple example
[Figure: data with both distance and color cues, and the corresponding similarity matrix]
The normalized cut criterion
• A partition of G into A and its complement Ā is found by minimizing the normalized cut criterion:
  NCut(A, Ā) = cut(A, Ā)/vol A + cut(A, Ā)/vol Ā, where cut(A, Ā) = Σi∈A, j∈Ā Sij
• Produces more balanced partitions than the plain graph cut
• An approximate solution can be found through spectral methods
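The criterion translates directly into code; a sketch using the same toy S (cut, vol, and the partition mask are as defined above):

```python
import numpy as np

def ncut(S, in_A):
    """Normalized cut value for the partition (A, complement of A).

    S    : symmetric similarity matrix
    in_A : boolean mask selecting the nodes in A
    """
    d = S.sum(axis=1)
    cut = S[np.ix_(in_A, ~in_A)].sum()   # total weight crossing the cut
    return cut / d[in_A].sum() + cut / d[~in_A].sum()

S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
print(ncut(S, np.array([True, True, False, False])))   # balanced cut: small
print(ncut(S, np.array([True, False, False, False])))  # lone pixel: larger
```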
The normalized cut algorithm
• Define:
  • Diagonal matrix D with Dii = di
  • Laplacian of the graph G: L = D − S
• Solve the generalized eigenvalue problem: Lx = λDx
• Let xL be the eigenvector corresponding to the second smallest eigenvalue λL
• Split xL into two sets containing roughly equal values → graph partition
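A sketch of the algorithm in numpy/scipy; scipy.linalg.eigh handles the generalized symmetric eigenproblem, and the median split is one simple way to implement the "roughly equal values" step:

```python
import numpy as np
from scipy.linalg import eigh

S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])

D = np.diag(S.sum(axis=1))   # D_ii = d_i
L = D - S                    # graph Laplacian

# Generalized eigenproblem L x = lambda D x; eigh returns eigenvalues ascending.
lam, X = eigh(L, D)
x_L = X[:, 1]                # eigenvector of the 2nd smallest eigenvalue

# Split x_L around its median value -> two-way partition.
in_A = x_L > np.median(x_L)
print("eigenvalues:", np.round(lam, 3))
print("partition:", in_A.astype(int))   # expect the two tight pairs
```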
What does this actually mean?
• Spectral methods are easy to apply, but notoriously hard to understand intuitively
• Some questions:
  • Why does it work? (see Shi & Malik)
  • Why this particular eigenvector?
  • Why would xL be piecewise constant?
  • What if there are more than two segments?
  • What if xL is not piecewise constant? (see Kannan, Vempala & Vetta)
Interlude: Finite Markov chains
• Discrete-time, finite-state random process
• State of the system at time tn: xn
• Probability of being in state i at time tn: πi(n) = Pr(xn = i)
• Probability distribution over all states represented by the column vector π(n)
• Markov property: Pr(xn+1 = j | xn = i, xn−1, xn−2, …) = Pr(xn+1 = j | xn = i)
Transition matrix
• Transition matrix: Pij = Pr(xn+1 = j | xn = i)
• P is a (row) stochastic matrix:
  • Pij ≥ 0
  • Σj Pij = 1
• If the distribution at tn is π(n), the distribution at tn+1 is given by: π(n+1) = P^T π(n)
Example of a Markov chain
[Figure: three-state chain with states Work, Play, Sleep and transition arrows]
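The transition probabilities below are invented for illustration, since the numbers on the original figure are not recoverable; the sketch just propagates the distribution with π(n+1) = P^T π(n):

```python
import numpy as np

states = ["Work", "Play", "Sleep"]
# Hypothetical row-stochastic transition matrix (each row sums to 1).
P = np.array([[0.6, 0.2, 0.2],    # from Work
              [0.3, 0.4, 0.3],    # from Play
              [0.5, 0.1, 0.4]])   # from Sleep

pi = np.array([1.0, 0.0, 0.0])    # start in Work with certainty
for n in range(5):
    pi = P.T @ pi                 # pi(n+1) = P^T pi(n)
    print(f"t{n+1}:", np.round(pi, 3))
```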
Some terminology
• The stationary distribution π is given by: π = P^T π
• A Markov chain is reversible if the "detailed balance" condition holds: πi Pij = πj Pji
• A reversible finite Markov chain is called a random walk
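A sketch checking both properties numerically on the toy random walk P = D^-1 S: the stationary distribution comes out as the left eigenvector of P for eigenvalue 1 (equal to d / vol I here), and detailed balance holds because πi Pij = Sij / vol I is symmetric:

```python
import numpy as np

S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
d = S.sum(axis=1)
P = S / d[:, None]                       # P = D^-1 S, row stochastic

# Stationary distribution: left eigenvector of P with eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()

print("pi:", np.round(pi, 3))
print("d / vol I:", np.round(d / d.sum(), 3))   # same thing for P = D^-1 S
# Detailed balance pi_i P_ij = pi_j P_ji, so this chain is reversible:
print(np.allclose(pi[:, None] * P, (pi[:, None] * P).T))
```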
Spectra of stochastic matrices
• For reversible Markov chains the eigenvalues of P are real and the eigenvectors orthogonal (with respect to a π-weighted inner product)
• Spectral radius ρ(P) = 1 (i.e. |λ| ≤ 1 for all eigenvalues)
• The right (left) eigenvector corresponding to λ1 = 1 is x1 = 1 (x1 = π)
Back to Ncut
• How is Ncut related to random walks on graphs?
• Transform the similarity matrix S into a stochastic matrix: P = D^-1 S, i.e. Pij = Sij/di
• Pij is the probability of moving from pixel i to pixel j in the graph representation of the image in one step of a random walk
Relationship to random walks
• The generalized eigenvalue problem in Ncut can be rewritten in terms of P:
  Lx = λDx ⟺ (D − S)x = λDx ⟺ D^-1 Sx = (1 − λ)x ⟺ Px = (1 − λ)x
• How are the spectra related?
  • Same eigenvectors: x = xP
  • Eigenvalues: λ = 1 − λP
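The correspondence is easy to confirm numerically (a sketch on the toy matrix): the generalized eigenvalues of (L, D) equal one minus the eigenvalues of P:

```python
import numpy as np
from scipy.linalg import eigh

S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
d = S.sum(axis=1)
D, L = np.diag(d), np.diag(d) - S
P = S / d[:, None]

lam, X = eigh(L, D)                                 # L x = lambda D x, ascending
lam_P = np.sort(np.linalg.eigvals(P).real)[::-1]    # eigenvalues of P, descending

print("Ncut eigenvalues:    ", np.round(lam, 4))
print("1 - eigenvalues of P:", np.round(1 - lam_P, 4))   # identical
```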
Simple example
[Figure: similarity matrix S and the transition matrix P = D^-1 S for the toy data]
Eigenvalues and eigenvectors of P
[Figure: eigenvalue spectrum and leading eigenvectors of P for the toy example]
Why the second eigenvector?
• The smallest eigenvalue in Ncut corresponds to the largest eigenvalue λ1 = 1 of P
• The corresponding eigenvector x1 = 1 carries no information about the partitioning
• Hence the first informative eigenvector is the one corresponding to the second largest eigenvalue of P, i.e. the second smallest eigenvalue in Ncut
Conductance of a Markov chain
• Conductance of a set A: Φ(A) = (Σi∈A, j∈Ā πi Pij) / π(A)
• If we start from a random node in A (chosen according to π), this is the probability of moving out of A in one step
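In code the definition is essentially one line (a sketch; P and π as computed before):

```python
import numpy as np

def conductance(P, pi, in_A):
    """Probability of leaving A in one step, starting in A according to pi."""
    flow_out = (pi[in_A, None] * P[in_A][:, ~in_A]).sum()  # sum_{i in A, j not in A} pi_i P_ij
    return flow_out / pi[in_A].sum()

S = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.8],
              [0.0, 0.1, 0.8, 1.0]])
d = S.sum(axis=1)
P, pi = S / d[:, None], d / d.sum()

in_A = np.array([True, True, False, False])
print(conductance(P, pi, in_A))        # = cut(A, A-bar)/vol A = 0.2/4.0
```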
Conductance and the Ncut criterion
• Assume the random walk is started from its stationary distribution, which for P = D^-1 S is πi = di / vol I
• Using this and Pij = Sij/di we can write:
  Φ(A) = (Σi∈A, j∈Ā πi Pij) / π(A) = (Σi∈A, j∈Ā Sij) / vol A = cut(A, Ā) / vol A
Interpretation of the Ncut criterion
• Alternative representation of the Ncut criterion: NCut(A, Ā) = Φ(A) + Φ(Ā)
• Minimizing NCut is equivalent to:
  • minimizing the conductance between set A and its complement
  • minimizing the probability of moving between set A and its complement in either direction
Block-stochastic matrices
• Let Δ = (A1, A2, …, Ak) be a partition of I
• P is a block-stochastic matrix, or equivalently the Markov chain is aggregatable, iff the total transition probability from i into a block At depends only on the block containing i:
  Σj∈At Pij = Rst for all i ∈ As
Aggregation
• A Markov chain defined by P with state space i ∈ I can be aggregated to a Markov chain on the smaller state space {A1, …, Ak} with transition matrix R
• The k eigenvalues of R are the same as the k largest eigenvalues of P
• Aggregation can be performed as a linear transformation R = UPV
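A small numerical sketch of both directions (the 2-block R and the within-block weights are invented for illustration): build a block-stochastic P from R, aggregate it back by summing columns block-wise, and check that the eigenvalues of R appear among those of P:

```python
import numpy as np

R = np.array([[0.9, 0.1],          # aggregated transition matrix (2 blocks)
              [0.2, 0.8]])
blocks = [np.array([0, 1]), np.array([2, 3])]
gamma = [np.array([0.5, 0.5]), np.array([0.3, 0.7])]  # within-block weights, sum to 1

# Block-stochastic P: P_ij = R_st * gamma_j for i in block s, j in block t.
P = np.zeros((4, 4))
for s, Bs in enumerate(blocks):
    for t, Bt in enumerate(blocks):
        P[np.ix_(Bs, Bt)] = R[s, t] * gamma[t]

# Aggregate back: R_st = sum_{j in A_t} P_ij, the same for every i in A_s.
R_hat = np.array([[P[Bs[0]][Bt].sum() for Bt in blocks] for Bs in blocks])

print(np.allclose(R, R_hat))                             # True
print(np.round(np.sort(np.linalg.eigvals(P).real), 3))   # contains eig(R) = {1, 0.7}
print(np.round(np.sort(np.linalg.eigvals(R).real), 3))
```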
Aggregation example
[Figure: transition matrix P and the aggregated transition matrix R]
Why piecewise constant eigenvectors?
• If P is block-stochastic with k blocks, then its first k eigenvectors are piecewise constant (constant over each block)
• Ncut is therefore exact for block-stochastic matrices, not only for block-diagonal ones
• Ncut groups pixels by the similarity of their transition probabilities to subsets of I
Block-stochastic matrix example
[Figure: block-stochastic transition matrix P and a piecewise constant eigenvector x]
The modified Ncut algorithm
• Finds k segments in one pass
• Requires that the k eigenvalues of R are larger than the remaining n − k spurious eigenvalues of P
• Algorithm:
  • Compute the eigenvalues and eigenvectors of P
  • Select the eigenvectors corresponding to the k largest eigenvalues
  • Use k-means on these k eigenvectors to obtain the segmentation
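A sketch of the one-pass algorithm (assumes scikit-learn's KMeans; the three-cluster toy data and the Gaussian kernel are illustrative choices):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy data: three 1-D clusters; similarity = Gaussian kernel on distance.
x = np.concatenate([rng.normal(0, .1, 10), rng.normal(3, .1, 10), rng.normal(6, .1, 10)])
S = np.exp(-(x[:, None] - x[None, :])**2)
P = S / S.sum(axis=1, keepdims=True)

k = 3
w, V = np.linalg.eig(P)
order = np.argsort(-w.real)            # eigenvalues of P, largest first
X_k = V[:, order[:k]].real             # eigenvectors of the k largest eigenvalues

labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_k)
print(labels)                          # expect three contiguous groups
```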
Supervised image segmentation
• Training data:
  • Based on a human-segmented image, define target transition probabilities P*ij (e.g. uniform over the pixels in the same segment as i)
• Features:
  • Different criteria fijq, q = 1, …, Q, that measure the similarity between pixels i and j
Supervised segmentation criterion
• Model:
  • Parametrized similarity function: Sij(θ) = exp(Σq θq fijq)
• Optimization criterion:
  • Minimize the Kullback-Leibler divergence between the target transition matrix P* and P(θ) = D^-1 S(θ)
  • Corresponds to maximizing the cross-entropy: J(θ) = Σi πi* Σj P*ij log Pij(θ)
Supervised segmentation algorithm
• The criterion can be optimized by gradient ascent in θ:
  θq ← θq + η ∂J/∂θq, where ∂J/∂θq = Σi πi* Σj (P*ij − Pij(θ)) fijq
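A minimal sketch of the learning loop, with loudly labeled assumptions: the pairwise features are stacked in an array F of shape (n, n, Q), Sij(θ) = exp(Σq θq fijq) as on the previous slide, the weighting over starting pixels i is taken uniform rather than π*-weighted (a simplification), and the step size η is hand-picked:

```python
import numpy as np

def learn_theta(F, P_star, eta=0.5, iters=200):
    """Gradient ascent on the cross-entropy J(theta) -- a simplified sketch.

    F      : (n, n, Q) array of pairwise features f_ijq
    P_star : (n, n) target transition matrix from a human segmentation
    """
    n, _, Q = F.shape
    theta = np.zeros(Q)
    for _ in range(iters):
        S = np.exp(F @ theta)                  # S_ij = exp(sum_q theta_q f_ijq)
        P = S / S.sum(axis=1, keepdims=True)   # P = D^-1 S
        # dJ/dtheta_q = sum_ij (P*_ij - P_ij) f_ijq (uniform weights over i,
        # a simplification of the pi*-weighted criterion on the slide)
        grad = np.einsum('ij,ijq->q', P_star - P, F)
        theta += eta * grad / n
    return theta

# Tiny synthetic check: 4 pixels, one feature = "same half of the image".
same = np.array([[1, 1, 0, 0]] * 2 + [[0, 0, 1, 1]] * 2, float)
F = same[:, :, None]
P_star = same / same.sum(axis=1, keepdims=True)   # uniform within each segment
print(np.round(learn_theta(F, P_star), 2))        # positive weight on the cue
```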
Toy example
[Figure: toy data with two cues: distance and "color" (or intensity)]
• Training segmentation 1 (by distance): θ1 = −1.19, θ2 = 1.04
• Training segmentation 2 (by color): θ1 = −0.19, θ2 = −4.55
Toy example results
[Figure: test data segmented using training segmentation 1 (by distance) and training segmentation 2 (by color)]
Application: real image segmentation
• Cues:
  • Intervening contour
  • Edge flow
Training
[Figure: training results on a real image]
Testing
[Figure: test results on a real image]
Conclusions I
• The random walks perspective provides new insights into the Ncut algorithm:
  • Relating the Ncut algorithm to the spectral properties of random walks
  • Interpreting the Ncut criterion in terms of the conductance of a random walk
  • Proving that Ncut is exact for block-stochastic matrices
Conclusions II
• Is any of this useful in practice?
  • Supervised segmentation method
  • Comparing different spectral clustering methods in terms of the underlying random walks
  • Choosing the kernel to allow for effective clustering (approximately block-stochastic)
  • New clustering criteria, e.g. bipartite clustering
References
• Kemeny JG, Snell JL: Finite Markov Chains. Springer, 1976.
• Stewart WJ: Introduction to the Numerical Solution of Markov Chains. Princeton University Press, 1994.
• Lovász L: Random Walks on Graphs: A Survey.
• Jerrum M, Sinclair A: The Markov Chain Monte Carlo Method: An Approach to Approximate Counting and Integration.