
Pattern Recognition: Statistical and Neural

Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 28, Nov 9, 2005. Lecture 28 Topics: 1. Review Clustering Methods; 2. Agglomerative Hierarchical Clustering Example; 3. Introduction to Fuzzy Sets; 4. Fuzzy Partitions.


Presentation Transcript


  1. Nanjing University of Science & Technology. Pattern Recognition: Statistical and Neural. Lonnie C. Ludeman. Lecture 28, Nov 9, 2005

  2. Lecture 28 Topics • 1. Review Clustering Methods • 2. Agglomerative Hierarchical Clustering Example • 3. Introduction to Fuzzy Sets • 4. Fuzzy Partitions • 5. Define Hard and Soft Clusters

  3. K-Means Clustering Algorithm: Basic Procedure. 1. Randomly select K cluster centers from the pattern space. 2. Distribute the set of patterns to the cluster centers using minimum distance. 3. Compute new cluster centers for each cluster. 4. Continue this process until the cluster centers do not change or a maximum number of iterations is reached. Review
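The four steps above can be sketched directly in code. This is a minimal sketch, not the lecture's implementation; the data set and K below are illustrative assumptions, since the slide does not fix them.

```python
import math
import random

def kmeans(patterns, k, max_iter=100):
    # Step 1: randomly select K cluster centers from the pattern space
    centers = random.sample(patterns, k)
    for _ in range(max_iter):
        # Step 2: distribute patterns to the nearest center (minimum distance)
        clusters = [[] for _ in range(k)]
        for x in patterns:
            nearest = min(range(k), key=lambda i: math.dist(x, centers[i]))
            clusters[nearest].append(x)
        # Step 3: compute new cluster centers (mean of each cluster)
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        # Step 4: stop when the centers no longer change
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

random.seed(0)  # for a reproducible run
data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
centers, clusters = kmeans(data, 2)
print(clusters)  # the two tight groups end up in separate clusters
```

Note that the convergence test compares exact center values; in practice a tolerance or the iteration cap (the slide's "maximum number of iterations") guards against cycling.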

  4. Flow Diagram for K-Means Algorithm Review

  5. Iterative Self-Organizing Data Analysis Technique A (ISODATA) Algorithm. Performs clustering of unclassified quantitative data with an unknown number of clusters. Similar to K-Means, but with the ability to merge and split clusters, giving flexibility in the number of clusters. Review

  6. Hierarchical Clustering Dendrogram Review

  7. Example - Hierarchical Clustering. Given the following data: (a) perform a hierarchical clustering of the data; (b) give the results for 3 clusters.

  8. Solution: Plot of data vectors

  9. Calculate distances between each pair of original data vectors. Data samples x5 and x7 are the closest together, so we combine them: S5(10) ∪ S7(10) = S5(9), giving the following 9 clusters.
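This first merge step can be sketched as a closest-pair search. The numeric coordinates below are assumptions (the slide shows the data only as a plot), chosen so that x5 and x7 come out closest:

```python
import math
from itertools import combinations

# Assumed positions for a few of the samples; the lecture's actual
# values are shown only graphically.
points = {5: (6.0, 6.0), 7: (6.1, 6.1), 8: (7.0, 5.0), 9: (7.1, 5.2)}

# Distance between every pair of samples
dist = {(i, j): math.dist(points[i], points[j])
        for i, j in combinations(sorted(points), 2)}

closest = min(dist, key=dist.get)
print(closest)  # (5, 7): with these assumed coordinates, merged first
```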

  10. Combine closest clusters: S5(10) ∪ S7(10) = S5(9)

  11. Compute distances between the new clusters. Clusters S8(9) and S9(9) are the closest together, so we combine them to give the following 8 clusters.

  12. Combine closest clusters: S8(9) ∪ S9(9) = S8(8)

  13. Compute distances between the new clusters. Clusters S5(8) and S8(8) are the closest together, so we combine them to give the following 7 clusters.

  14. Combine closest clusters: S5(8) ∪ S8(8) = S5(7)

  15. Compute distances between the new clusters. Clusters S1(7) and S4(7) are the closest together, so we combine them to give the following 6 clusters.

  16. Combine closest clusters: S1(7) ∪ S4(7) = S1(6)

  17. Continuing this process we see the following combinations of clusters at the given levels

  18. Level 5

  19. Level 4

  20. Level 3

  21. Level 2

  22. Level 1

  23. Dendrogram for Given Example

  24. (b) Using the dendrogram, determine the results for just three clusters. From the dendrogram at level 3 we see the following clusters: S5(3) = {5, 7, 8, 9, 10}, S2(3) = {2, 6}, S1(3) = {1, 4, 3}. Answer: Cl1 = {x5, x7, x8, x9, x10}, Cl2 = {x2, x6}, Cl3 = {x1, x4, x3}.
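The whole agglomerative procedure of slides 9-24 can be sketched as a loop that repeatedly merges the two closest clusters until 3 remain. Single linkage (closest-member distance) is assumed here, and the coordinates are invented to reproduce the slide's final grouping; with the true plotted data the intermediate merge order would match the slides exactly.

```python
import math
from itertools import combinations

# Hypothetical positions for x1..x10, chosen so the 3-cluster cut
# matches the lecture's answer.
points = {1: (1.0, 1.0), 2: (4.0, 7.0), 3: (1.5, 2.5), 4: (1.2, 1.15),
          5: (6.0, 6.0), 6: (4.3, 7.2), 7: (6.1, 6.1), 8: (7.0, 5.0),
          9: (7.1, 5.2), 10: (6.6, 5.4)}

def single_linkage(a, b):
    # Cluster-to-cluster distance = distance between their closest members
    return min(math.dist(points[i], points[j]) for i in a for j in b)

clusters = [{i} for i in points]  # level 10: every sample is its own cluster
while len(clusters) > 3:
    # Find and merge the two closest clusters
    a, b = min(combinations(clusters, 2), key=lambda p: single_linkage(*p))
    clusters.remove(a)
    clusters.remove(b)
    clusters.append(a | b)

print([sorted(c) for c in clusters])
# with these assumed points: {1, 3, 4}, {2, 6}, {5, 7, 8, 9, 10}
```

Cutting the loop at a different threshold (while len(clusters) > L) gives the clusters at any dendrogram level L.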

  25. Introduction to Fuzzy Clustering. K-means, hierarchical, and ISODATA clustering algorithms are what we call "hard clustering": the assignment of clusters is a partitioning of the data into mutually disjoint, exhaustive, non-empty sets. Fuzzy clustering is a relaxation of this property and provides another way of solving the clustering problem. Before we present the fuzzy clustering algorithm, we first lay a background by defining fuzzy sets.

  26. Given a set S composed of pattern vectors as follows: S = {x1, x2, ..., xN}. A proper subset of S is any nonempty collection of pattern vectors. Examples follow: A = {x1, x2}, B = {x2, x4, xN}, C = {x4}, D = {x1, x3, x5, xN-1}

  27. Given the following set S = {x1, x2, ... , xk, ... , xN}, we can also specify subsets by using the characteristic function, defined on the set S as follows for the subset A: µA(xk) = 1 if xk is in the subset A; µA(xk) = 0 if xk is not in the subset A. Characteristic function for subset A: µA(.) = [ µA(x1), µA(x2), ... , µA(xk), ... , µA(xN) ]

  28.-31. Examples
  A = {x1, x2}  µA(xk): [ 1, 1, 0, 0, 0, ... , 0 ]
  B = {x2, x4, xN}  µB(xk): [ 0, 1, 0, 1, 0, ... , 1 ]
  C = {x4}  µC(xk): [ 0, 0, 0, 1, 0, ... , 0 ]
  D = {x1, x3, x5, xN-1}  µD(xk): [ 1, 0, 1, 0, 1, ... , 1, 0 ]
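These characteristic vectors can be computed mechanically from the subset definitions. A minimal sketch, taking N = 8 purely for illustration (the slides leave N symbolic):

```python
def characteristic(subset, universe):
    # mu_A(x) = 1 if x is in A, else 0, listed in the order of the universe
    return [1 if x in subset else 0 for x in universe]

N = 8  # illustrative choice; the lecture keeps N general
S = [f"x{k}" for k in range(1, N + 1)]
A = {"x1", "x2"}
B = {"x2", "x4", f"x{N}"}

print(characteristic(A, S))  # [1, 1, 0, 0, 0, 0, 0, 0]
print(characteristic(B, S))  # [0, 1, 0, 1, 0, 0, 0, 1]
```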

  32. Partitions of a set S. Given a set S of NS n-dimensional pattern vectors, S = { xj ; j = 1, 2, ... , NS }, a partition of S is a set of M subsets of S, Sk, k = 1, 2, ... , M, that satisfy the following conditions. Note: clusters can thus be specified as a partition of the pattern space S.

  33. Properties of subsets of a partition (Φ is the null set):
  1. Sk ≠ Φ (not empty)
  2. Sk ∩ Sj = Φ for k ≠ j (pairwise disjoint)
  3. S1 ∪ S2 ∪ ... ∪ SM = S (exhaustive)

  34. Partition in terms of characteristic functions:
  µS1: [ µS1(x1), µS1(x2), ... , µS1(xk), ... , µS1(xN) ]
  µS2: [ µS2(x1), µS2(x2), ... , µS2(xk), ... , µS2(xN) ]
  ...
  µSM: [ µSM(x1), µSM(x2), ... , µSM(xk), ... , µSM(xN) ]
  where µSj(xk) = 0 or 1 for all k and j, and the sum of each column is 1.

  35. Hard Partition (clusters Cl1, Cl2, Cl3 over x1 x2 x3 x4 x5 x6 x7):
  Cl1: [ 1 0 1 0 0 0 0 ]
  Cl2: [ 0 1 0 0 0 0 0 ]
  Cl3: [ 0 0 0 1 1 1 1 ]
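The partition conditions of slides 33-34 can be checked directly against this example. A small sketch that verifies the three characteristic vectors form a hard partition (all entries 0/1, each column summing to 1, no empty cluster):

```python
# Characteristic vectors from slide 35
rows = {
    "Cl1": [1, 0, 1, 0, 0, 0, 0],
    "Cl2": [0, 1, 0, 0, 0, 0, 0],
    "Cl3": [0, 0, 0, 1, 1, 1, 1],
}

def is_hard_partition(rows):
    vectors = list(rows.values())
    # Every membership value is 0 or 1
    binary = all(v in (0, 1) for vec in vectors for v in vec)
    # Each pattern belongs to exactly one cluster (columns sum to 1)
    exhaustive_disjoint = all(sum(col) == 1 for col in zip(*vectors))
    # No cluster is empty
    nonempty = all(sum(vec) > 0 for vec in vectors)
    return binary and exhaustive_disjoint and nonempty

print(is_hard_partition(rows))  # True
```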

  36. A fuzzy set can be defined by extending the concept of the characteristic function to allow values between and including 0 and 1, as follows. Given a set S = { x1, x2, ... , xN }, a fuzzy subset F of S is defined by its membership function F: [ µF(x1), µF(x2), ... , µF(xk), ... , µF(xN) ], where xk is from S and 0 ≤ µF(xk) ≤ 1.

  37. Membership function defined on S

  38. Example: Define a Fuzzy set A by the following membership function Or equivalently

  39. Fuzzy Partition. A fuzzy partition F of a set S is defined by the membership functions of the fuzzy sets Fk, k = 1, 2, ... , K: Fk: [ µFk(x1), µFk(x2), ... , µFk(xN) ]

  40. where: each value is bounded by 0 and 1 (0 ≤ µFk(xj) ≤ 1); the values in each column sum to 1 (each pattern's memberships across all fuzzy sets total 1); and each row sum is less than N (no fuzzy set absorbs all the membership, and none is empty).
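These three conditions translate directly into a validity check. The membership values below are illustrative assumptions, not from the lecture:

```python
# Two fuzzy sets over six patterns; each column sums to 1
memberships = {
    "F1": [0.9, 0.8, 0.7, 0.2, 0.1, 0.0],
    "F2": [0.1, 0.2, 0.3, 0.8, 0.9, 1.0],
}

def is_fuzzy_partition(memberships):
    rows = list(memberships.values())
    n = len(rows[0])
    # Each value bounded by 0 and 1
    bounded = all(0.0 <= v <= 1.0 for row in rows for v in row)
    # Each column's values sum to 1 (up to float tolerance)
    columns = all(abs(sum(col) - 1.0) < 1e-9 for col in zip(*rows))
    # Each row sum strictly between 0 and N
    row_sums = all(0.0 < sum(row) < n for row in rows)
    return bounded and columns and row_sums

print(is_fuzzy_partition(memberships))  # True
```

Setting every value to exactly 0 or 1 recovers the hard-partition conditions of slide 34 as a special case.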

  41. "Hard" or Crisp Clusters (plot of clusters P1 and P2 over x1 ... x6). Set descriptions: P1 = {x1, x2, x3}, P2 = {x4, x5, x6}

  42. "Soft" or Fuzzy Clusters (plot of the same pattern vectors x1 ... x6 with overlapping membership functions)

  43. Hard or Crisp Partition vs. Soft or Fuzzy Partition

  44. Membership Functions for Fuzzy Clusters: membership function for F1 and membership function for F2, each plotted over the domain of pattern vectors.

  45. Note: Sum of columns = 1
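One common way to produce soft memberships whose columns sum to 1 like this is the fuzzy c-means membership rule, where a pattern's membership in a cluster falls off with its distance to that cluster's center. This rule is not stated on these slides (it anticipates the fuzzy clustering algorithm the lecture is leading up to), and the centers and point below are assumptions:

```python
import math

def soft_memberships(x, centers, m=2.0):
    # Fuzzy c-means membership rule with fuzzifier m (m -> 1 hardens the
    # assignment); the epsilon guards against a zero distance.
    d = [max(math.dist(x, c), 1e-12) for c in centers]
    return [
        1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1)) for j in range(len(centers)))
        for i in range(len(centers))
    ]

centers = [(0.0, 0.0), (4.0, 0.0)]   # assumed cluster centers
u = soft_memberships((1.0, 0.0), centers)
print(u)  # [0.9, 0.1]: memberships sum to 1, closer center dominates
```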

  46. “Fuzzy” Kitten

  47. Lecture 28 Topics • 1. Reviewed Clustering Methods • 2. Gave an Example using Agglomerative Hierarchical Clustering • 3. Introduced Fuzzy Sets • 4. Described Crisp and Fuzzy Partitions • 5. Defined Hard and Soft Clusters

  48. End of Lecture 28
