Clustering Algorithms Meta Applier (CAMA) Toolbox

Presentation Transcript


  1. Clustering Algorithms Meta Applier (CAMA) Toolbox. Dmitry S. Shalymov, Kirill S. Skrygan, Dmitry A. Lyubimov

  2. Clustering
     • Goals
       • To detect the underlying structure in data
       • To reduce the size of the data set
       • To extract unique objects
     • Usage
       • Data mining
       • Machine learning
       • Financial mathematics
       • Optimization
       • Statistics
       • Pattern recognition
       • Control strategies development
     SYRCoSE’09

  3. Clustering Problem. Clustering and Classification

  4. Variety of Clustering Algorithms
     • Hierarchical
       • Agglomerative
     • Partitioning
       • Iterative
         • Hard (K-means, SVM, SPSA)
         • Fuzzy (FCM)
     Important parameters
     • Distance norm
     • Number of clusters
     • Initial values of cluster centers
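
To make the three parameters on this slide concrete, here is a minimal pure-Python sketch of a hard partitioning algorithm (plain K-means). It is not taken from the CAMA code; the function names `minkowski` and `k_means` and the toy points are assumptions for illustration. The distance norm is the exponent `p`, the number of clusters is the length of `initial_centers`, and the initial values of the cluster centers are passed in explicitly.

```python
def minkowski(a, b, p=2.0):
    """Distance norm: p=2 gives Euclidean distance, p=1 gives Manhattan distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def k_means(points, initial_centers, p=2.0, max_iter=100):
    """Hard partitioning: every point is assigned to exactly one cluster.

    The number of clusters is len(initial_centers). Centers are updated with
    the arithmetic mean, which is the exact minimizer only for p=2.
    """
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the chosen norm.
        new_labels = [
            min(range(len(centers)), key=lambda k: minkowski(pt, centers[k], p))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, labels = k_means(points, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, labels)
```

Changing `initial_centers` or `p` on less cleanly separated data can change the resulting partition, which is why the slide singles these parameters out.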

  5. Cluster Stability Algorithms
     • Indexes
     • Stability (similarity, merit) functions
     • Probabilistic measures assessing the likelihood of a decision
     • Density estimation approaches

  6. Stochastic Approximation
     • Recursive stochastic approximation
     • FDSA (finite-difference stochastic approximation)
     • SPSA (simultaneous perturbation stochastic approximation)
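
The slide's formulas did not survive in the transcript, so the following is only a generic sketch of how the two gradient estimators differ, not the authors' implementation. FDSA perturbs one coordinate at a time and needs two function evaluations per dimension; SPSA perturbs all coordinates at once with a random ±1 vector and needs two evaluations in total. The function names, the test function `quadratic`, and the gain constants `a` and `c` are assumptions; the decay exponents 0.602 and 0.101 are commonly recommended values for SPSA gain sequences.

```python
import random


def fdsa_gradient(f, theta, c):
    """Finite-difference estimate: 2 * len(theta) evaluations of f."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += c
        minus[i] -= c
        grad.append((f(plus) - f(minus)) / (2.0 * c))
    return grad


def spsa_gradient(f, theta, c, rng):
    """Simultaneous-perturbation estimate: 2 evaluations of f in any dimension."""
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]  # random +/-1 directions
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = f(plus) - f(minus)
    return [diff / (2.0 * c * d) for d in delta]


def spsa_minimize(f, theta, iterations=2000, a=0.1, c=0.1, seed=0):
    """Recursive scheme theta <- theta - a_k * g_k with decaying gains."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602  # step-size gain
        c_k = c / k ** 0.101  # perturbation size
        g = spsa_gradient(f, theta, c_k, rng)
        theta = [t - a_k * gi for t, gi in zip(theta, g)]
    return theta


def quadratic(x):
    """Hypothetical test function with its minimum at (3, -1)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


print(fdsa_gradient(quadratic, [0.0, 0.0], 1e-3))
print(spsa_minimize(quadratic, [0.0, 0.0]))
```

The per-iteration cost of SPSA does not grow with the number of parameters, which is the usual argument for it when each evaluation of `f` is expensive.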

  8. Effectiveness of SPSA

  9. Finding the number of clusters in data set
     • Run the SPSA algorithm for different numbers of clusters, K, and calculate the corresponding distortions
     • Select a transformation power, Y
     • Calculate the “jumps” in transformed distortion
     • Estimate the number of clusters in the data set by (formula not preserved in the transcript)
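
The steps above can be sketched as follows. Because the slide's final formula is missing, this sketch assumes the usual form of the jump method: each distortion d_K is raised to the power -Y, the jump for K is the difference between consecutive transformed values (with the value for K = 0 taken as 0), and the estimate is the K with the largest jump. The function name `estimate_num_clusters` and the distortion values are made up for illustration; in the workflow described on the slide, the distortions would come from running the clustering algorithm once per K.

```python
def estimate_num_clusters(distortions, power):
    """Return (estimated K, list of jumps).

    distortions[i] is the distortion obtained with K = i + 1 clusters.
    Assumption: the estimate is the K that maximizes the jump.
    """
    # Transformed distortion d_K ** (-Y); the entry for K = 0 is defined as 0.
    transformed = [0.0] + [d ** (-power) for d in distortions]
    # jumps[K - 1] = transformed[K] - transformed[K - 1] for K = 1..K_max
    jumps = [transformed[k] - transformed[k - 1] for k in range(1, len(transformed))]
    best_k = max(range(1, len(jumps) + 1), key=lambda k: jumps[k - 1])
    return best_k, jumps


# Hypothetical distortions for K = 1..5: the sharp drop at K = 3 should stand out.
distortions = [10.0, 5.0, 1.0, 0.9, 0.8]
best_k, jumps = estimate_num_clusters(distortions, power=1.0)
print(best_k, jumps)
```

The transformation power Y controls how strongly the drop in distortion is amplified, so it is worth trying more than one value on real data.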

  10. Structure of data set detection

  11. Examples
     • Iris (3 clusters, 4 features, 150 instances)
     • Wine (3 clusters, 13 features, 178 instances)
     • Breast Cancer (2 clusters, 32 features, 569 instances)
     • Image Segmentation (7 clusters, 19 features, 2310 instances)

  12. Software Tools for Clustering Analysis
     • Research
       • COMPACT
       • DCPR (Data Clustering & Pattern Recognition)
       • FCDA (Fuzzy Clustering and Data Analysis Toolbox)
       • ClusterPack Matlab Toolbox
       • The Curve Clustering Toolbox
       • SOM (Self-Organizing Map)
       • Spectral Clustering Toolbox
       • Yashil's FCM Clustering
     • Licensed software
       • SPSS
       • STATISTICA
     • Characteristics
       • Visualization
       • Effectiveness analysis with patterns
       • Tools to check performance
     • Shortcomings
       • Limited number of data sets and algorithms
       • No possibility of loading one's own algorithm
       • No online services
       • MATLAB

  13. Clustering Algorithms Meta Applier

  14. Clustering Algorithms Meta Applier

  15. CAMA. Kernel

  16. CAMA. Kernel

  17. CAMA Toolbox http://ancient.punklan.net:8084/CAMA2/index.jsp

  18. CAMA Toolbox

  19. CAMA Toolbox

  20. Thank you!
