1 / 86

An Introduction to WEKA

An Introduction to WEKA. SEEM 4630 10-2012. Content. What is WEKA? The Explorer: Preprocess data Classification Clustering Association Rules Attribute Selection Data Visualization References and Resources. What is WEKA?. W aikato E nvironment for K nowledge A nalysis

laddie
Download Presentation

An Introduction to WEKA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. An Introduction to WEKA SEEM 4630 10-2012

  2. Content • What is WEKA? • The Explorer: • Preprocess data • Classification • Clustering • Association Rules • Attribute Selection • Data Visualization • References and Resources

  3. What is WEKA? • Waikato Environment for Knowledge Analysis • It’s a data mining/machine learning tool developed by Department of Computer Science, University of Waikato, New Zealand. • Weka is also a bird found only on the islands of New Zealand.

  4. Download and Install WEKA • Website: http://www.cs.waikato.ac.nz/~ml/weka/index.html • Support multiple platforms (written in java): • Windows, Mac OS X and Linux • WEKA has been installed in the teaching labs in SEEM

  5. Main Features • 49 data preprocessing tools • 76 classification/regression algorithms • 8 clustering algorithms • 3 algorithms for finding association rules • 15 attribute/subset evaluators + 10 search algorithms for feature selection

  6. Main GUI • Three graphical user interfaces • “The Explorer” (exploratory data analysis) • “The Experimenter” (experimental environment) • “The KnowledgeFlow” (new process model inspired interface)

  7. Content • What is WEKA? • The Explorer: • Preprocess data • Classification • Clustering • Association Rules • Attribute Selection • Data Visualization • References and Resources

  8. Explorer: pre-processing the data • Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary • Data can also be read from a URL or from an SQL database (using JDBC) • Pre-processing tools in WEKA are called “filters” • WEKA contains filters for: • Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …

  9. @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ... WEKA only deals with “flat” files Flat file in ARFF (Attribute-Relation File Format) format 8/29/2014 9

  10. WEKA only deals with “flat” files @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ... numeric attribute nominal attribute

  11. University of Waikato

  12. University of Waikato

  13. University of Waikato

  14. University of Waikato

  15. University of Waikato

  16. University of Waikato

  17. University of Waikato

  18. University of Waikato

  19. University of Waikato

  20. University of Waikato

  21. University of Waikato

  22. University of Waikato

  23. University of Waikato

  24. University of Waikato

  25. University of Waikato

  26. University of Waikato

  27. University of Waikato

  28. University of Waikato

  29. University of Waikato

  30. University of Waikato

  31. University of Waikato

  32. Explorer: building “classifiers” • Classifiers in WEKA are models for predicting nominal or numeric quantities • Implemented learning schemes include: • Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …

  33. Decision Tree Induction: Training Dataset This follows an example of Quinlan’s ID3 (Playing Tennis)

  34. age? <=30 overcast >40 31..40 student? credit rating? yes excellent fair no yes no yes no yes Output: A Decision Tree for “buys_computer”

  35. University of Waikato

  36. University of Waikato

  37. University of Waikato

  38. University of Waikato

  39. University of Waikato

  40. University of Waikato

  41. University of Waikato

  42. University of Waikato

  43. University of Waikato

  44. University of Waikato

  45. University of Waikato

  46. University of Waikato

  47. University of Waikato

  48. University of Waikato

  49. University of Waikato

  50. University of Waikato

More Related