Multi-Relational Data Mining: An Introduction Joe Paulowskey
Overview • Introduction to Data Mining • Relational • Data • Patterns • Inductive Logic Programming (ILP) • Relational Association Rules • Relational Decision Trees • Relational Distance-Based Approaches
Relational Data • Relational Database • Multiple Tables • Defined • Views • Tables
Relational Pattern • Multiple Relations from a relational database • More Expressive • Opens up • Classification • Association • Regression
Relational Pattern (Cont.) • Expressed in Subsets of First Order Logic
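As a rough illustration of a pattern that spans more than one relation, the sketch below evaluates a clause of the form big_spender(C) :- customer(C, _), purchase(C, A), A > 500 over two tiny in-memory tables. The table names, columns, and the threshold are hypothetical and chosen only for this example.

```python
# Hypothetical tables: customer(id, income) and purchase(customer_id, amount).
customers = [(1, 40000), (2, 65000), (3, 30000)]
purchases = [(1, 120), (2, 900), (2, 50), (3, 700)]

def big_spender(customer_id):
    """True if the clause body can be satisfied for this customer:
    big_spender(C) :- customer(C, _), purchase(C, A), A > 500."""
    is_customer = any(cid == customer_id for cid, _ in customers)
    has_large_purchase = any(
        cid == customer_id and amount > 500 for cid, amount in purchases
    )
    return is_customer and has_large_purchase

# Customers for which the pattern holds.
matches = [cid for cid, _ in customers if big_spender(cid)]
print(matches)
```

The pattern cannot be written as a condition on a single row of either table, which is what makes it relational.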
Data Mining • Look for patterns in data • What do you discover? • Associations • Sequences • Classifications • Goals of Data Mining • Predict • Identify • Classify • Optimize • Uses • Business Data • Environmental/Traffic Engineering • Web Mining • Drug Design
Data Mining: Relational Databases • Most data mining approaches deal with a single table • Merging multiple tables into one single table is not safe: information can be lost or duplicated • Number of possible patterns increases • Explicit constraints on the patterns are required
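One way to see why a naive merge is risky is to join a one-row-per-customer table with a many-rows-per-customer table and recompute a simple statistic. The sketch below uses made-up tables and column names; it only shows how a join can duplicate rows and skew a count.

```python
# Hypothetical tables: one row per customer, many rows per purchase.
customers = [
    {"id": 1, "gender": "F"},
    {"id": 2, "gender": "M"},
]
purchases = [
    {"customer_id": 1, "item": "book"},
    {"customer_id": 1, "item": "pen"},
    {"customer_id": 1, "item": "lamp"},
    {"customer_id": 2, "item": "book"},
]

# Join on customer id: one output row per (customer, purchase) pair.
joined = [
    {**c, **p} for c in customers for p in purchases if c["id"] == p["customer_id"]
]

def share_female(rows):
    """Fraction of rows whose gender attribute is 'F'."""
    return sum(r["gender"] == "F" for r in rows) / len(rows)

print(share_female(customers))  # computed per customer
print(share_female(joined))     # computed per joined row
```

Customer 1 appears three times after the join, so a single-table algorithm run on the joined rows would treat one person as three examples.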
Inductive Logic Programming (ILP) • Logic Programs used to find patterns • Clauses • Head and Body • Literals • Types • Definite • Program
ILP (Cont) • Predicate • Relations in relational database • Arguments -> Attributes • Attributes are Typed • Database Clauses are typed program clauses • Deductive Database
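The clause vocabulary above (head, body, literals, predicates as relations) can be sketched with a very small evaluator. This is a minimal illustration, not a logic programming engine: the relation, the clause, and the convention that variables start with an uppercase letter are assumptions made for the example.

```python
# Database: each predicate name maps to the tuples of one relation.
facts = {
    "parent": {("ann", "bob"), ("bob", "cid"), ("ann", "dee")},
}

# Definite clause: exactly one head literal and a body of literals.
# grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
head = ("grandparent", ("X", "Z"))
body = [("parent", ("X", "Y")), ("parent", ("Y", "Z"))]

def is_variable(term):
    # Convention assumed here: variables start with an uppercase letter.
    return term[0].isupper()

def match(term, value, binding):
    """Match one argument against one column value, extending the binding."""
    if is_variable(term):
        # setdefault binds an unbound variable; a bound one must agree.
        return binding.setdefault(term, value) == value
    return term == value

def solve(literals, binding):
    """Yield every variable binding that satisfies all literals."""
    if not literals:
        yield binding
        return
    predicate, args = literals[0]
    for row in facts[predicate]:
        new_binding = dict(binding)
        if all(match(a, v, new_binding) for a, v in zip(args, row)):
            yield from solve(literals[1:], new_binding)

# Tuples of the head relation that the clause derives from the database.
derived = sorted({tuple(b[v] for v in head[1]) for b in solve(body, {})})
print(derived)
```

Each argument position of a predicate plays the role of an attribute (column) of the corresponding relation.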
Relational Rule Induction ILP • Learn logical definitions of relations • Classification • Rules can be found by decision trees • Simple Algorithm • Dealing with noisy/incomplete data
ILP Problems to Propositional Forms • Propositional • attribute-value • Use Single Table Data Mining algorithms • LINUS • Background Knowledge
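The idea of turning a relational problem into attribute-value form can be sketched as follows. This is only in the spirit of the transformation described above, not a reproduction of LINUS: the background relations, the target relation daughter(X, Y), and the chosen attributes are all hypothetical.

```python
# Hypothetical background knowledge as sets of tuples.
female = {"ann", "eve", "sue"}
parent = {("ann", "sue"), ("tom", "eve"), ("ann", "bob")}

# Training examples for a hypothetical target relation daughter(X, Y).
examples = [
    (("sue", "ann"), True),
    (("eve", "tom"), True),
    (("bob", "ann"), False),
    (("ann", "sue"), False),
]

def propositionalize(x, y):
    """Turn one relational example into a row of boolean attributes."""
    return {
        "female(X)": x in female,
        "female(Y)": y in female,
        "parent(X,Y)": (x, y) in parent,
        "parent(Y,X)": (y, x) in parent,
    }

# Single attribute-value table that a propositional learner could consume.
table = [dict(propositionalize(x, y), label=label) for (x, y), label in examples]
for row in table:
    print(row)
```

On this toy table the label coincides with female(X) and parent(Y,X) both being true, which is the kind of rule a single-table algorithm could then find.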
ILP/RDM Algorithms • Share • Learning as a Search Paradigm • Differences • Representation of Data, Patterns • Refinement operators • Testing Coverage • Upgrading from Propositional to Relational
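The shared "learning as search" view can be sketched with a refinement operator and a coverage test. For brevity the examples below are already reduced to boolean literals, so this is a simplification: a relational refinement operator would also introduce new variables and literals over other tables. All names and data are hypothetical.

```python
# Hypothetical examples: truth value of each literal, plus the class label.
examples = [
    ({"female(X)": True, "parent(Y,X)": True}, True),
    ({"female(X)": True, "parent(Y,X)": False}, False),
    ({"female(X)": False, "parent(Y,X)": True}, False),
]
candidate_literals = ["female(X)", "parent(Y,X)"]

def covers(body, attributes):
    """Coverage test: the body covers an example if every literal holds."""
    return all(attributes[lit] for lit in body)

def refine(body):
    """Refinement operator: specialize a clause by adding one literal."""
    return [body + [lit] for lit in candidate_literals if lit not in body]

def negatives_covered(body):
    return sum(covers(body, attrs) for attrs, label in examples if not label)

def positives_covered(body):
    return sum(covers(body, attrs) for attrs, label in examples if label)

# Greedy general-to-specific search starting from the empty body.
body = []
while negatives_covered(body) > 0:
    candidates = [b for b in refine(body) if positives_covered(b) > 0]
    if not candidates:
        break
    body = min(candidates, key=negatives_covered)
print(body)
```

The search starts with the most general clause, which covers everything, and specializes it until no negative example is covered.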
Relational Association Rules • Frequent Patterns • Determining Frequency • Itemsets • Association Rules • Obtained by frequent itemsets
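Frequency and rule confidence can be illustrated on plain itemsets, as in the sketch below. The transactions and the support threshold are made up; in a relational setting the itemset would be replaced by a query over several tables, which this example does not attempt.

```python
from itertools import combinations

# Hypothetical transactions (one set of items per customer).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
min_support = 0.5  # assumed threshold

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Enumerate itemsets of size 1 and 2 and keep the frequent ones.
items = sorted(set().union(*transactions))
frequent = [
    frozenset(c)
    for size in (1, 2)
    for c in combinations(items, size)
    if support(frozenset(c)) >= min_support
]

def confidence(antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

print(frequent)
print(confidence(frozenset({"milk"}), frozenset({"bread"})))
```

Association rules are then read off the frequent itemsets by splitting each one into an antecedent and a consequent and keeping the splits with high confidence.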
Relational Decision Trees • Used for Prediction • Binary Trees • First Order Decision List
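A binary tree whose internal nodes test queries over related tuples might look like the sketch below. The purchase relation, the tests, and the class labels are hypothetical, and the tree is written by hand rather than learned.

```python
# Hypothetical relation purchase(customer_id, amount), grouped by customer.
purchases = {1: [120, 30], 2: [900, 50], 3: [], 4: [700]}

# Each internal node holds a test on the example; leaves hold a prediction.
# The tests here are existential queries over the related purchase tuples.
tree = {
    "test": lambda cid: any(a > 500 for a in purchases[cid]),
    "yes": {"leaf": "big_spender"},
    "no": {
        "test": lambda cid: len(purchases[cid]) > 0,
        "yes": {"leaf": "regular"},
        "no": {"leaf": "inactive"},
    },
}

def predict(node, customer_id):
    """Follow yes/no branches until a leaf is reached."""
    while "leaf" not in node:
        node = node["yes"] if node["test"](customer_id) else node["no"]
    return node["leaf"]

predictions = {cid: predict(tree, cid) for cid in purchases}
print(predictions)
```

Reading the branches from top to bottom gives an ordered list of rules, where each rule applies only if the earlier ones did not.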
Relational Distance-Based Approaches • Calculate the distance between two objects • Statistical Approaches
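A distance between two objects that also takes their related tuples into account could be sketched as below. The choice of Jaccard distance for the related sets, the scaling constant, and the equal weighting are assumptions made for illustration, not a method taken from the slides.

```python
# Hypothetical objects: a numeric attribute plus a set of related items.
customers = {
    "a": {"age": 30, "items": {"book", "pen"}},
    "b": {"age": 40, "items": {"book", "lamp"}},
    "c": {"age": 30, "items": {"book", "pen"}},
}
AGE_RANGE = 50  # assumed range used to scale the age difference into [0, 1]

def set_distance(s, t):
    """Jaccard distance between two sets of related values."""
    if not s and not t:
        return 0.0
    return 1 - len(s & t) / len(s | t)

def distance(x, y):
    """Average of the per-attribute distances between two customers."""
    age_part = abs(customers[x]["age"] - customers[y]["age"]) / AGE_RANGE
    items_part = set_distance(customers[x]["items"], customers[y]["items"])
    return (age_part + items_part) / 2

print(distance("a", "b"))
print(distance("a", "c"))
```

With such a distance in place, standard distance-based methods such as nearest-neighbour prediction or clustering can be applied to relational objects.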