Classification by Association Rules: Use Minimum Set of Rules. Jianyu Yang December 10, 2003. Classification System. Problem: (A, B, C) => y | n ? Decision tree learning, etc. Association rules: X => c X : antecedent, c : consequent Support & Confidence Algorithms: Apriori.
Classification by Association Rules: Use Minimum Set of Rules Jianyu Yang December 10, 2003
Classification System • Problem: (A, B, C) => y | n ? • Decision tree learning, etc. • Association rules: X => c • X: antecedent, c : consequent • Support & Confidence • Algorithms: Apriori
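As a rough illustration of the support and confidence measures named on this slide, the sketch below computes both for a single rule X => c. The toy dataset, the `support_confidence` helper, and the representation of instances as (item set, class label) pairs are assumptions made for this example, not part of the original slides.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical representation: each training instance is (set of items, class label).
Instance = Tuple[FrozenSet[str], str]


def support_confidence(
    data: List[Instance], antecedent: FrozenSet[str], label: str
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => label."""
    # Class labels of all instances whose items contain the antecedent X.
    covered = [cls for items, cls in data if antecedent <= items]
    # Instances that match X and also carry the consequent class c.
    hits = sum(1 for cls in covered if cls == label)
    support = hits / len(data) if data else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Toy data, invented for illustration only.
toy_data: List[Instance] = [
    (frozenset({"A", "B"}), "y"),
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "C"}), "n"),
    (frozenset({"B"}), "n"),
]

sup, conf = support_confidence(toy_data, frozenset({"A", "B"}), "y")
print(sup, conf)  # 0.5 1.0
```

Here the rule (A, B) => y is matched by two of the four instances, both labelled y, so its support is 0.5 and its confidence is 1.0.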
Association Rules: Issues • Too many rules • Inefficient • Overfitting • Order of rule application matters • Example: (A, B) => y, (C) => n • Minimum Support (minsup) • Minimum Confidence (minconf)
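To make the ordering issue concrete, the sketch below applies the two example rules to an instance containing A, B and C under a first-match policy. The first-match policy, the `classify` helper, and the default class are assumptions for this illustration; the slides do not state how rules are applied.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent, class label)


def classify(rules: List[Rule], items: FrozenSet[str], default: str) -> str:
    """Return the label of the first rule whose antecedent is contained in items."""
    for antecedent, label in rules:
        if antecedent <= items:
            return label
    return default  # no rule fired


r1: Rule = (frozenset({"A", "B"}), "y")
r2: Rule = (frozenset({"C"}), "n")
instance = frozenset({"A", "B", "C"})

first = classify([r1, r2], instance, default="n")
second = classify([r2, r1], instance, default="n")
print(first, second)  # y n
```

Both rules match the instance, so the predicted class depends only on which rule is tried first. This is why a total order over the rules is needed.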
Ideas • No redundant rules • (A, B) => y, (A, B, C) => y • Total order of rules • "Occam's razor": favor general rules • Pre-pruning • (A, B) => y, (A, B, D) => ?
MSR Algorithm
L1 = {large 1-ruleitems};
CAR1 = genRules(L1);
pruneSet(L1);
for (k = 2; Lk-1 ≠ ∅; k++) do begin
    Ck = apriori-gen(Lk-1);
    forall training instances t ∈ D do begin
        Ct = subset(Ck, t);
        forall candidates c ∈ Ct do
            ci.count++ for class label i;
    end
    Lk = {c ∈ Ck | ci.count ≥ minsup for any class i};
    CARk = genRules(Lk);
    pruneSet(Lk);
end
CARs = ∪k CARk;
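The "no redundant rules" idea can be sketched as a simple filter: a rule is dropped when a more general rule with the same class has already been kept. This is only one possible reading. The slides do not define `pruneSet`, so the `prune_redundant` helper below is hypothetical, and it ignores confidence and support entirely.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]  # (antecedent, class label)


def prune_redundant(rules: List[Rule]) -> List[Rule]:
    """Keep a rule only if no kept rule with the same class is at least as general."""
    kept: List[Rule] = []
    # Visit shorter (more general) antecedents first, in the spirit of Occam's razor.
    for antecedent, label in sorted(rules, key=lambda r: len(r[0])):
        is_redundant = any(
            kept_ante <= antecedent and kept_label == label
            for kept_ante, kept_label in kept
        )
        if not is_redundant:
            kept.append((antecedent, label))
    return kept


rules: List[Rule] = [
    (frozenset({"A", "B", "C"}), "y"),
    (frozenset({"A", "B"}), "y"),
    (frozenset({"C"}), "n"),
]
pruned = prune_redundant(rules)
print(pruned)
```

With these inputs, (A, B, C) => y is discarded because (A, B) => y already covers every instance it matches and predicts the same class, while (C) => n is kept.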
Conclusions • A new algorithm was designed to build a classification system using a minimum set of association rules. • In general, low minsup and high minconf produce low error rates. • Experiments on 26 benchmark datasets showed lower error rates than C4.5 (R8) on 17 datasets and than CBA (v2.0) on 16. • The new algorithm does not always produce lower error rates than other algorithms.