Efficient Mining of High Utility Itemsets from Large Datasets

Alva Erwin, Raj P. Gopalan, and N.R. Achuthan. Department of Computing and Department of Mathematics and Statistics, Curtin University of Technology, Kent St., Bentley, Western Australia. PAKDD08.


Presentation Transcript


  1. Efficient Mining of High Utility Itemsets from Large Datasets • Alva Erwin, Raj P. Gopalan, and N.R. Achuthan • Department of Computing and Department of Mathematics and Statistics, Curtin University of Technology, Kent St., Bentley, Western Australia • PAKDD08

  2. Outline • Introduction • Definition • Method: Compressed Transaction Utility-Pro (CTU-Pro) • Experiments • Conclusions

  3. Introduction • Frequent itemset mining finds items that co-occur in a transaction database above a user-given frequency threshold, without considering the quantity or weight (such as profit) of the items. • TwoPhase, based on Apriori, is suitable for sparse datasets with short patterns; CTU-Mine, based on pattern growth, is suitable for dense data.

  4. Definition • u(3 4, t1) = $60 • u(3 4, t3) = $60 • u(3 4) = $60 + $60 = $120

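  5. Definition • Transaction utility tu(t): the sum of the utilities of all items in transaction t, e.g. tu(1) = 80 • Transaction-weighted utility twu(X): the sum of the transaction utilities of all transactions that contain itemset X, e.g. twu(3 4) = $190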
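
The definitions on slides 4 and 5 can be sketched in a few lines of Python. The slides' example database did not survive in this transcript, so the profit table and transactions below are a hypothetical toy dataset, and the function names are illustrative rather than taken from the authors' code. The sketch only shows how u, tu and twu relate to each other.

```python
# Hypothetical toy data (NOT the database shown on the slides).
# profit[item] is the unit profit; each transaction maps item -> quantity.
profit = {"a": 5, "b": 2, "c": 1}
transactions = {
    "t1": {"a": 1, "b": 2},
    "t2": {"b": 4, "c": 3},
    "t3": {"a": 2, "b": 1, "c": 1},
}


def itemset_utility_in(itemset, tid):
    """u(X, t): utility of itemset X in transaction t (0 if X is not contained in t)."""
    t = transactions[tid]
    if not all(item in t for item in itemset):
        return 0
    return sum(t[item] * profit[item] for item in itemset)


def itemset_utility(itemset):
    """u(X): sum of u(X, t) over all transactions."""
    return sum(itemset_utility_in(itemset, tid) for tid in transactions)


def transaction_utility(tid):
    """tu(t): utility of every item in transaction t."""
    t = transactions[tid]
    return sum(qty * profit[item] for item, qty in t.items())


def twu(itemset):
    """twu(X): sum of tu(t) over the transactions that contain X."""
    return sum(
        transaction_utility(tid)
        for tid, t in transactions.items()
        if all(item in t for item in itemset)
    )


if __name__ == "__main__":
    x = {"a", "b"}
    print(itemset_utility(x), twu(x))
```

Because twu(X) counts the whole transaction rather than only the items in X, it is never smaller than u(X), which is what makes it usable as an upper bound when pruning.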

  6. Compressed Transaction Utility-Pro • 99 < min_Utility (129.9)

  7. Compressed Utility Pattern-Tree • Parallel projection of transaction database

  8. CUP-tree • Traverse index 1 (110) from 5, 2 (310) from (2,3,4), 3 (195) from 2, and 4 (190) from (3,5)

  9. ProCUP-tree • Index 1 (110) from 5 falls below the threshold, because 110 < min_Utility (129.9) • 2 (310) from (2,3,4), 3 (195) from 2, and 4 (190) from (3,5)
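
The threshold comparison on this slide can be illustrated with a small filter. This is only a sketch under one assumption: that the number in parentheses after each index is the value compared against min_Utility. The index values and the threshold are copied from the slide; the function name `prune_below_threshold` is hypothetical.

```python
MIN_UTILITY = 129.9  # threshold quoted on the slide

# Values in parentheses on the slide, keyed by index.
# Assumption: these are the quantities compared against min_Utility.
index_value = {1: 110, 2: 310, 3: 195, 4: 190}


def prune_below_threshold(values, threshold):
    """Keep only the entries whose value reaches the threshold."""
    return {idx: v for idx, v in values.items() if v >= threshold}


kept = prune_below_threshold(index_value, MIN_UTILITY)

if __name__ == "__main__":
    print(sorted(kept))
```

Under that reading, index 1 is the only entry that drops out, which matches the comparison 110 < 129.9 stated on the slide.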

  10. ProCUP-tree • oriUtility * itemQuantity + proUtility * proQuantity = Utility • 35*2 + 25*2 = 120 • 150*1 + 25*1 = 175 • 10*5 + 25*3 = 125 • High_Utility_Itemset = (3,2), (3,2,1)
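
The arithmetic on this slide can be checked with a direct transcription of the formula. The four parameter names mirror the slide's terms (oriUtility, itemQuantity, proUtility, proQuantity); the function itself and the tuple layout are illustrative assumptions, and the sketch does not attempt to say which itemset each row belongs to, since the transcript does not state it.

```python
def combined_utility(ori_utility, item_quantity, pro_utility, pro_quantity):
    """Utility = oriUtility * itemQuantity + proUtility * proQuantity."""
    return ori_utility * item_quantity + pro_utility * pro_quantity


# The three rows worked on the slide, as
# (oriUtility, itemQuantity, proUtility, proQuantity).
rows = [
    (35, 2, 25, 2),
    (150, 1, 25, 1),
    (10, 5, 25, 3),
]

results = [combined_utility(*row) for row in rows]

if __name__ == "__main__":
    for row, value in zip(rows, results):
        print(row, value)
```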

  11. Experiments

  12. Conclusion • The CTU-Pro algorithm mines the complete set of high utility itemsets from both sparse and relatively dense datasets, with short or long high utility patterns. • The algorithm adapts to large data by constructing parallel subdivisions on disk that can be mined independently.
