Minimally Infrequent Itemset Mining Using Pattern-Growth Paradigm

2016, International Journal of Advance Engineering and Research Development

https://doi.org/10.21090/IJAERD.030647

Abstract

Itemset mining has been an active area of research because of its successful application in many data mining scenarios, including the discovery of association rules. Although most past work has focused on finding frequent itemsets, infrequent itemset mining has demonstrated its utility in web mining, bioinformatics, and other fields. In this paper, we propose a new algorithm based on the pattern-growth paradigm to find minimally infrequent itemsets. A minimally infrequent itemset is an infrequent itemset that has no proper subset which is also infrequent. We also introduce the novel concept of residual trees. We further use residual trees to mine itemsets under the multiple level minimum support model, in which different support thresholds are applied to itemsets of different lengths. Finally, we analyze the behavior of our algorithm with respect to different parameters and show through experiments that it outperforms competing algorithms.
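To make the definition of a minimally infrequent itemset concrete, the following is a minimal brute-force sketch in Python. It is not the pattern-growth algorithm proposed in the paper and does not use residual trees; it only enumerates candidates by increasing size to illustrate the definition. The function names, the toy transactions, and the convention that an itemset is infrequent when its support count is strictly below `min_support` are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def minimally_infrequent_itemsets(transactions, min_support):
    """Brute-force enumeration (illustrative only, exponential in the number of items).

    Assumption: an itemset is infrequent when its support count is
    strictly less than `min_support`.
    """
    items = sorted(set().union(*transactions))
    minimal = []
    # Visit candidates by increasing size so that every infrequent proper
    # subset has already been seen when a larger candidate is checked.
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # A candidate containing a known minimally infrequent itemset
            # has an infrequent proper subset, so it cannot be minimal.
            if any(m <= candidate for m in minimal):
                continue
            if support(candidate, transactions) < min_support:
                minimal.append(candidate)
    return minimal


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]
result = minimally_infrequent_itemsets(transactions, min_support=3)
print(result)
```

With these toy transactions and a threshold of 3, the item `d` is infrequent on its own, while `{a, b, c}` is infrequent even though each of its proper subsets is frequent, so both are minimally infrequent. Any superset of either, such as `{a, d}`, is infrequent but not minimal.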
