A Hash Based Frequent Itemset Mining using Rehashing
Abstract
Data mining is the use of automated data analysis techniques to uncover previously undetected relationships among data items. Mining frequent itemsets is one of its most important tasks and has been an active research area for over two decades. It plays an essential role in many data mining tasks that try to find interesting patterns in databases, such as association rules, correlations, sequences, classifiers and clusters. In this paper, we propose a new association rule mining algorithm called Rehashing Based Frequent Itemset (RBFI), in which hashing is used to store the database in vertical data format. To avoid hash collisions and the secondary clustering problem in hashing, a rehashing technique is used. The advantages of this hashing technique are a hash function that is easy to compute, fast access to data, and efficiency. The algorithm also avoids unnecessary scans of the database.
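The idea of storing a database in vertical format inside a hash table can be illustrated with a minimal sketch. It is not the RBFI algorithm itself, whose details the abstract does not give. The sketch assumes that "rehashing" means double hashing (a second hash function supplies a key-dependent probe step, which avoids secondary clustering), that items are integer identifiers, and that the table has a fixed prime size. The names `RehashTable` and `frequent_itemsets` are hypothetical.

```python
from itertools import combinations


class RehashTable:
    """Open-addressing table mapping item -> set of transaction ids (tidset)."""

    def __init__(self, size=11):
        # Assumption: size is prime, so every probe step visits all slots.
        self.size = size
        self.slots = [None] * size  # each slot holds (item, tidset) or None

    def _h1(self, key):
        return key % self.size

    def _h2(self, key):
        # Second hash: a key-dependent step in the range 1..size-1.
        return 1 + (key % (self.size - 1))

    def _find(self, key):
        index, step = self._h1(key), self._h2(key)
        for _ in range(self.size):
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                return index
            index = (index + step) % self.size  # rehash on collision
        raise RuntimeError("hash table is full")

    def add(self, item, tid):
        index = self._find(item)
        if self.slots[index] is None:
            self.slots[index] = (item, set())
        self.slots[index][1].add(tid)

    def tidset(self, item):
        slot = self.slots[self._find(item)]
        return slot[1] if slot else set()

    def items(self):
        return sorted(slot[0] for slot in self.slots if slot)


def frequent_itemsets(transactions, min_support, max_len=2):
    """Return {itemset: support} for itemsets of up to max_len items."""
    table = RehashTable()
    for tid, transaction in enumerate(transactions):  # the only database scan
        for item in transaction:
            table.add(item, tid)

    frequent_items = [i for i in table.items()
                      if len(table.tidset(i)) >= min_support]
    result = {}
    for k in range(1, max_len + 1):
        for combo in combinations(frequent_items, k):
            # Support is the size of the intersection of the items' tidsets.
            tids = set.intersection(*(table.tidset(i) for i in combo))
            if len(tids) >= min_support:
                result[combo] = len(tids)
    return result
```

For the transactions `[[1, 2, 12], [1, 12], [2, 12, 23], [1, 2, 12]]` with a minimum support of 2, items 1, 12 and 23 all collide at the same first slot and are separated by their different probe steps. All supports are then obtained by intersecting tidsets, with no further pass over the transactions.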
IJRITCC | December 2014, Available @ http://www.ijritcc.org