
KNNBA: K-Nearest-Neighbor-Based-Association Algorithm

Abstract

The KNN algorithm is one of the best-known and most widely used classification algorithms, applied in many different domains. One of its problems is that all attributes have an equal effect on the distance calculated between a new record and the records in the training dataset, even though some attributes may be less important to the classification than others. This can mislead the classification process and decrease the accuracy of the algorithm. A major approach to dealing with this problem is to weight attributes differently when calculating the distance between two records. In this research we use association rules to weight attributes and propose a new classification algorithm, K-Nearest-Neighbor-Based-Association (KNNBA), that improves the accuracy of the KNN algorithm. We experimentally tested KNNBA's accuracy on 15 UCI data sets [1] and compared it with other classification algorithms NB, ...
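To make the attribute-weighting idea concrete, the following Python sketch shows a KNN classifier whose distance computation applies a per-attribute weight vector. It is only an illustration of the weighted-distance step described in the abstract, not the authors' KNNBA implementation: in KNNBA the weights are derived from association rules mined on the training data, whereas here they are simply supplied by the caller. The function name `weighted_knn_predict` and the toy data are hypothetical.

```python
import numpy as np
from collections import Counter

def weighted_knn_predict(X_train, y_train, x_new, weights, k=5):
    """Classify x_new with a weighted Euclidean distance KNN.

    weights: one non-negative weight per attribute. KNNBA would obtain
    these from association rules; here they are assumed to be given.
    """
    # Weighted squared differences to every training record
    diffs = X_train - x_new                          # shape: (n_samples, n_attrs)
    dists = np.sqrt(((diffs ** 2) * weights).sum(axis=1))

    # Majority vote among the k nearest neighbours
    nearest = np.argsort(dists)[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Example: with equal weights this reduces to plain KNN; unequal weights
# let the more informative attributes dominate the distance.
X_train = np.array([[1.0, 100.0], [1.2, 105.0], [5.0, 102.0]])
y_train = np.array(["a", "a", "b"])
print(weighted_knn_predict(X_train, y_train,
                           np.array([4.8, 101.0]),
                           weights=np.array([1.0, 0.01]), k=1))
```

With the second attribute down-weighted, the query record is matched by its first attribute and classified as "b"; with equal weights the noisy second attribute would pull the vote toward "a", which is exactly the misleading effect the abstract describes.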

References

  1. A. Asuncion and D. J. Newman, "UCI Machine Learning Repository", Irvine, CA: University of California, School of Information and Computer Science, 2007. [Online]. Available: http://archive.ics.uci.edu/ml/datasets.html
  2. T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification", IEEE Transactions on Information Theory, 1967.
  3. J. Han and M. Kamber, "Data Mining Concepts and Techniques", 2nd ed. Amsterdam: Morgan Kaufmann Publishers, 2006.
  4. D. T. Larose, "Discovering Knowledge in Data: An Introduction to Data Mining", New Jersey: John Wiley & Sons, 2005.
  5. Y. Zhan, H. Chen and G. C. Zhang, "An Optimization Algorithm of K-NN Classification", Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August 2006.
  6. Md. Shamsul Huda, K. Md. R. Alam, K. Mutsuddi, Md. K. S. Rahman and C. M. Rahman, "A Dynamic K-Nearest Neighbor Algorithm for Pattern Analysis Problem", 3rd International Conference on Electrical & Computer Engineering, ICECE 2004, Dhaka, Bangladesh, 28-30 December 2004.
  7. E.-H. Han, G. Karypis and V. Kumar, "Text categorization using weight adjusted k-nearest neighbor classification", Technical report, Dept. of CS, University of Minnesota, 1999.
  8. L. Jiang, H. Zhang and Z. Cai, "Dynamic K-Nearest-Neighbor Naive Bayes with Attribute Weighted", FSKD 2006, LNAI 4223, 2006, pp. 365-368.
  9. J. Dongchao, S. Bifeng and H. Fei, "An Improved KNN Algorithm of Intelligent Built-in Test", Proceedings of the IEEE International Conference on Networking, Sensing and Control, ICNSC 2008, Hainan, China, 6-8 April 2008.
  10. G. H. John and P. Langley, "Estimating Continuous Distributions in Bayesian Classifiers", Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, San Mateo, 1995, pp. 338-345.
  11. R. Quinlan, "C4.5: Programs for Machine Learning", Morgan Kaufmann Publishers, San Mateo, CA, 1993.
  12. R. Kohavi, "Scaling up the accuracy of naive-Bayes classifiers: a decision tree hybrid", Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996.
  13. G. Demiroz and A. Guvenir, "Classification by voting feature intervals", ECML-97, 1997.
  14. E. Frank, M. Hall and B. Pfahringer, "Locally Weighted Naive Bayes", Working Paper 04/03, Department of Computer Science, University of Waikato, 2003.
  15. D. Aha and D. Kibler, "Instance-based learning algorithms", Machine Learning, vol.6, 1991, pp. 37-66.
  16. Weka---Machine Learning Software in Java, 2008. [Online]. Available: http://sourceforge.net/projects/weka/