ARDV: A New Density Based Outlier Mining Approach

2014

Abstract

Local Outlier Factor (LOF) is an important and well-known density-based outlier handling algorithm that quantifies how outlying an object is in a given database. In this paper we first discuss LOF and then introduce the concept of ARDV. LOF relies on the lrd (local reachability density) of each object. If we calculate the ard (average reachability distance) in place of lrd, and the variance in ard (ARDV) in place of LOF, experimental results show that the percentage of correctly detected outliers increases without an increase in time complexity.

Definition 2 (k-distance neighborhood of an object p, denoted Nk(p)): the set of objects whose distance from p is not greater than the k-distance of p.
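The quantities named above can be illustrated with a minimal Python sketch. The k-distance neighborhood and the reachability distance follow the standard LOF definitions, and ard is taken as the mean reachability distance (the reciprocal of lrd). The abstract does not give the exact ARDV formula, so `ardv_scores` is only one assumed reading: the mean squared deviation of the neighbours' ard values from the ard of the object itself. All function names and the toy data are hypothetical.

```python
import math

def k_distance_neighborhood(points, i, k):
    """Return (k-distance of points[i], indices in N_k(points[i]))."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    k_dist = dists[k - 1][0]
    # Ties at the k-distance are included, so N_k(p) may hold more than k objects.
    return k_dist, [j for d, j in dists if d <= k_dist]

def ard_scores(points, k):
    """Average reachability distance of every point (lrd is its reciprocal)."""
    info = [k_distance_neighborhood(points, i, k) for i in range(len(points))]
    ard = []
    for i, (_, nbrs) in enumerate(info):
        # reach-dist_k(p, o) = max(k-distance(o), d(p, o)), as in LOF
        reach = [max(info[o][0], math.dist(points[i], points[o])) for o in nbrs]
        ard.append(sum(reach) / len(reach))
    return ard, [nbrs for _, nbrs in info]

def ardv_scores(points, k):
    """ASSUMED score: mean squared deviation of neighbours' ard from ard(p)."""
    ard, nbrs = ard_scores(points, k)
    return [
        sum((ard[o] - ard[i]) ** 2 for o in nbrs[i]) / len(nbrs[i])
        for i in range(len(points))
    ]

# Toy data: a unit square plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = ardv_scores(points, k=2)
outlier = max(range(len(points)), key=scores.__getitem__)
```

On this toy data the four square corners have identical ard values, so their assumed score is zero, while the distant point receives the largest score. Under this reading the work per object matches LOF (one pass over each neighborhood), which is consistent with the claim that time complexity does not increase.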
