DeEPs: A New Instance-based Discovery and Classification System

Abstract

Most instance-based classification systems rely on distance as their core measurement. Rather than using distance, we make use of the frequency of an instance's subsets, and the rate at which those frequencies change across the training classes, to perform both knowledge discovery and classification. We name the system DeEPs. Given an instance, DeEPs efficiently discovers those patterns contained in the instance which sharply differentiate the training classes from one another. If the instance is previously unseen, DeEPs can predict a class label for it by compactly summarizing the frequencies of the discovered patterns, so as to collectively maximize their discriminating power. Ranked according to interestingness measures, the top patterns produced by DeEPs help users conveniently assess the importance of what has been discovered. Extensive experiments evaluate the system, showing that the patterns are comprehensible and that DeEPs is accurate and scalable. Based on results over 40 benchmark data sets, DeEPs is more accurate than k-nearest neighbor and C5.0.
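To make the mechanism described above concrete, here is a minimal Python sketch of the two steps the abstract outlines: discovering subsets of a query instance whose frequency differs sharply between classes, and then scoring each class by a compact summary of the matched patterns. All names (support, sharp_patterns, deeps_classify), the itemset encoding, the enumeration depth, and the min_ratio threshold are illustrative assumptions, not the published algorithm; the real system operates on border representations of the pattern space rather than brute-force enumeration.

    from itertools import combinations

    def support(pattern, dataset):
        # Fraction of instances in `dataset` that contain every item of `pattern`.
        return sum(1 for inst in dataset if pattern <= inst) / len(dataset)

    def sharp_patterns(instance, home, rest, max_len=3, min_ratio=10.0):
        # Subsets of `instance` (up to `max_len` items) that are frequent in
        # `home` but rare in `rest`: the frequency ratio must reach `min_ratio`,
        # treated as infinite when the pattern never occurs in `rest` at all.
        found = []
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(instance), k):
                p = frozenset(combo)
                s_home, s_rest = support(p, home), support(p, rest)
                if s_home > 0 and (s_rest == 0 or s_home / s_rest >= min_ratio):
                    found.append(p)
        return found

    def deeps_classify(instance, classes, **kw):
        # `classes` maps a label to its training instances (lists of frozensets).
        # Each class is scored by compact summation: the fraction of its
        # training instances matched by at least one discovered pattern, so
        # overlapping patterns are not counted twice.  Brute-force subset
        # enumeration stands in here for the border-based representation.
        instance = frozenset(instance)
        scores = {}
        for label, home in classes.items():
            rest = [t for other, d in classes.items() if other != label for t in d]
            patterns = sharp_patterns(instance, home, rest, **kw)
            matched = sum(1 for t in home if any(p <= t for p in patterns))
            scores[label] = matched / len(home)
        return max(scores, key=scores.get)

    # Toy usage: item "a1" points to the positive class, "b3" to the negative
    # class; compact summation weighs the two pieces of evidence (2/3 vs. 1/2).
    pos = [frozenset({"a1", "b1"}), frozenset({"a1", "b2"}), frozenset({"a2", "b1"})]
    neg = [frozenset({"a2", "b2"}), frozenset({"a2", "b3"})]
    print(deeps_classify({"a1", "b3"}, {"pos": pos, "neg": neg}))  # -> pos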
