Novel ensemble methods for regression via classification problems

2012, Expert Systems with Applications

https://doi.org/10.1016/J.ESWA.2011.12.029

Abstract

Regression via classification (RvC) is a method in which a regression problem is converted into a classification problem. A discretization process converts the continuous target values into classes, and the discretized data can then be used with standard classifiers. In this paper, we use a discretization method, Extreme Randomized Discretization (ERD), in which bin boundaries are created randomly, to build ensembles. We present two ensemble methods for RvC problems. We show theoretically that the proposed ensembles for RvC perform better than RvC with the equal-width discretization method, and we confirm this superiority experimentally. Experimental results suggest that the proposed ensembles perform competitively with methods developed specifically for regression problems.
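The RvC-with-ERD idea described above can be sketched in a few lines: each ensemble member draws its own random bin boundaries over the target range, trains a classifier on the resulting class labels, and maps a predicted class back to a numeric value; the ensemble averages the members' outputs. The sketch below shows only the discretization and back-mapping steps; the function names, and the choice of the median of the training targets in each bin as the bin's representative value, are illustrative assumptions, not the paper's exact procedure.

```python
import random

def erd_discretize(y, k, rng):
    """Sketch of Extreme Randomized Discretization: choose k - 1 cut
    points uniformly at random within the target range, giving k bins
    whose boundaries differ from one ensemble member to the next."""
    lo, hi = min(y), max(y)
    cuts = sorted(rng.uniform(lo, hi) for _ in range(k - 1))

    def to_class(v):
        # index of the first bin whose upper cut point is >= v
        c = 0
        while c < len(cuts) and v > cuts[c]:
            c += 1
        return c

    classes = [to_class(v) for v in y]
    # representative numeric value per bin (here: median of the training
    # targets falling into it), used to turn a class prediction back
    # into a regression output
    per_bin = {}
    for c, v in zip(classes, y):
        per_bin.setdefault(c, []).append(v)
    reps = {c: sorted(vs)[len(vs) // 2] for c, vs in per_bin.items()}
    return classes, reps, to_class

# Each ensemble member repeats the discretization with fresh random
# boundaries; a member's numeric prediction is the representative value
# of the class its classifier outputs, and the ensemble averages them.
rng = random.Random(0)
y = [1.2, 2.8, 3.1, 4.4, 4.9, 5.6]
members = [erd_discretize(y, 3, rng) for _ in range(5)]
# illustration only: average the representatives of y[0]'s own bin
pred = sum(reps[to_class(y[0])] for _, reps, to_class in members) / len(members)
```

In a full implementation each member would train a classifier on the features with its member-specific class labels; the randomness of the boundaries is what decorrelates the members' errors.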
