A New Learning Algorithm for Classification in the Reduced Space

2008, ICEIS

Abstract

The research reported in this paper had a twofold aim: to propose a new approach to cluster analysis and to investigate its performance when combined with dimensionality reduction schemes. Our approach is based on group skeletons defined by a set of orthonormal eigenvectors (principal directions) of the sample covariance matrix, and it imposes a set of quite natural working assumptions on the true but unknown nature of the class system. The search for the optimal clusters approximating the unknown classes aims at obtaining homogeneous groups, where homogeneity is defined in terms of the "typicality" of components with respect to the current skeleton. Our method is described in the third section of the paper. The compression scheme is expressed in terms of the principal directions corresponding to the available cloud. The final section presents the results of tests comparing the performance of our method with the standard k-means clustering technique, applied both to the initial space and to the compressed data.
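The two building blocks the experiments combine, compression onto the leading principal directions of the sample covariance matrix and a standard k-means baseline, can be sketched as below. This is a minimal illustration under assumed conventions, not the paper's implementation: the function names, the toy two-group cloud, and the plain Lloyd's-iteration k-means are all illustrative.

```python
import numpy as np

def pca_compress(X, k):
    """Project centered data onto the top-k principal directions
    (eigenvectors of the sample covariance matrix)."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    _, vecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    W = vecs[:, ::-1][:, :k]           # top-k principal directions
    return Xc @ W

def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's k-means, the baseline technique the paper compares against."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(n_clusters)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Illustrative cloud: two Gaussian groups in 5 dimensions, compressed to 2,
# then clustered in the reduced space.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(4, 1, (50, 5))])
Z = pca_compress(X, k=2)
labels, centers = kmeans(Z, n_clusters=2)
```

The same `kmeans` call applied to `X` directly gives the initial-space baseline, so the comparison described in the final section amounts to running the clustering once on `X` and once on `Z`.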
