Compositional Canonical Correlation Analysis

https://doi.org/10.1101/144584

Abstract

This paper addresses the study of relationships between two compositions by means of canonical correlation analysis. A compositional version of canonical correlation analysis, called CODA-CCO, is developed. We consider two approaches: the centred log-ratio transformation, and the calculation of all possible pairwise log-ratios within sets. The relationships between the two approaches are pointed out, and their merits are discussed. The related covariance matrices are structurally singular, which is dealt with efficiently by using generalized inverses. We develop compositional canonical biplots and detail their properties. The canonical biplots are shown to be powerful tools for discovering the most salient relationships between two compositions. Some guidelines for constructing compositional canonical biplots are discussed. A geological data set with X-ray fluorescence spectrometry measurements on major oxides and trace elements is used to illustrate the proposed method. The rel...
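The centred log-ratio approach mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `clr` and `clr_canonical_correlations` are hypothetical, and the Moore-Penrose pseudo-inverse (NumPy's `np.linalg.pinv`) is assumed here as one possible choice of generalized inverse. Each composition is clr-transformed and the covariance blocks are formed. Because every clr row sums to zero, the within-set covariance matrices are singular, so the pseudo-inverse stands in for the ordinary inverse when solving for the canonical correlations.

```python
import numpy as np


def clr(parts):
    """Centred log-ratio transform; rows are strictly positive compositions."""
    logs = np.log(np.asarray(parts, dtype=float))
    # Subtract each row's mean log (the log of its geometric mean).
    return logs - logs.mean(axis=1, keepdims=True)


def clr_canonical_correlations(x_parts, y_parts, rcond=1e-10):
    """Canonical correlations between two clr-transformed compositions.

    Assumption: the Moore-Penrose pseudo-inverse is used as the
    generalized inverse of the singular clr covariance matrices.
    """
    xc = clr(x_parts)
    yc = clr(y_parts)
    # Column-centre both sets before forming covariances.
    xc = xc - xc.mean(axis=0)
    yc = yc - yc.mean(axis=0)
    n = xc.shape[0]
    sxx = xc.T @ xc / (n - 1)
    syy = yc.T @ yc / (n - 1)
    sxy = xc.T @ yc / (n - 1)
    # clr rows sum to zero, so sxx and syy have rank at most (parts - 1).
    sxx_ginv = np.linalg.pinv(sxx, rcond=rcond)
    syy_ginv = np.linalg.pinv(syy, rcond=rcond)
    # Eigenvalues of this product are the squared canonical correlations,
    # padded with zeros for the null directions.
    m = sxx_ginv @ sxy @ syy_ginv @ sxy.T
    eigvals = np.sort(np.linalg.eigvals(m).real)[::-1]
    # Clip away rounding noise just outside [0, 1] before the square root.
    return np.sqrt(np.clip(eigvals, 0.0, 1.0))
```

Calling `clr_canonical_correlations(X, Y)` with two arrays of positive parts, one row per sample, returns as many values as there are parts in `X`. At least one of them is numerically zero, reflecting the structural singularity the abstract describes.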
