A chemometrics toolbox based on projections and latent variables

2014, Journal of Chemometrics

https://doi.org/10.1002/CEM.2581

Abstract

A personal view is given of the gradual development of projection methods (also called bilinear methods, latent variable methods, and more) and their use in chemometrics. We start with principal components analysis (PCA), which is the basis for more elaborate methods for more complex problems, such as soft independent modeling of class analogy, partial least squares (PLS), hierarchical PCA and PLS, PLS-discriminant analysis, orthogonal projection to latent structures (OPLS), OPLS-discriminant analysis, and more. From its start around 1970, this development was strongly influenced by Bruce Kowalski and his group in Seattle, and by his realization that the multidimensional data profiles emerging from spectrometers, chromatographs, and other electronic instruments contained interesting information that was not recognized by the then-current one-variable-at-a-time approaches to chemical data analysis. This led to the adoption of what in statistics is called the data analytical approach, often also called the data-driven approach, soft modeling, and more. This approach, combined with PCA and later PLS, turned out to work very well in the analysis of chemical data. This is because of the close correspondence between, on the one hand, the matrix decomposition at the heart of PCA and PLS and, on the other hand, the analogy concept on which so much of chemical theory and experimentation is based. This extends to numerical and conceptual stability and good approximation properties of these models. The development is informally summarized and illustrated by a few examples and anecdotes.
