Data dimensional reduction and principal components analysis
2019, Procedia Computer Science
https://doi.org/10.1016/J.PROCS.2019.12.111

Abstract
Research in the fields of machine learning and intelligent systems addresses the essential problem of developing computer algorithms that can deal with huge amounts of data and then utilize these data intelligently to solve a variety of real-world problems. In many applications, interpreting data with a large number of variables in a meaningful way requires reducing the number of variables and working with linear combinations of the original data. Principal Component Analysis (PCA) is an unsupervised learning technique that uses sophisticated mathematical principles to reduce the dimensionality of large datasets. The goal of this paper is to provide a complete understanding of PCA in the context of machine learning and dimensionality reduction. It explains the mathematical foundations of the method and describes its relationship with Singular Value Decomposition (SVD) when PCA is calculated using the covariance matrix. In addition, using MATLAB, the paper shows the usefulness of PCA for representing and visualizing the Iris dataset with a smaller number of variables.
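As a companion to the abstract, the following is a minimal MATLAB sketch of the workflow the paper describes: center the data, take the SVD, and project the four-dimensional Iris measurements onto the first two principal components. It assumes the `fisheriris` dataset and `gscatter`, both shipped with MATLAB's Statistics and Machine Learning Toolbox; it is an illustration, not the paper's exact script.

```matlab
% PCA via SVD of the mean-centered Iris data (a sketch; assumes the
% Statistics and Machine Learning Toolbox for fisheriris and gscatter).
load fisheriris                 % meas: 150x4 matrix of measurements
X = meas - mean(meas);          % center each variable at zero mean
[~, S, V] = svd(X, 'econ');     % columns of V are the principal directions
scores = X * V(:, 1:2);         % project onto the first two components
gscatter(scores(:,1), scores(:,2), species)   % 2-D view of the 4-D data
xlabel('PC 1'); ylabel('PC 2');
```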
FAQs
How does PCA effectively reduce dimensionality in large datasets?
For the Iris dataset, the paper demonstrates that PCA can reduce the data from four dimensions to two while retaining 99.96% of the data variation.
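A hedged sketch for checking a figure like that: the per-component variances are the eigenvalues of the covariance matrix, so the fraction retained by the first two components follows directly (the exact percentage depends on how the data are scaled).

```matlab
% Proportion of total variance retained by the first two principal
% components (eigenvalues of the covariance matrix, largest first).
load fisheriris                            % meas: 150x4 Iris measurements
latent = sort(eig(cov(meas)), 'descend');  % per-component variances
retained = sum(latent(1:2)) / sum(latent)  % fraction kept with 2 of 4 dims
```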
What role does SVD play in PCA calculations?
When PCA is computed from the covariance matrix, SVD supplies the eigenvalues and eigenvectors that define the principal components used for dimensionality reduction.
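The link is direct: for a centered data matrix X with n rows, the covariance matrix is X'X/(n-1), so each covariance eigenvalue equals a squared singular value of X divided by (n-1). A quick numerical check of that identity, again using the built-in Iris data:

```matlab
% Check that the SVD of the centered data reproduces the eigenvalues
% of the covariance matrix: lambda_i = sigma_i^2 / (n-1).
load fisheriris
X = meas - mean(meas);                 % center the data
n = size(X, 1);
s = svd(X);                            % singular values, largest first
lambda = sort(eig(cov(X)), 'descend'); % covariance eigenvalues
max(abs(s.^2 / (n-1) - lambda))        % essentially zero (round-off only)
```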
What percentage of variance is typically retained using PCA?
In practice, including principal components that cover about 70-80% of the data variation is often sufficient.
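In code, that rule of thumb amounts to keeping the smallest number of components whose cumulative explained variance crosses the chosen threshold; a minimal sketch, with an illustrative 80% cutoff:

```matlab
% Choose the smallest k whose cumulative explained variance reaches 80%.
load fisheriris
latent = sort(eig(cov(meas)), 'descend');        % per-component variances
explained = 100 * cumsum(latent) / sum(latent);  % cumulative percent variance
k = find(explained >= 80, 1)                     % number of components to keep
```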
What methods were compared to PCA in the related work?
The research references methods such as Non-Negative Matrix Factorization and wavelet-based techniques, which have met with varying success in different contexts.
What practical applications are suggested for PCA based on this research?
The research indicates that PCA derived from SVD can be applied to image classification and EEG signal analysis.
References (10)
- Qiao, Hanli. (2015) "New SVD based initialization strategy for non-negative matrix factorization" Pattern Recognition Letters 63: 71-77.
- Kumar, Ranjeet; A. Kumar; and G. K. Singh. (2015) "Electrocardiogram signal compression based on singular value decomposition (SVD) and adaptive scanning wavelet difference reduction (ASWDR) technique" AEU-International Journal of Electronics and Communications 69.12: 1810-1822.
- Houari, Rima; et al. (2016) "Dimensionality reduction in data mining: A Copula approach." Expert Systems with Applications 64: 247-260.
- Menon, Vineetha; Qian Du; and James E. Fowler. (2016) "Fast SVD with random Hadamard projection for hyperspectral dimensionality reduction." IEEE Geoscience and Remote Sensing Letters 13.9: 1275-1279.
- Kumar, Manoj; and Ankita Vaish. (2017) "An efficient encryption-then-compression technique for encrypted images using SVD." Digital Signal Processing 60: 81-89.
- Olive, David J. (2017) "Principal component analysis." Robust Multivariate Analysis. Springer, Cham, 189-217.
- Feng, Jun; et al. (2018) "A Secure Higher-Order Lanczos-Based Orthogonal Tensor SVD for Big Data Reduction." IEEE Transactions on Big Data.
- Jolliffe, Ian T. (2002) "Principal Component Analysis", 2nd ed. Springer Series in Statistics.
- Poole, David. (2015) "Linear Algebra: A Modern Introduction", 4th ed. Cengage Learning.
- Dua, Dheeru; and Efi Karra Taniskidou. (2017) "UCI Machine Learning Repository." Available: http://archive.ics.uci.edu/ml