Statistics: pca

Function Reference: `pca`

statistics: coeff = pca (x)
statistics: coeff = pca (x, Name, Value)
statistics: [coeff, score, latent] = pca (…)
statistics: [coeff, score, latent, tsquared] = pca (…)
statistics: [coeff, score, latent, tsquared, explained, mu] = pca (…)

Performs a principal component analysis on a data matrix.

A principal component analysis of a data matrix of $N$ observations in a $D$-dimensional space returns a $D×D$ transformation matrix that performs a change of basis on the data. The first component of the new basis is the direction that maximizes the variance of the projected data; each subsequent component maximizes the remaining variance while staying orthogonal to the previous ones.
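The change of basis described above can be checked on a small example. This is a minimal sketch on made-up random data (the matrix `x`, its size, and the scaling factors are illustrative assumptions, not part of the reference); it only uses the one-output form `coeff = pca (x)`.

```octave
pkg load statistics

% Hypothetical toy data: N = 200 observations in D = 3 dimensions,
% scaled so that most of the variance lies along the first axis.
randn ("state", 42);
x = randn (200, 3) * diag ([5, 2, 0.5]);

coeff = pca (x);

size (coeff)               % D-by-D, here 3-by-3

% The columns of coeff form an orthonormal basis, so this product
% should be the identity matrix up to rounding.
orth_err = max (abs (coeff' * coeff - eye (3))(:))

% Variance of the centered data projected onto each new direction;
% the first entry should be the largest.
xc = x - mean (x);
proj_var = var (xc * coeff)
```

The projected variances in `proj_var` are expected to appear in decreasing order, which is what "the first component maximizes the variance" means in practice.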

Input argument:

x : an $N×D$ data matrix

The following Name, Value pair arguments can be used:

"Algorithm" defines the algorithm to use:
- "svd" (default), for singular value decomposition
- "eig" for eigenvalue decomposition
"Centered" is a boolean indicator for centering the observation data. It is true by default.
"Economy" is a boolean indicator for the economy size output. It is true by default, in which case pca returns only the elements of latent that are not necessarily zero, together with the corresponding columns of coeff and score; that is, when $N <= D$ , only the first $N - 1$ .
"NumComponents" defines the number of components $k$ to return. If $k < D$ , then only the first $k$ columns of coeff and score are returned.
"Rows" defines how to handle missing values:
- "complete" (default), missing values are removed before computation.
- "pairwise" (only valid when "Algorithm" is "eig"), the covariance of rows with missing data is computed using the available data. The resulting covariance matrix may not be positive definite, in which case pca terminates with an error.
- "all", missing values are not allowed, pca terminates with an error if there are any.
"Weights" defines observation weights as a vector of positive values of length $N$ .
"VariableWeights" defines variable weights:
- a vector of positive values of length $D$ .
- the string "variance" to use the sample variance as weights.
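As an illustration of the Name, Value syntax, the following sketch passes two of the options listed above. The data matrix is made up for the example, and the comparison between the two algorithms is a sanity check under the assumption that both return the component variances in the same order, not a statement about the implementation.

```octave
pkg load statistics

% Hypothetical data: N = 50 observations, D = 4 variables
randn ("state", 1);
x = randn (50, 4);

% "NumComponents": keep only the first k = 2 components
[coeff2, score2] = pca (x, "NumComponents", 2);
size (coeff2)              % 4-by-2
size (score2)              % 50-by-2

% "Algorithm": eigenvalue decomposition versus the default SVD
[coeff_e, score_e, latent_e] = pca (x, "Algorithm", "eig");
[coeff_s, score_s, latent_s] = pca (x, "Algorithm", "svd");

% The component variances should agree up to rounding.
% Individual columns of coeff may still differ in sign.
latent_diff = max (abs (latent_e(:) - latent_s(:)))
```

Comparing `latent` rather than `coeff` avoids the sign ambiguity: a principal direction and its negative describe the same axis.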

Return values:

coeff : the principal component coefficients, a $D×D$ transformation matrix
score : the principal component scores, the representation of x in the principal component space
latent : the principal component variances, i.e., the eigenvalues of the covariance matrix of x
tsquared : Hotelling’s T-squared statistic for each observation in x
explained : the percentage of the variance explained by each principal component
mu : the estimated mean of each variable of x; it is zero if the data are not centered
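The relationships between the return values can be verified numerically. The sketch below assumes default options (centered data, no weights) and full-rank made-up data; the variable names prefixed with `err_` are introduced here for illustration only. Each error is expected to be close to zero.

```octave
pkg load statistics

% Hypothetical full-rank data: N = 100 observations, D = 3 variables
randn ("state", 7);
x = randn (100, 3) * [2 0 0; 1 1 0; 0.5 0.2 0.1];

[coeff, score, latent, tsquared, explained, mu] = pca (x);

% score is the centered data expressed in the principal component basis
d = (x - mu(:)') * coeff - score;
err_score = max (abs (d(:)))

% latent holds the sample variance of each column of score
v = var (score);
err_latent = max (abs (v(:) - latent(:)))

% explained is latent expressed as a percentage of the total variance
p = 100 * latent(:) / sum (latent);
err_explained = max (abs (p - explained(:)))

% With all components kept, Hotelling's T-squared is the sum of the
% squared scores, each divided by the variance of its component
t2 = sum (score .^ 2 ./ latent(:)', 2);
err_tsquared = max (abs (t2 - tsquared(:)))
```

Passing options such as "Weights", "VariableWeights" or "Centered" changes how these quantities are computed, so the identities above should not be assumed to hold unchanged in those cases.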

Matlab compatibility note: the alternating least squares method "als" and the associated options "Coeff0", "Score0", and "Options" are not yet implemented.

References

Jolliffe, I. T., Principal Component Analysis, 2nd Edition, Springer, 2002

See also: barttest, factoran, pcacov, pcares
