plsregress
Calculate partial least squares regression using the SIMPLS algorithm.

plsregress uses the SIMPLS algorithm and first centers X and Y by subtracting the column means. However, it does not rescale the columns. To perform partial least squares regression with standardized variables, use zscore to normalize X and Y.
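
As a minimal sketch of the standardization step described above, the following fits a model on variables that have been passed through zscore first. The synthetic data, the seed, and the variable names (Xz, Yz, xl, yl) are illustrative assumptions and do not come from the package.

```octave
pkg load statistics

## Synthetic predictors on very different scales (illustrative data only)
randn ("seed", 42);
X = randn (20, 4) .* [1, 10, 100, 1000];
Y = X * [1; 0.1; 0.01; 0.001] + 0.1 * randn (20, 1);

## plsregress centers X and Y itself but does not rescale them,
## so standardize explicitly to give every column unit variance
Xz = zscore (X);
Yz = zscore (Y);

## Fit a two-component model on the standardized variables
[xl, yl] = plsregress (Xz, Yz, 2);
```

Without the zscore calls, the column with the largest scale would tend to dominate the first components, since the fit is driven by covariances of the centered data.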

[xload, yload] = plsregress (X, Y) computes a partial least squares regression of Y on X, using NCOMP PLS components, which by default are calculated as min (size (X, 1) - 1, size (X, 2)), and returns the predictor and response loadings in xload and yload, respectively.

[xload, yload] = plsregress (X, Y, NCOMP) defines the desired number of PLS components to use in the regression. NCOMP, a scalar positive integer, must not exceed the default calculated value.
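
The default and the upper limit for NCOMP can be sketched as follows. The data and the names ncomp_tall, ncomp_wide, xl_default, and xl_two are hypothetical and only serve to illustrate the formula given above.

```octave
pkg load statistics

## Tall data: 12 observations, 3 predictors (illustrative data only)
randn ("seed", 1);
X = randn (12, 3);
Y = X * [1; -1; 0.5] + 0.1 * randn (12, 1);

## Default number of components, as given by the formula above
ncomp_tall = min (size (X, 1) - 1, size (X, 2));

## For wide data the number of observations is the limiting factor
ncomp_wide = min (6 - 1, 10);

## Omitting NCOMP uses the default; passing it requests fewer components
[xl_default, yl_default] = plsregress (X, Y);
[xl_two, yl_two] = plsregress (X, Y, 2);
```

Here ncomp_tall is bounded by the number of predictors, while ncomp_wide is bounded by the number of observations minus one.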

[xload, yload, xscore, yscore, coef, pctVar, mse, stats] = plsregress (X, Y, NCOMP) also returns the following arguments:

mse is the estimated mean squared error for PLS models with 0:NCOMP components, with the first row containing the mean squared errors for the predictor variables in X and the second row containing the mean squared errors for the response variable(s) in Y.

stats is a structure with the following fields:
stats.W is a matrix of PLS weights.
stats.T2 is the T² statistic for each point in xscore.
stats.Xresiduals is a matrix with the predictor residuals.
stats.Yresiduals is a matrix with the response residuals.
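
A short sketch of how the additional outputs might be used is given below. It assumes, as the demo code on this page does, that coef carries the intercept in its first row, so a column of ones is prepended to X before multiplying. The synthetic data and all variable names are illustrative.

```octave
pkg load statistics

## Synthetic data: 30 observations, 5 predictors (illustrative only)
randn ("seed", 7);
n = 30;
p = 5;
X = randn (n, p);
Y = X * [2; -1; 0.5; 0; 0] + 0.1 * randn (n, 1);

## Request all eight outputs for a three-component model
ncomp = 3;
[xl, yl, xs, ys, coef, pctVar, mse, stats] = plsregress (X, Y, ncomp);

## Assumption: the first row of coef is the intercept term
Y_fitted = [ones(n, 1), X] * coef;
residuals = Y - Y_fitted;

## mse has one column per model size 0:ncomp; row 2 refers to Y
mse_Y = mse(2, :);
```

The first entry of mse_Y corresponds to the model with zero components, so it is a natural baseline when comparing the later entries.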

[…] = plsregress (…, Name, Value, …) specifies one or more of the following Name/Value pairs:

Name | Value
---|---
"CV" | The method used to compute mse. When Value is a positive integer k, plsregress uses k-fold cross-validation. Set Value to a cross-validation partition, created using cvpartition, to use other forms of cross-validation. Set Value to "resubstitution" to use both X and Y to fit the model and to estimate the mean squared errors, without cross-validation. By default, Value = "resubstitution".
"MCReps" | A positive integer indicating the number of Monte-Carlo repetitions for cross-validation. By default, Value = 1. A different "MCReps" value is only meaningful when using the "HoldOut" method for cross-validation, previously set by a cvpartition object. If no cross-validation method is used, then "MCReps" must be 1.

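
One common use of the "CV" option is to choose the number of components from the cross-validated response error. The following is a minimal sketch under that assumption; the synthetic data, the choice of 5 folds, and the names mse_cv and best_ncomp are illustrative and not prescribed by the package.

```octave
pkg load statistics

## Synthetic data: 40 observations, 6 predictors (illustrative only)
randn ("seed", 3);
rand ("seed", 3);
n = 40;
p = 6;
X = randn (n, p);
Y = X * [1; -2; 0; 0; 0.5; 0] + 0.2 * randn (n, 1);

## Estimate mse with 5-fold cross-validation for models of up to 4 components
ncomp = 4;
[~, ~, ~, ~, ~, ~, mse_cv] = plsregress (X, Y, ncomp, "CV", 5);

## Columns of mse_cv correspond to 0:ncomp components, so subtract 1
## from the column index of the smallest response error
[~, best_col] = min (mse_cv(2, :));
best_ncomp = best_col - 1;
```

Because the folds are drawn at random, the selected best_ncomp can vary between runs unless the random seed is fixed.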
Further information about the PLS regression can be found at https://en.wikipedia.org/wiki/Partial_least_squares_regression
Source Code: plsregress

    ## Perform Partial Least-Squares Regression

    ## Load the spectra data set and use the near infrared (NIR) spectral
    ## intensities (NIR) as the predictor and the corresponding octane
    ## ratings (octane) as the response.
    load spectra

    ## Perform PLS regression with 10 components
    [xload, yload, xscore, yscore, coef, pctVar] = plsregress (NIR, octane, 10);

    ## Plot the percentage of explained variance in the response variable
    ## (PCTVAR) as a function of the number of components.
    plot (1:10, cumsum (100 * pctVar(2,:)), "-ro");
    xlim ([1, 10]);
    xlabel ("Number of PLS components");
    ylabel ("Percentage of Explained Variance in octane");
    title ("Explained Variance per PLS component");

    ## Compute the fitted response and display the residuals.
    octane_fitted = [ones(size(NIR,1),1), NIR] * coef;
    residuals = octane - octane_fitted;
    figure
    stem (residuals, "color", "r", "markersize", 4, "markeredgecolor", "r")
    xlabel ("Observations");
    ylabel ("Residuals");
    title ("Residuals in octane's fitted response");

    ## Calculate Variable Importance in Projection (VIP) for PLS Regression

    ## Load the spectra data set and use the near infrared (NIR) spectral
    ## intensities (NIR) as the predictor and the corresponding octane
    ## ratings (octane) as the response. Variables with a VIP score greater than
    ## 1 are considered important for the projection of the PLS regression model.
    load spectra

    ## Perform PLS regression with 10 components
    [xload, yload, xscore, yscore, coef, pctVar, mse, stats] = ...
      plsregress (NIR, octane, 10);

    ## Calculate the normalized PLS weights
    W0 = stats.W ./ sqrt(sum(stats.W.^2,1));

    ## Calculate the VIP scores for 10 components
    ## (xload has one row per predictor variable)
    npred = size (xload, 1);
    SS = sum (xscore .^ 2, 1) .* sum (yload .^ 2, 1);
    VIPscore = sqrt (npred * sum (SS .* (W0 .^ 2), 2) ./ sum (SS, 2));

    ## Find variables with a VIP score greater than or equal to 1
    VIPidx = find (VIPscore >= 1);

    ## Plot the VIP scores
    scatter (1:length (VIPscore), VIPscore, "xb");
    hold on
    scatter (VIPidx, VIPscore (VIPidx), "xr");
    plot ([1, length(VIPscore)], [1, 1], "--k");
    hold off
    axis ("tight");
    xlabel ("Predictor Variables");
    ylabel ("VIP scores");
    title ("VIP scores for each predictor variable with 10 components");