ClassificationDiscriminant
statistics: ClassificationDiscriminant
Discriminant analysis classification
The ClassificationDiscriminant class implements a linear discriminant
analysis classifier object, which can predict responses for new data using
the predict method.
Discriminant analysis classification is a statistical method used to classify observations into predefined groups based on their characteristics. It estimates the parameters of different distributions for each class and predicts the class of new observations by finding the one with the smallest misclassification cost.
Create a ClassificationDiscriminant object by using the
fitcdiscr function or the class constructor.
See also: fitcdiscr
Source Code: ClassificationDiscriminant
A numeric matrix containing the unstandardized predictor data. Each column of X represents one predictor (variable), and each row represents one observation. This property is read-only.
Specified as a logical or numeric vector, or cell array of character vectors. Each value in Y is the observed class label for the corresponding row in X. This property is read-only.
A positive integer value specifying the number of observations in the training dataset used for training the ClassificationDiscriminant model. This property is read-only.
A logical column vector with the same length as the observations in the original predictor data X specifying which rows have been used for fitting the ClassificationDiscriminant model. This property is read-only.
A positive integer value specifying the number of predictors in the training dataset used for training the ClassificationDiscriminant model. This property is read-only.
A cell array of character vectors specifying the names of the predictor variables. The names are in the order in which they appear in the training dataset. This property is read-only.
A character vector specifying the name of the response variable Y. This property is read-only.
An array of unique values of the response variable Y, which has the
same data types as the data in Y. This property is read-only.
ClassNames can have any of the datatypes accepted for Y: a logical or numeric vector, or a cell array of character vectors.
A square matrix specifying the cost of misclassification of a point.
Cost(i,j) is the cost of classifying a point into class j
if its true class is i (that is, the rows correspond to the true
class and the columns correspond to the predicted class). The order of
the rows and columns in Cost corresponds to the order of the
classes in ClassNames. The number of rows and columns in
Cost is the number of unique classes in the response. By
default, Cost(i,j) = 1 if i != j, and
Cost(i,j) = 0 if i = j. In other words, the cost is 0
for correct classification and 1 for incorrect classification.
Add or change the Cost using dot notation as in:
obj.Cost = costMatrix
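As a minimal sketch of the convention described above, the default cost matrix and an asymmetric variant can be built in plain Octave. The number of classes K and the penalty value 5 are made-up illustration values, not package defaults.

```octave
## Illustration only: K and the penalty value are assumptions.
K = 3;

## Default convention: 0 on the diagonal (correct), 1 elsewhere (incorrect).
Cost = ones (K) - eye (K);

## Rows are true classes, columns are predicted classes, so this makes
## predicting class 2 for a point whose true class is 1 five times as costly.
Cost(1,2) = 5;
```

A matrix built this way could then be assigned with obj.Cost = Cost, provided its row and column order matches ClassNames.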
A numeric vector specifying the prior probabilities for each class. The
order of the elements in Prior corresponds to the order of the
classes in ClassNames.
Add or change the Prior using dot notation as in:
obj.Prior = priorVector
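The sketch below shows one way to build a prior vector by hand, either from class frequencies ("empirical") or with equal weights ("uniform"). The label vector Y is invented for illustration, and the ordering assumes that the classes are sorted the way unique returns them, which may differ from the order in ClassNames.

```octave
## Made-up labels for illustration.
Y = {"a"; "b"; "a"; "c"; "a"; "b"};

## unique returns the sorted class names and, for each label, its class index.
[classes, ~, idx] = unique (Y);

## Empirical prior: relative frequency of each class.
counts    = accumarray (idx(:), 1);
empirical = counts' / numel (Y);

## Uniform prior: the same probability for every class.
uniform = ones (1, numel (classes)) / numel (classes);
```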
Specified as a character vector representing a built-in function or as a function handle for transforming the classification scores. The following built-in functions are supported:
'doublelogit'
'invlogit'
'ismax'
'logit'
'none'
'identity'
'sign'
'symmetric'
'symmetricismax'
'symmetriclogit'
Add or change the ScoreTransform using dot notation as in:
obj.ScoreTransform = 'function_name'
obj.ScoreTransform = @function_handle
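A function handle used as a score transform is expected to map a matrix of scores to a matrix of the same size. The handle below is a hypothetical example (a logistic squashing function), not one of the package's built-in transforms, and the raw scores are made up.

```octave
## Hypothetical custom transform: squash raw scores into the interval (0, 1).
squash = @(s) 1 ./ (1 + exp (-s));

## Made-up score matrix: one row per observation, one column per class.
raw = [-2 0 2; 1 -1 0];
out = squash (raw);

## The handle could then be assigned as: obj.ScoreTransform = squash
```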
A numeric array specifying the within-class covariance. For linear discriminant type (currently supported) this is a PxP matrix, where P is the number of predictors in X. This property is read-only.
A KxP numeric matrix specifying the mean of the multivariate normal distribution of each corresponding class, where K is the number of classes and P is the number of predictors in X. This property is read-only.
A KxK structure containing the coefficient matrices, where K
is the number of classes. If the 'FillCoeffs' parameter
was set to 'off' in either the fitcdiscr function or the
ClassificationDiscriminant constructor, then Coeffs is
empty ([]). This property is read-only.
Coeffs(i,j) contains the coefficients of the linear (currently
supported) boundaries between the classes i and j in the
following fields:
DiscrimType - A character vector
Class1 - ClassNames(i)
Class2 - ClassNames(j)
Const - A scalar
Linear - A vector with length as the number of predictors.
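To make the roles of Const and Linear concrete, the sketch below uses the textbook linear discriminant boundary between two classes with a shared covariance matrix. It assumes the boundary has the form Const + x * Linear = 0, with positive values favouring class i. The means, covariance, and priors are made up, and whether the package stores exactly these quantities is an assumption.

```octave
## Made-up two-class problem with a shared covariance matrix.
mu_i  = [0 0];
mu_j  = [2 2];
Sigma = eye (2);
p_i   = 0.5;
p_j   = 0.5;

## Textbook LDA boundary coefficients between classes i and j.
Linear = Sigma \ (mu_i - mu_j)';
Const  = -0.5 * (mu_i + mu_j) * Linear + log (p_i / p_j);

## Decision function: positive on the class i side, negative on the class j side.
f = @(x) Const + x * Linear;

at_mean_i   = f (mu_i);
at_mean_j   = f (mu_j);
at_midpoint = f ((mu_i + mu_j) / 2);
```

With equal priors the boundary passes through the midpoint of the two class means, so at_midpoint is zero.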
A nonnegative scalar specifying the threshold for the linear discriminant model. Currently unimplemented and fixed to 0. This property is read-only.
A character vector specifying the type of discriminant model. Currently only linear discriminant models are supported. This property is read-only.
A scalar value ranging from 0 to 1, specifying the Gamma regularization parameter. This property is read-only.
A scalar value ranging from 0 to 1, specifying the minimum value that the Gamma regularization parameter can have. This property is read-only.
statistics: obj = ClassificationDiscriminant (X, Y)
statistics: obj = ClassificationDiscriminant (…, name, value)
obj = ClassificationDiscriminant (X, Y) returns
a ClassificationDiscriminant object, with X as the predictor data
and Y containing the class labels of observations in X.
X must be a numeric matrix of input data where rows
correspond to observations and columns correspond to features or
variables.
X will be used to train the discriminant model.
Y is a matrix or cell array containing the class labels
of corresponding predictor data in X. Y can contain any type
of categorical data. Y must have the same number of rows as
X.
obj = ClassificationDiscriminant (…, name,
value) returns a ClassificationDiscriminant object with parameters
specified by the following name, value pair arguments:
| Name | Value |
|---|---|
| 'PredictorNames' | A cell array of character vectors specifying the names of the predictors. The length of this array must match the number of columns in X. |
| 'ResponseName' | A character vector specifying the name of the response variable. |
| 'ClassNames' | Names of the classes in the class labels, Y, used for fitting the discriminant model. ClassNames are of the same type as the class labels in Y. |
| 'Prior' | A numeric vector specifying the prior probabilities for each class. The order of the elements in Prior corresponds to the order of the classes in ClassNames. Alternatively, you can specify "empirical" to use the empirical class probabilities or "uniform" to assume equal class probabilities. |
| 'Cost' | A KxK numeric matrix containing the misclassification costs, where K is the number of unique categories in Y. Cost(i,j) is the cost of classifying a point into class j if its true class is i. By default, the cost is 0 for a correct classification and 1 for an incorrect one. The cost matrix can be altered after fitting by using obj.Cost = costMatrix. |
| 'DiscrimType' | A character vector or string scalar specifying the type of discriminant analysis to perform. The only supported value is 'linear'. |
| 'FillCoeffs' | A character vector or string scalar with values 'on' or 'off' specifying whether to fill the coefficients after fitting. If set to 'on', the coefficients are computed during model fitting and stored in Coeffs. |
| 'Gamma' | A numeric scalar specifying the regularization parameter for the covariance matrix. It adjusts the linear discriminant analysis to make the model more stable in the presence of multicollinearity or small sample sizes. A value of 0 corresponds to no regularization, while a value of 1 corresponds to a completely regularized model. |
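One common way to regularize a covariance matrix with a parameter between 0 and 1 is to shrink it toward its own diagonal. The sketch below assumes that form. It is an illustration of the idea rather than a statement of what the package computes internally, and the matrix and the value g are made up.

```octave
## Made-up, nearly singular covariance matrix (strongly correlated predictors).
Sigma = [4 1.9; 1.9 1];

## Regularization strength in [0, 1]; named g to avoid shadowing gamma ().
g = 0.4;

## Shrink toward the diagonal: g = 0 leaves Sigma unchanged,
## g = 1 keeps only the variances and drops all covariances.
Sigma_reg = (1 - g) * Sigma + g * diag (diag (Sigma));

## The condition number drops, which makes inversion more stable.
cond_before = cond (Sigma);
cond_after  = cond (Sigma_reg);
```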
See also: fitcdiscr
ClassificationDiscriminant: label = predict (obj, XC)
ClassificationDiscriminant: [label, score, cost] = predict (obj, XC)
label = predict (obj, XC) returns the vector of
labels predicted for the corresponding instances in XC, using the
ClassificationDiscriminant model, obj, which was trained on the
predictor data in obj.X and the corresponding labels in obj.Y.
obj must be a ClassificationDiscriminant class object.
[label, score, cost] = predict (obj,
XC) also returns score, which contains the predicted class
scores or posterior probabilities for each instance of the corresponding
unique classes, and cost, which is a matrix containing the expected
cost of the classifications.
The score matrix contains the posterior probabilities for each class, calculated using the multivariate normal probability density function and the prior probabilities of each class. These scores are normalized to ensure they sum to 1 for each observation.
The cost matrix contains the expected classification cost for each class, computed based on the posterior probabilities and the specified misclassification costs.
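The two paragraphs above can be sketched in plain Octave for a single observation. The class means, shared covariance, priors, and query point are made-up values. Because the covariance is shared, the normalizing constant of the normal density cancels and is omitted. This is an illustration of the computation described, not the package's code.

```octave
## Made-up model: K = 2 classes, P = 2 predictors, shared covariance.
mu    = [0 0; 3 3];          # K-by-P class means
Sigma = eye (2);             # within-class covariance
prior = [0.5 0.5];
Cost  = ones (2) - eye (2);  # default 0/1 misclassification cost
x     = [0.5 0.2];           # one new observation

## Unnormalized log posterior: log density (up to a constant) plus log prior.
K    = rows (mu);
logp = zeros (1, K);
for k = 1:K
  d       = x - mu(k,:);
  logp(k) = -0.5 * (d / Sigma) * d' + log (prior(k));
endfor

## Normalize so that the posterior probabilities sum to 1.
post = exp (logp - max (logp));
post = post / sum (post);

## Expected cost of predicting class j: sum over i of post(i) * Cost(i,j).
expcost = post * Cost;

## The predicted label is the class with the smallest expected cost.
[~, label] = min (expcost);
```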
See also: ClassificationDiscriminant, fitcdiscr
ClassificationDiscriminant: L = loss (obj, X, Y)
ClassificationDiscriminant: L = loss (…, name, value)
L = loss (obj, X, Y) computes the loss,
L, using the default loss function 'mincost'.
obj is a ClassificationDiscriminant object trained on
X and Y.
X must be a numeric matrix of input data where rows
correspond to observations and columns correspond to features or
variables.
Y is a matrix or cell array containing the class labels
of corresponding predictor data in X. Y must have the same
number of rows as X.
L = loss (…, name, value) allows
additional options specified by name-value pairs:
| Name | Value |
|---|---|
| "LossFun" | Specifies the loss function to use. Can be a function handle with four input arguments (C, S, W, Cost) which returns a scalar value, or one of: 'binodeviance', 'classifcost', 'classiferror', 'exponential', 'hinge', 'logit', 'mincost', 'quadratic'. |
| "Weights" | Specifies observation weights, must be a numeric vector of length equal to the number of rows in X. Default is ones (size (X, 1)). loss normalizes the weights so that observation weights in each class sum to the prior probability of that class. When you supply Weights, loss computes the weighted classification loss. |
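As a simplified illustration of a weighted classification loss, the sketch below computes a weighted misclassification rate from made-up true and predicted labels. It normalizes the weights to sum to 1 overall, which is a simplification of the per-class normalization described in the table, and it is not the package's implementation of any named loss function.

```octave
## Made-up true labels, predicted labels, and observation weights.
truth = [1; 1; 2; 2];
pred  = [1; 2; 2; 2];
w     = [1; 1; 1; 1];

## Simplification: normalize the weights so that they sum to 1 overall.
w = w / sum (w);

## Weighted fraction of misclassified observations.
L = sum (w .* (pred != truth));
```

Here one of four equally weighted observations is misclassified, so L is 0.25.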
See also: ClassificationDiscriminant
ClassificationDiscriminant: m = margin (obj, X, Y)
m = margin (obj, X, Y) returns
the classification margins for obj with data X and
classification Y. m is a numeric vector of length size (X,1).
obj is a ClassificationDiscriminant object trained on X
and Y.
X must be a numeric matrix of input data where rows
correspond to observations and columns correspond to features or
variables.
Y is a matrix or cell array containing the class labels
of corresponding predictor data in X. Y must have the same
number of rows as X.
The classification margin for each observation is the difference between the classification score for the true class and the maximal classification score for the false classes.
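The definition above can be sketched directly on a score matrix. The scores and true class indices below are made up, and the true classes are assumed to be given as column indices into the score matrix.

```octave
## Made-up scores: one row per observation, one column per class.
score = [0.7 0.2 0.1;
         0.1 0.3 0.6];

## Column index of the true class for each observation.
truth = [1; 2];

## Pick out the score of the true class in every row.
n      = rows (score);
idx    = sub2ind (size (score), (1:n)', truth);
s_true = score(idx);

## Mask the true class, then take the best score among the false classes.
s_false      = score;
s_false(idx) = -Inf;

## Margin: positive when the true class wins, negative when it loses.
m = s_true - max (s_false, [], 2);
```

The first observation is classified correctly and has a positive margin (about 0.5), while the second is misclassified and has a negative margin (about -0.3).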
See also: fitcdiscr, ClassificationDiscriminant
ClassificationDiscriminant: CVMdl = crossval (obj)
ClassificationDiscriminant: CVMdl = crossval (…, Name, Value)
CVMdl = crossval (obj) returns a cross-validated model
object, CVMdl, from a trained model, obj, using 10-fold
cross-validation by default.
CVMdl = crossval (obj, name, value)
specifies additional name-value pair arguments to customize the
cross-validation process.
| Name | Value |
|---|---|
| "KFold" | Specify the number of folds to use in k-fold cross-validation. "KFold", k, where k is an integer greater than 1. |
| "Holdout" | Specify the fraction of the data to hold out for testing. "Holdout", p, where p is a scalar in the range (0, 1). |
| "Leaveout" | Specify whether to perform leave-one-out cross-validation. "Leaveout", Value, where Value is 'on' or 'off'. |
| "CVPartition" | Specify a cvpartition object used for cross-validation. "CVPartition", cv, where cv satisfies isa (cv, "cvpartition"). |
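To illustrate what a k-fold split looks like, the sketch below assigns each observation to one of k folds at random using only core Octave functions. It is a stand-in for the idea, not the cvpartition implementation, and the values of n and k are chosen to mirror the fisheriris example with the default of 10 folds.

```octave
## Illustration values: 150 observations, 10 folds.
n = 150;
k = 10;

## Random fold assignment: a shuffled 1..n reduced modulo k gives fold
## numbers 1..k, each used equally often when k divides n.
folds = mod (randperm (n), k) + 1;

## Number of observations per fold.
counts = accumarray (folds(:), 1);

## Training and test indices for the first fold.
test_idx  = find (folds == 1);
train_idx = find (folds != 1);
```

Each fold serves once as the test set while the remaining k - 1 folds are used for training.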
See also: fitcdiscr, ClassificationDiscriminant, cvpartition, ClassificationPartitionedModel
ClassificationDiscriminant: CVMdl = compact (obj)
CVMdl = compact (obj) creates a compact version of the
ClassificationDiscriminant object, obj.
See also: fitcdiscr, ClassificationDiscriminant, CompactClassificationDiscriminant
ClassificationDiscriminant: savemodel (obj, filename)
savemodel (obj, filename) saves each property of a
ClassificationDiscriminant object into an Octave binary file, the name of
which is specified in filename, along with an extra variable, which
defines the type of classification object these variables constitute. Use
loadmodel in order to load a classification object into Octave’s
workspace.
See also: loadmodel, fitcdiscr, ClassificationDiscriminant
load fisheriris
x = meas;
y = species;
xc = [min(x); mean(x); max(x)];
obj = fitcdiscr (x, y);
[label, score, cost] = predict (obj, xc);
load fisheriris
model = fitcdiscr (meas, species);
X = mean (meas);
Y = {'versicolor'};
## Compute loss for discriminant model
L = loss (model, X, Y)
L = 0
load fisheriris
mdl = fitcdiscr (meas, species);
X = mean (meas);
Y = {'versicolor'};
## Margin for discriminant model
m = margin (mdl, X, Y)
m = 1.0000
load fisheriris
x = meas;
y = species;
obj = fitcdiscr (x, y, "gamma", 0.4);
## Cross-validation for discriminant model
CVMdl = crossval (obj)
CVMdl =
ClassificationPartitionedModel object with properties:
BinEdges: []
CategoricalPredictors: []
X: [5.1000, 3.5000, 1.4000, 0.2000; 4.9000, 3, 1.4000, 0.2000; 4.7000, 3.2000, ...]
Y: [150x1 cell]
ClassNames: [3x1 cell]
Cost: [0, 1, 1; 1, 0, 1; 1, 1, 0]
CrossValidatedModel: 'ClassificationDiscriminant'
KFold: 10
ModelParameters: [1x1 struct]
NumObservations: 150
Partition: [1x1 cvpartition]
PredictorNames: [1x4 cell]
Prior: [0.3333; 0.3333; 0.3333]
ResponseName: 'Y'
ScoreTransform: 'none'
Standardize: []
Trained: [10x1 cell]