ClassificationGAM
Create a ClassificationGAM
class object containing a generalized
additive classification model.
The ClassificationGAM
class implements a gradient boosting algorithm
for classification, using spline fitting as the weak learner. This approach
allows the model to capture non-linear relationships between predictors and
the binary response variable.
obj = ClassificationGAM (X, Y)
returns a
ClassificationGAM object, with X as the predictor data and Y
containing the class labels of observations in X.
X
must be a numeric matrix of predictor data where rows
correspond to observations and columns correspond to features or variables.
Y
is a numeric vector containing binary class labels,
typically 0 or 1.
obj = ClassificationGAM (…, name, value)
returns a ClassificationGAM object with parameters specified by
Name-Value
pair arguments. Type help fitcgam
for more info.
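As a minimal sketch of the basic call form, the following assumes the statistics package is installed and can be loaded with `pkg load statistics`. It reuses the small ten-observation data set from the example at the end of this page and only inspects fields that are documented below.

```octave
pkg load statistics

## Ten observations, two predictors, binary class labels
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Basic form: predictor data and class labels only
obj = ClassificationGAM (X, Y);

## Inspect a few of the stored fields
disp (class (obj))
disp (obj.NumObservations)
disp (obj.NumPredictors)
disp (obj.ClassNames)
```

The constructor is called directly here; `fitcgam` (used in the example at the end of this page) accepts the same data arguments.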
A ClassificationGAM
object, obj, stores the labeled training
data and various parameters for the Generalized Additive Model (GAM) for
classification, which can be accessed in the following fields:
Field | Description | |
---|---|---|
X | Predictor data, specified as a numeric matrix. Each column of X represents one predictor (variable), and each row represents one observation. | |
Y | Class labels, specified as a numeric vector of 0’s and 1’s. Each value in Y is the observed class label for the corresponding row in X. | |
BaseModel | A structure containing the parameters of the base model without any interaction terms. The base model represents the generalized additive model (GAM) with only the main effects (predictor terms) included. | |
ModelwInt | A structure containing the parameters of the model that includes interaction terms. This model extends the base model by adding interaction terms between predictors, as specified by the Interactions property. | |
IntMatrix | A logical matrix or a matrix of column indices that describes the interaction terms applied to the predictor data. | |
NumObservations | Number of observations used in training the ClassificationGAM model, specified as a positive integer scalar. This number can be less than the number of rows in the training data because rows containing NaN values are not part of the fit. | |
RowsUsed | Rows of the original training data used in fitting the ClassificationGAM model, specified as a numerical vector. To use this vector for indexing the training data in X, first convert it to a logical vector, i.e. X = obj.X(logical (obj.RowsUsed), :); | |
NumPredictors | The number of predictors (variables) in X. | |
PredictorNames | Predictor variable names, specified as a cell array of character vectors. The variable names are in the same order in which they appear in the training data X. | |
ResponseName | Response variable name, specified as a character vector. | |
ClassNames | Names of the classes in the training data Y with duplicates removed, specified as a cell array of character vectors. | |
Cost | Cost of the misclassification of a point, specified as a square matrix. Cost(i,j) is the cost of classifying a point into class j if its true class is i (that is, the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns in Cost corresponds to the order of the classes in ClassNames. The number of rows and columns in Cost is the number of unique classes in the response. By default, Cost(i,j) = 1 if i != j, and Cost(i,j) = 0 if i = j. In other words, the cost is 0 for correct classification and 1 for incorrect classification. | |
Formula | A model specification given as a string in the form "Y ~ terms", where Y represents the response variable and terms the predictor variables. The formula can be used to specify a subset of variables for training the model. For example, "Y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x3" specifies four linear terms for the first four columns of the predictor data, while x1:x2 and x2:x3 specify two interaction terms, between the 1st and 2nd columns and between the 2nd and 3rd columns, respectively. Only these terms are used for training the model, but X must have at least as many columns as are referenced in the formula. If predictor variable names have been defined, the terms in the formula must refer to those names. When "formula" is specified, all terms used for training the model are referenced in the IntMatrix field of the obj class object as a matrix containing the column indices for each term, including both the predictors and the interactions used. | |
Interactions | A logical matrix, a positive integer scalar, or the string "all" for defining the interactions between predictor variables. When given as a logical matrix, it must have the same number of columns as X, and each row corresponds to a different interaction term combining the predictors indexed as true. Each interaction term is appended as a column vector after the available predictor columns in X. When "all" is specified, all possible combinations of interactions are appended to X before training. At the moment, passing a positive integer has the same effect as the "all" option. When "interactions" is specified, only the interaction terms appended to X are referenced in the IntMatrix field of the obj class object. | |
Knots | A scalar or a row vector with the same number of columns as X. It defines the knots for fitting a polynomial when training the GAM. A scalar is expanded to a row vector. The default value is 5, which is expanded to ones (1, columns (X)) * 5. You can pass a row vector with a different number of knots for each predictor variable, although this is not recommended. | |
Order | A scalar or a row vector with the same number of columns as X. It defines the order of the polynomial when training the GAM. A scalar is expanded to a row vector. The default value is 3, which is expanded to ones (1, columns (X)) * 3. You can pass a row vector with a different polynomial order for each predictor variable, although this is not recommended. | |
DoF | A scalar or a row vector with the same number of columns as X. It defines the degrees of freedom for fitting a polynomial when training the GAM. A scalar is expanded to a row vector. The default value is 8, which is expanded to ones (1, columns (X)) * 8. You can pass a row vector with different degrees of freedom for each predictor variable, although this is not recommended. |
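Several of the defaults described in the table can be reproduced with plain Octave expressions. The sketch below does not call the class at all; the variable names `K` and `Xused` and the sample `RowsUsed` values are made up for illustration.

```octave
## Default misclassification cost for K classes:
## 0 on the diagonal (correct), 1 everywhere else (incorrect).
K = 2;
Cost = ones (K) - eye (K);
disp (Cost)

## Scalar expansion of Knots, Order, and DoF over the predictor columns
X = rand (10, 3);
Knots = ones (1, columns (X)) * 5;
Order = ones (1, columns (X)) * 3;
DoF   = ones (1, columns (X)) * 8;
disp ([Knots; Order; DoF])

## A numeric 0/1 vector must be converted to logical before it can be
## used as a row mask (hypothetical values, two rows excluded).
RowsUsed = [1; 1; 0; 1; 1; 1; 0; 1; 1; 1];
Xused = X(logical (RowsUsed), :);
disp (size (Xused))
```

Indexing with the numeric vector directly would fail because 0 is not a valid subscript, which is why the conversion with `logical` is needed.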
You can pass either a "Formula" or an "Interactions" optional parameter, but not both; passing both parameters results in an error.
Similarly, you can pass at most two of the parameters "Knots", "Order", and "DoF" to define the required polynomial for training the GAM model.
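A hedged sketch of passing two of the three polynomial parameters as name-value pairs is shown below. It assumes the statistics package is installed; the values 4 and 3 are arbitrary illustrations, not recommendations.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Pass "Knots" and "Order" only; "DoF" is left for the class to set.
## One entry per predictor column of X.
obj = ClassificationGAM (X, Y, "Knots", [4, 4], "Order", [3, 3]);

disp (obj.Knots)
disp (obj.Order)
disp (obj.DoF)
```

Passing all three of "Knots", "Order", and "DoF" in one call is not allowed, as noted above.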
See also: fitcgam, CompactClassificationGAM, ClassificationPartitionedModel
Source Code: ClassificationGAM
compact
Create a CompactClassificationGAM object.
CVMdl = compact (obj)
creates a compact version of the
ClassificationGAM object, obj.
See also: fitcdiscr, ClassificationGAM, CompactClassificationGAM
crossval
Cross-validate a Generalized Additive Model classification object.
CVMdl = crossval (obj)
returns a cross-validated model
object, CVMdl, from a trained model, obj, using 10-fold
cross-validation by default.
CVMdl = crossval (obj, name, value)
specifies additional name-value pair arguments to customize the
cross-validation process.
Name | Value | |
---|---|---|
"KFold" | Specify the number of folds to use in
k-fold cross-validation. "KFold", k , where k is an
integer greater than 1. | |
"Holdout" | Specify the fraction of the data to
hold out for testing. "Holdout", p , where p is a
scalar in the range . | |
"Leaveout" | Specify whether to perform
leave-one-out cross-validation. "Leaveout", Value , where
Value is ’on’ or ’off’. | |
"CVPartition" | Specify a cvpartition
object used for cross-validation. "CVPartition", cv , where
isa (cv, "cvpartition") = 1. |
See also: fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel
predict
Predict labels for new data using the Generalized Additive Model (GAM) stored in a ClassificationGAM object.
label = predict (obj, XC)
returns the predicted
labels for the data in XC based on the model stored in the
ClassificationGAM object, obj.
label = predict (obj, XC, 'IncludeInteractions',
includeInteractions)
allows you to specify whether interaction
terms should be included when making predictions.
[label, score] = predict (…)
also returns
score, which contains the predicted class scores or posterior
probabilities for each observation.
obj must be a ClassificationGAM
class object.
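A minimal sketch of the call forms above follows. It assumes the statistics package is installed, and it assumes that the 'IncludeInteractions' value is given as a logical (`true` or `false`), which this page does not state explicitly. The query points in `XC` are made up.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Train with interaction terms so that both models are available
obj = fitcgam (X, Y, "Interactions", "all");

## New observations: same number of columns as X
XC = [2, 2; 5, 6; 9, 9];

## Labels and scores with the default settings
[label, score] = predict (obj, XC);

## Assumption: a logical flag selects whether interaction terms are used
label_base = predict (obj, XC, "IncludeInteractions", false);
label_int  = predict (obj, XC, "IncludeInteractions", true);

disp (label)
disp (score)
```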
See also: fitcgam, ClassificationGAM
savemodel
Save a ClassificationGAM object.
savemodel (obj, filename)
saves a ClassificationGAM
object into a file defined by filename.
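As a hedged sketch, a save and reload round trip could look like the following. It assumes the statistics package is installed and that `loadmodel` (listed under "See also") takes the file name and returns the stored object. The temporary file name is generated only for this illustration.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

obj = fitcgam (X, Y);

## Save the trained model to a temporary file
fname = [tempname(), ".mat"];
savemodel (obj, fname);

## Assumption: loadmodel restores the object from the same file
obj2 = loadmodel (fname);
disp (class (obj2))

## Clean up the temporary file
delete (fname);
```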
See also: loadmodel, fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel
```octave
## Train a GAM classifier for binary classification
## using specific data and plot the decision boundaries.

## Define specific data
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; ...
     1; 1; 1; 1; 1];

## Train the GAM model
obj = fitcgam (X, Y, "Interactions", "all")

## Create a grid of values for prediction
x1 = [min(X(:,1)):0.1:max(X(:,1))];
x2 = [min(X(:,2)):0.1:max(X(:,2))];
[x1G, x2G] = meshgrid (x1, x2);
XGrid = [x1G(:), x2G(:)];
[labels, score] = predict (obj, XGrid);
```

```
obj =

  ClassificationGAM object with properties:

          BaseModel: [1x1 struct]
         ClassNames: [2x1 cell]
               Cost: [2x2 double]
                DoF: [1x2 double]
            Formula: [0x0 double]
          IntMatrix: [1x1 double]
       Interactions: all
              Knots: [1x2 double]
       LearningRate: [1x1 double]
          ModelwInt: [1x1 struct]
      NumIterations: [1x1 double]
    NumObservations: [1x1 double]
      NumPredictors: [1x1 double]
              Order: [1x2 double]
     PredictorNames: [1x2 cell]
              Prior: [0x0 double]
       ResponseName: Y
           RowsUsed: [10x1 double]
     ScoreTransform: none
                  X: [10x2 double]
                  Y: [10x1 double]
```