Statistics: ClassificationGAM

Class Definition: `ClassificationGAM`

statistics: obj = ClassificationGAM (X, Y)
statistics: obj = ClassificationGAM (…, name, value)

Create a ClassificationGAM class object containing a generalized additive classification model.

The ClassificationGAM class implements a gradient boosting algorithm for classification, using spline fitting as the weak learner. This approach allows the model to capture non-linear relationships between predictors and the binary response variable.

obj = ClassificationGAM (X, Y) returns a ClassificationGAM object, with X as the predictor data and Y containing the class labels of observations in X.

X must be an $N×P$ numeric matrix of predictor data where rows correspond to observations and columns correspond to features or variables.
Y must be an $N×1$ numeric vector containing binary class labels, typically 0 or 1.

obj = ClassificationGAM (…, name, value) returns a ClassificationGAM object with parameters specified by Name-Value pair arguments. Type help fitcgam for more info.
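A minimal usage sketch of the two calling forms, assuming the statistics package is installed and can be loaded with `pkg load statistics`. The toy data are the ones used in Example 1 on this page; the variable names `obj` and `obj2` are illustrative only.

```octave
pkg load statistics

## Toy data: 10 observations, 2 predictors, binary labels
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Form 1: predictor data and class labels only (all defaults)
obj = ClassificationGAM (X, Y);

## Form 2: same data plus a Name-Value pair
obj2 = ClassificationGAM (X, Y, "Interactions", "all");

## Inspect a few of the stored fields
disp (obj.NumObservations)
disp (obj.NumPredictors)
```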

A ClassificationGAM object, obj, stores the labeled training data and various parameters for the Generalized Additive Model (GAM) for classification, which can be accessed in the following fields:

`Field`		`Description`
`X`		Predictor data, specified as a numeric matrix. Each column of `X` represents one predictor (variable), and each row represents one observation.
`Y`		Class labels, specified as a numeric vector of 0s and 1s. Each value in `Y` is the observed class label for the corresponding row in `X`.
`BaseModel`		A structure containing the parameters of the base model without any interaction terms. The base model represents the generalized additive model (GAM) with only the main effects (predictor terms) included.
`ModelwInt`		A structure containing the parameters of the model that includes interaction terms. This model extends the base model by adding interaction terms between predictors, as specified by the `Interactions` property.
`IntMatrix`		A logical matrix or a matrix of column indices that describes the interaction terms applied to the predictor data.
`NumObservations`		Number of observations used in training the ClassificationGAM model, specified as a positive integer scalar. This number can be less than the number of rows in the training data because rows containing `NaN` values are not part of the fit.
`RowsUsed`		Rows of the original training data used in fitting the ClassificationGAM model, specified as a numerical vector. If you want to use this vector for indexing the training data in `X`, you have to convert it to a logical vector, i.e. `X = obj.X(logical (obj.RowsUsed), :);`
`NumPredictors`		The number of predictors (variables) in `X`.
`PredictorNames`		Predictor variable names, specified as a cell array of character vectors. The variable names are in the same order in which they appear in the training data `X`.
`ResponseName`		Response variable name, specified as a character vector.
`ClassNames`		Names of the classes in the training data `Y` with duplicates removed, specified as a cell array of character vectors.
`Cost`		Cost of the misclassification of a point, specified as a square matrix. `Cost(i,j)` is the cost of classifying a point into class `j` if its true class is `i` (that is, the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns in `Cost` corresponds to the order of the classes in `ClassNames`. The number of rows and columns in `Cost` is the number of unique classes in the response. By default, `Cost(i,j) = 1` if `i != j`, and `Cost(i,j) = 0` if `i = j`. In other words, the cost is 0 for correct classification and 1 for incorrect classification.
`Formula`		A model specification given as a string in the form `"Y ~ terms"`, where `Y` represents the response variable and `terms` the predictor variables. The formula can be used to specify a subset of variables for training the model. For example, `"Y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x3"` specifies four linear terms for the first four columns of the predictor data, while `x1:x2` and `x2:x3` specify two interaction terms, between the 1st and 2nd columns and between the 2nd and 3rd columns respectively. Only these terms will be used for training the model, but `X` must have at least as many columns as referenced in the formula. If predictor variable names have been defined, then the terms in the formula must refer to those names. When `"formula"` is specified, all terms used for training the model are referenced in the `IntMatrix` field of the `obj` class object as a matrix containing the column indices for each term, including both the predictors and the interactions used.
`Interactions`		A logical matrix, a positive integer scalar, or the string `"all"` for defining the interactions between predictor variables. When given a logical matrix, it must have the same number of columns as `X` and each row corresponds to a different interaction term combining the predictors indexed as `true`. Each interaction term is appended as a column vector after the available predictor columns in `X`. When `"all"` is specified, all possible combinations of interactions are appended to `X` before training. At the moment, passing a positive integer has the same effect as the `"all"` option. When `"interactions"` is specified, only the interaction terms appended to `X` are referenced in the `IntMatrix` field of the `obj` class object.
`Knots`		A scalar or a row vector with the same number of columns as `X`. It defines the knots for fitting a polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 5, hence expanded to `ones (1, columns (X)) * 5`. You can pass a row vector with a different number of knots for each predictor variable, although this is not recommended.
`Order`		A scalar or a row vector with the same number of columns as `X`. It defines the order of the polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 3, hence expanded to `ones (1, columns (X)) * 3`. You can pass a row vector with a different polynomial order for each predictor variable, although this is not recommended.
`DoF`		A scalar or a row vector with the same number of columns as `X`. It defines the degrees of freedom for fitting a polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 8, hence expanded to `ones (1, columns (X)) * 8`. You can pass a row vector with different degrees of freedom for each predictor variable, although this is not recommended.

You can pass either a "Formula" or an "Interactions" optional parameter. Passing both parameters will result in an error. Similarly, you can only pass up to two parameters among "Knots", "Order", and "DoF" to define the required polynomial for training the GAM model.
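The default values described in the table can be reproduced in plain Octave, without the statistics package. The sketch below only illustrates the documented scalar expansion and the default `Cost` matrix; the sizes `P` and `K` are hypothetical values chosen for the example. The defaults happen to satisfy DoF = Knots + Order (5 + 3 = 8), which may be why at most two of the three can be supplied, but that reading is an inference from the defaults rather than something stated here.

```octave
## Hypothetical sizes, chosen only for illustration
P = 4;    # number of predictor columns in X
K = 2;    # number of classes (binary response)

## Scalar defaults expanded to one value per predictor column
Knots = ones (1, P) * 5;
Order = ones (1, P) * 3;
DoF   = ones (1, P) * 8;

## Default misclassification cost:
## 0 for correct classification, 1 for incorrect classification
Cost = ones (K) - eye (K);

disp (Knots)
disp (Cost)
```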

Source Code: ClassificationGAM

Method: `compact`

ClassificationGAM: CVMdl = compact (obj)

Create a CompactClassificationGAM object.

CVMdl = compact (obj) creates a compact version of the ClassificationGAM object, obj.
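A short sketch of the call, assuming the statistics package is loaded and using the toy data from Example 1 on this page. The returned object is named `CMdl` here purely for illustration.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Train the full model, then create its compact version
obj  = fitcgam (X, Y);
CMdl = compact (obj);

## Show the class of the returned object
disp (class (CMdl))
```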

Method: `crossval`

ClassificationGAM: CVMdl = crossval (obj)
ClassificationGAM: CVMdl = crossval (…, Name, Value)

Cross-validate a Generalized Additive Model classification object.

CVMdl = crossval (obj) returns a cross-validated model object, CVMdl, from a trained model, obj, using 10-fold cross-validation by default.

CVMdl = crossval (obj, name, value) specifies additional name-value pair arguments to customize the cross-validation process.

`Name`		`Value`
`"KFold"`		Specify the number of folds to use in k-fold cross-validation. `"KFold", k`, where `k` is an integer greater than 1.
`"Holdout"`		Specify the fraction of the data to hold out for testing. `"Holdout", p`, where `p` is a scalar in the range $(0,1)$ .
`"Leaveout"`		Specify whether to perform leave-one-out cross-validation. `"Leaveout", Value`, where `Value` is `"on"` or `"off"`.
`"CVPartition"`		Specify a `cvpartition` object used for cross-validation. `"CVPartition", cv`, where `isa (cv, "cvpartition")` = 1.
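The options above might be used as in the following sketch, which assumes the statistics package is loaded. The synthetic data set (40 observations, two predictors) is made up for illustration and is not taken from the package documentation.

```octave
pkg load statistics

## Synthetic, deterministic data: 40 observations, 2 predictors
x1 = (1:40)';
x2 = x1 + 3 * sin (x1);
X  = [x1, x2];
Y  = double (x1 > 20);

## Train the model on the full data
obj = fitcgam (X, Y);

## 10-fold cross-validation (the documented default)
CVMdl = crossval (obj);

## 5-fold cross-validation via the "KFold" option
CVMdl5 = crossval (obj, "KFold", 5);

disp (class (CVMdl5))
```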

See also: fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel

Method: `predict`

ClassificationGAM: label = predict (obj, XC)
ClassificationGAM: label = predict (…, 'IncludeInteractions', includeInteractions)
ClassificationGAM: [label, score] = predict (…)

Predict labels for new data using the Generalized Additive Model (GAM) stored in a ClassificationGAM object.

label = predict (obj, XC) returns the predicted labels for the data in XC based on the model stored in the ClassificationGAM object, obj.

label = predict (obj, XC, 'IncludeInteractions', includeInteractions) allows you to specify whether interaction terms should be included when making predictions.

[label, score] = predict (…) also returns score, which contains the predicted class scores or posterior probabilities for each observation.

obj must be a ClassificationGAM class object.
XC must be an $M×P$ numeric matrix where each row is an observation and each column corresponds to a predictor variable.
includeInteractions must be a logical value, true or false, indicating whether to include interaction terms in the predictions.
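A hedged sketch of the three calling forms, assuming the statistics package is loaded. The training data come from Example 1 on this page; the two query points in `XC` are hypothetical.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Train with interaction terms so that both models are available
obj = fitcgam (X, Y, "Interactions", "all");

## New observations: M = 2 rows, P = 2 columns (same P as X)
XC = [2, 2.5; 8, 8.5];

## Labels only
label = predict (obj, XC);

## Labels and scores
[label, score] = predict (obj, XC);

## Predict without the interaction terms
label_base = predict (obj, XC, "IncludeInteractions", false);
```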

See also: fitcgam, ClassificationGAM

Method: `savemodel`

ClassificationGAM: savemodel (obj, filename)

Save a ClassificationGAM object.

savemodel (obj, filename) saves a ClassificationGAM object into a file defined by filename.
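A minimal save-and-restore sketch, assuming the statistics package is loaded and that `loadmodel` (listed under See also) reads back a file written by `savemodel`. The temporary file name is generated only for this example.

```octave
pkg load statistics

X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

obj = fitcgam (X, Y);

## Save the trained model to a temporary file
filename = [tempname(), ".mat"];
savemodel (obj, filename);

## Restore it and remove the temporary file
restored = loadmodel (filename);
delete (filename);

disp (class (restored))
```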

See also: loadmodel, fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel

Example: 1


 ## Train a GAM classifier for binary classification
 ## using specific data and predict over a grid of values.

 ## Define specific data
 X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
 Y = [0; 0; 0; 0; 0; ...
     1; 1; 1; 1; 1];

 ## Train the GAM model
 obj = fitcgam (X, Y, "Interactions", "all")

 ## Create a grid of values for prediction
 x1 = [min(X(:,1)):0.1:max(X(:,1))];
 x2 = [min(X(:,2)):0.1:max(X(:,2))];
 [x1G, x2G] = meshgrid (x1, x2);
 XGrid = [x1G(:), x2G(:)];
 [labels, score] = predict (obj, XGrid);

obj =

  ClassificationGAM object with properties:

            BaseModel: [1x1 struct]
           ClassNames: [2x1 cell]
                 Cost: [2x2 double]
                  DoF: [1x2 double]
              Formula: [0x0 double]
            IntMatrix: [1x1 double]
         Interactions: all
                Knots: [1x2 double]
         LearningRate: [1x1 double]
            ModelwInt: [1x1 struct]
        NumIterations: [1x1 double]
      NumObservations: [1x1 double]
        NumPredictors: [1x1 double]
                Order: [1x2 double]
       PredictorNames: [1x2 cell]
                Prior: [0x0 double]
         ResponseName: Y
             RowsUsed: [10x1 double]
       ScoreTransform: none
                    X: [10x2 double]
                    Y: [10x1 double]
