Class Definition: ClassificationGAM

statistics: obj = ClassificationGAM (X, Y)
statistics: obj = ClassificationGAM (…, name, value)

Create a ClassificationGAM class object containing a generalized additive classification model.

The ClassificationGAM class implements a gradient boosting algorithm for classification, using spline fitting as the weak learner. This approach allows the model to capture non-linear relationships between predictors and the binary response variable.

obj = ClassificationGAM (X, Y) returns a ClassificationGAM object, with X as the predictor data and Y containing the class labels of observations in X.

  • X must be an N×P numeric matrix of predictor data where rows correspond to observations and columns correspond to features or variables.
  • Y must be an N×1 numeric vector containing binary class labels, typically 0 or 1.

obj = ClassificationGAM (…, name, value) returns a ClassificationGAM object with parameters specified by Name-Value pair arguments. Type help fitcgam for more info.

A ClassificationGAM object, obj, stores the labeled training data and various parameters for the Generalized Additive Model (GAM) for classification, which can be accessed in the following fields:

| Field | Description |
| --- | --- |
| X | Predictor data, specified as a numeric matrix. Each column of X represents one predictor (variable), and each row represents one observation. |
| Y | Class labels, specified as a numeric vector of 0s and 1s. Each value in Y is the observed class label for the corresponding row in X. |
| BaseModel | A structure containing the parameters of the base model without any interaction terms. The base model represents the generalized additive model (GAM) with only the main effects (predictor terms) included. |
| ModelwInt | A structure containing the parameters of the model that includes interaction terms. This model extends the base model by adding interaction terms between predictors, as specified by the Interactions property. |
| IntMatrix | A logical matrix or a matrix of column indices that describes the interaction terms applied to the predictor data. |
| NumObservations | Number of observations used in training the ClassificationGAM model, specified as a positive integer scalar. This number can be less than the number of rows in the training data because rows containing NaN values are not part of the fit. |
| RowsUsed | Rows of the original training data used in fitting the ClassificationGAM model, specified as a numerical vector. To use this vector for indexing the training data in X, first convert it to a logical vector, i.e. X = obj.X(logical (obj.RowsUsed), :); |
| NumPredictors | The number of predictors (variables) in X. |
| PredictorNames | Predictor variable names, specified as a cell array of character vectors. The variable names are in the same order in which they appear in the training data X. |
| ResponseName | Response variable name, specified as a character vector. |
| ClassNames | Names of the classes in the training data Y with duplicates removed, specified as a cell array of character vectors. |
| Cost | Cost of the misclassification of a point, specified as a square matrix. Cost(i,j) is the cost of classifying a point into class j if its true class is i (that is, the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns in Cost corresponds to the order of the classes in ClassNames. The number of rows and columns in Cost is the number of unique classes in the response. By default, Cost(i,j) = 1 if i != j, and Cost(i,j) = 0 if i = j. In other words, the cost is 0 for correct classification and 1 for incorrect classification. |
| Formula | A model specification given as a string in the form "Y ~ terms", where Y represents the response variable and terms the predictor variables. The formula can be used to specify a subset of variables for training the model. For example, "Y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x3" specifies four linear terms for the first four columns of the predictor data, and x1:x2 and x2:x3 specify two interaction terms for the 1st-2nd and 2nd-3rd columns, respectively. Only these terms are used for training the model, but X must have at least as many columns as referenced in the formula. If predictor variable names have been defined, the terms in the formula must reference them. When "Formula" is specified, all terms used for training the model are referenced in the IntMatrix field of the obj class object as a matrix containing the column indices for each term, including both the predictors and the interactions used. |
| Interactions | A logical matrix, a positive integer scalar, or the string "all" for defining the interactions between predictor variables. When given a logical matrix, it must have the same number of columns as X, and each row corresponds to a different interaction term combining the predictors indexed as true. Each interaction term is appended as a column vector after the available predictor columns in X. When "all" is specified, all possible combinations of interactions are appended to X before training. At the moment, passing a positive integer has the same effect as the "all" option. When "Interactions" is specified, only the interaction terms appended to X are referenced in the IntMatrix field of the obj class object. |
| Knots | A scalar or a row vector with the same number of columns as X. It defines the knots for fitting a polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 5, hence expanded to ones (1, columns (X)) * 5. You can pass a row vector with a different number of knots for each predictor variable, although this is not recommended. |
| Order | A scalar or a row vector with the same number of columns as X. It defines the order of the polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 3, hence expanded to ones (1, columns (X)) * 3. You can pass a row vector with a different polynomial order for each predictor variable, although this is not recommended. |
| DoF | A scalar or a row vector with the same number of columns as X. It defines the degrees of freedom for fitting a polynomial when training the GAM. As a scalar, it is expanded to a row vector. The default value is 8, hence expanded to ones (1, columns (X)) * 8. You can pass a row vector with different degrees of freedom for each predictor variable, although this is not recommended. |
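
The RowsUsed, NumObservations, and Cost descriptions can be illustrated without fitting a model. The following is a minimal sketch in plain Octave. The NaN filter is an assumption about how rows are excluded (the text only states that rows containing NaN values are not part of the fit), and all variable names are illustrative.

```octave
## Illustrative training data: the second row has a missing predictor value.
X = [1, 2; NaN, 3; 3, 3; 4, 5];
Y = [0; 0; 1; 1];

## Assumed filter: keep rows with no NaN in either X or Y.
## RowsUsed is documented as a numerical vector, hence the cast to double.
RowsUsed = double (! any (isnan ([X, Y]), 2));
NumObservations = sum (RowsUsed);

## A numeric 0/1 vector must be converted to logical before indexing,
## otherwise Octave treats the values as subscripts and index 0 is an error.
Xused = X(logical (RowsUsed), :);

## Default misclassification cost for K classes:
## 0 on the diagonal (correct), 1 elsewhere (incorrect).
K = 2;
Cost = ones (K) - eye (K);
```

Here NumObservations is smaller than rows (X), which matches the note that the number of observations can be less than the number of rows in the training data.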

You can pass either a "Formula" or an "Interactions" optional parameter; passing both results in an error. Similarly, you can pass at most two of the "Knots", "Order", and "DoF" parameters to define the polynomial used for training the GAM.
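
The scalar expansion described for "Knots", "Order", and "DoF" can be sketched in plain Octave as follows. The predictor matrix is illustrative. The final check only observes that the documented defaults satisfy DoF = Knots + Order, which would be consistent with the limit of two parameters; that relation is not stated in this documentation and should be treated as an assumption.

```octave
## Illustrative predictor matrix with three columns.
X = rand (20, 3);

## Scalar settings are expanded to one value per predictor column.
Knots = ones (1, columns (X)) * 5;   # default number of knots
Order = ones (1, columns (X)) * 3;   # default polynomial order
DoF   = ones (1, columns (X)) * 8;   # default degrees of freedom

## Observation only (assumed relation, not a documented rule):
## the default values satisfy DoF = Knots + Order.
consistent = isequal (DoF, Knots + Order);
```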

See also: fitcgam, CompactClassificationGAM, ClassificationPartitionedModel

Source Code: ClassificationGAM

Method: compact

ClassificationGAM: CVMdl = compact (obj)

Create a CompactClassificationGAM object.

CVMdl = compact (obj) creates a compact version of the ClassificationGAM object, obj.

See also: fitcdiscr, ClassificationGAM, CompactClassificationGAM
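
A minimal usage sketch for compact, assuming the statistics package is installed and reusing the training data from Example 1. The variable name cobj is illustrative.

```octave
pkg load statistics

## Training data taken from Example 1 on this page.
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Train the full model, then create its compact version.
obj = fitcgam (X, Y, "Interactions", "all");
cobj = compact (obj);

## Display the class of the returned object.
class (cobj)
```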

Method: crossval

ClassificationGAM: CVMdl = crossval (obj)
ClassificationGAM: CVMdl = crossval (…, Name, Value)

Cross-validate a Generalized Additive Model classification object.

CVMdl = crossval (obj) returns a cross-validated model object, CVMdl, from a trained model, obj, using 10-fold cross-validation by default.

CVMdl = crossval (obj, name, value) specifies additional name-value pair arguments to customize the cross-validation process.

| Name | Value |
| --- | --- |
| "KFold" | Specify the number of folds to use in k-fold cross-validation. "KFold", k, where k is an integer greater than 1. |
| "Holdout" | Specify the fraction of the data to hold out for testing. "Holdout", p, where p is a scalar in the range (0,1). |
| "Leaveout" | Specify whether to perform leave-one-out cross-validation. "Leaveout", Value, where Value is 'on' or 'off'. |
| "CVPartition" | Specify a cvpartition object used for cross-validation. "CVPartition", cv, where isa (cv, "cvpartition") = 1. |

See also: fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel
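
A minimal usage sketch for crossval with two of the name-value pairs listed above, assuming the statistics package is installed. The data set is synthetic and chosen only so that each fold has enough observations; the variable names are illustrative.

```octave
pkg load statistics

## Deterministic two-predictor data set with a binary 0/1 response.
x1 = linspace (0, 10, 60)';
x2 = 5 + 3 * sin (x1);
X = [x1, x2];
Y = double (x1 > 5);

## Train the model to be cross-validated.
obj = fitcgam (X, Y);

## 5-fold cross-validation instead of the default 10 folds.
CVMdl = crossval (obj, "KFold", 5);

## Hold out 20% of the observations for testing.
CVMdlHoldout = crossval (obj, "Holdout", 0.2);

## Display the class of the returned object.
class (CVMdl)
```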

Method: predict

ClassificationGAM: label = predict (obj, XC)
ClassificationGAM: label = predict (…, 'IncludeInteractions', includeInteractions)
ClassificationGAM: [label, score] = predict (…)

Predict labels for new data using the Generalized Additive Model (GAM) stored in a ClassificationGAM object.

label = predict (obj, XC) returns the predicted labels for the data in XC based on the model stored in the ClassificationGAM object, obj.

label = predict (obj, XC, 'IncludeInteractions', includeInteractions) allows you to specify whether interaction terms should be included when making predictions.

[label, score] = predict (…) also returns score, which contains the predicted class scores or posterior probabilities for each observation.

  • obj must be a ClassificationGAM class object.
  • XC must be an M×P numeric matrix where each row is an observation and each column corresponds to a predictor variable.
  • includeInteractions is either true or false, indicating whether to include interaction terms in the predictions.

See also: fitcgam, ClassificationGAM
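
A minimal usage sketch for predict, assuming the statistics package is installed and reusing the training data from Example 1. The new observations in XC are illustrative, and 'IncludeInteractions' is set explicitly in both calls so that the sketch does not rely on its default.

```octave
pkg load statistics

## Training data taken from Example 1 on this page.
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

## Train with interaction terms so that both prediction modes are available.
obj = fitcgam (X, Y, "Interactions", "all");

## New observations: M = 3 rows, P = 2 columns (same P as the training data).
XC = [2, 2; 5.5, 6; 9, 9];

## Predict without and with the interaction terms.
[labelBase, scoreBase] = predict (obj, XC, "IncludeInteractions", false);
[labelInt, scoreInt] = predict (obj, XC, "IncludeInteractions", true);
```

Each output has one row per observation in XC.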

Method: savemodel

ClassificationGAM: savemodel (obj, filename)

Save a ClassificationGAM object.

savemodel (obj, filename) saves a ClassificationGAM object into a file defined by filename.

See also: loadmodel, fitcgam, ClassificationGAM, cvpartition, ClassificationPartitionedModel
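
A minimal round-trip sketch for savemodel, assuming the statistics package is installed. The temporary file name is hypothetical, and the call form of loadmodel (listed under "See also") is assumed to be obj = loadmodel (filename).

```octave
pkg load statistics

## Training data taken from Example 1 on this page.
X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
Y = [0; 0; 0; 0; 0; 1; 1; 1; 1; 1];

obj = fitcgam (X, Y);

## Hypothetical temporary file used only for this sketch.
filename = strcat (tempname (), ".mat");

## Save the trained model, read it back, and remove the file.
savemodel (obj, filename);
loaded = loadmodel (filename);   # assumed call form
delete (filename);
```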

Example: 1

 

 ## Train a GAM classifier for binary classification
 ## using specific data and predict over a grid of points.

 ## Define specific data
 X = [1, 2; 2, 3; 3, 3; 4, 5; 5, 5; ...
     6, 7; 7, 8; 8, 8; 9, 9; 10, 10];
 Y = [0; 0; 0; 0; 0; ...
     1; 1; 1; 1; 1];

 ## Train the GAM model
 obj = fitcgam (X, Y, "Interactions", "all")

 ## Create a grid of values for prediction
 x1 = [min(X(:,1)):0.1:max(X(:,1))];
 x2 = [min(X(:,2)):0.1:max(X(:,2))];
 [x1G, x2G] = meshgrid (x1, x2);
 XGrid = [x1G(:), x2G(:)];
 [labels, score] = predict (obj, XGrid);

obj =

  ClassificationGAM object with properties:

            BaseModel: [1x1 struct]
           ClassNames: [2x1 cell]
                 Cost: [2x2 double]
                  DoF: [1x2 double]
              Formula: [0x0 double]
            IntMatrix: [1x1 double]
         Interactions: all
                Knots: [1x2 double]
         LearningRate: [1x1 double]
            ModelwInt: [1x1 struct]
        NumIterations: [1x1 double]
      NumObservations: [1x1 double]
        NumPredictors: [1x1 double]
                Order: [1x2 double]
       PredictorNames: [1x2 cell]
                Prior: [0x0 double]
         ResponseName: Y
             RowsUsed: [10x1 double]
       ScoreTransform: none
                    X: [10x2 double]
                    Y: [10x1 double]