cvpartition
Partition data for cross-validation
The cvpartition class generates a partitioning scheme on a dataset to facilitate cross-validation of statistical models utilizing training and testing subsets of the dataset.
See also: crossval
Source Code: cvpartition
A logical scalar specifying whether the cvpartition object was created using custom partitioning (true) or not (false). This property is read-only.
A logical scalar specifying whether the cvpartition object was created using grouping variables (true) or not (false). This property is read-only.
A logical scalar specifying whether the cvpartition object was created with a 'stratifyOption' value of true. This property is read-only.
A positive integer scalar specifying the number of observations in the dataset (including any missing data, where applicable). This property is read-only.
A positive integer scalar specifying the number of folds for partition types 'kfold' and 'leaveout'. When the partition type is 'holdout' or 'resubstitution', NumTestSets is 1. This property is read-only.
A positive integer scalar specifying the size of the test set for partition types 'holdout' and 'resubstitution', or a vector of positive integers specifying the size of each test set for partition types 'kfold' and 'leaveout'. This property is read-only.
A positive integer scalar specifying the size of the training set for partition types 'holdout' and 'resubstitution', or a vector of positive integers specifying the size of each training set for partition types 'kfold' and 'leaveout'. This property is read-only.
A character vector specifying the type of the cvpartition object. It can be 'kfold', 'holdout', 'leaveout', or 'resubstitution'. This property is read-only.
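C = cvpartition (n, 'KFold')
C = cvpartition (n, 'KFold', k)
C = cvpartition (n, 'KFold', k, 'GroupingVariables', grpvars)
C = cvpartition (n, 'Holdout')
C = cvpartition (n, 'Holdout', p)
C = cvpartition (n, 'Leaveout')
C = cvpartition (n, 'Resubstitution')
C = cvpartition (X, 'KFold')
C = cvpartition (X, 'KFold', k)
C = cvpartition (X, 'KFold', k, 'Stratify', opt)
C = cvpartition (X, 'Holdout')
C = cvpartition (X, 'Holdout', p)
C = cvpartition (X, 'Holdout', p, 'Stratify', opt)
C = cvpartition ('CustomPartition', testSets)
Partition data for cross-validation.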
C = cvpartition (n, 'KFold') creates a cvpartition object C, which defines a random nonstratified partition for k-fold cross-validation on n observations with each fold (subsample) having approximately the same number of observations. The default number of folds is 10 for n >= 10 or equal to n otherwise.
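As a rough illustration of the default behaviour described above, the following sketch builds a k-fold partition on 25 observations and checks how the observations are spread over the test folds. It assumes the statistics package is installed and loaded with pkg load; the variable names are illustrative only.

```octave
pkg load statistics

n = 25;                          # number of observations (illustrative)
C = cvpartition (n, 'KFold');    # default number of folds is 10 since n >= 10

idx = test (C, "all");           # logical matrix, one column per test fold
folds_per_obs = sum (idx, 2);    # number of test folds containing each observation
fold_sizes = sum (idx, 1);       # number of observations in each test fold
printf ("%d folds, sizes: %s\n", C.NumTestSets, mat2str (fold_sizes));
```

In a k-fold partition each observation should appear in exactly one test fold, so every entry of folds_per_obs is expected to be 1.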
C = cvpartition (n, 'KFold', k) also creates a nonstratified random partition for k-fold cross-validation with the number of folds defined by k, which must be a positive integer scalar smaller than the number of observations n.
C = cvpartition (n, 'KFold', k, 'GroupingVariables', grpvars) creates a cvpartition object C that defines a random partition for k-fold cross-validation with each fold containing the same combination of group labels as defined by grpvars. The grouping variables specified in grpvars can be one of the following:
C = cvpartition (n, 'Holdout') creates a cvpartition object C, which defines a random nonstratified partition for holdout validation on n observations. 90% of the observations are assigned to the training set and the remaining 10% to the test set.
C = cvpartition (n, 'Holdout', p) also creates a nonstratified random partition for holdout validation with the percentage of training and test sets defined by p, which can be a scalar value in the range (0, 1) or a positive integer scalar in the range [1, n).
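A minimal sketch of a holdout split follows. It assumes the statistics package is installed, and it passes p as a fraction; how p is split between the training and test sets and how the resulting sizes are rounded is not spelled out here, so the example only prints the sizes rather than asserting them.

```octave
pkg load statistics

n = 50;                                 # number of observations (illustrative)
C = cvpartition (n, 'Holdout', 0.2);    # p given as a fraction

tr = training (C);     # logical vector marking the training observations
te = test (C);         # logical vector marking the test observations
printf ("train: %d, test: %d\n", sum (tr), sum (te));
```

For a holdout partition the two masks are complementary: every observation belongs to either the training set or the test set, never both.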
C = cvpartition (n, 'Leaveout') creates a cvpartition object C, which defines a random partition for leave-one-out cross-validation on n observations. This is a special case of k-fold cross-validation with the number of folds equal to the number of observations.
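The relationship between leave-one-out and k-fold partitioning can be checked with a short sketch, assuming the statistics package is installed. With n observations there should be n test sets, each holding a single observation.

```octave
pkg load statistics

n = 6;                              # number of observations (illustrative)
C = cvpartition (n, 'Leaveout');

idx = test (C, "all");              # logical matrix, one column per test set
obs_per_set = sum (idx, 1);         # observations held out in each test set
sets_per_obs = sum (idx, 2);        # test sets that hold out each observation
```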
C = cvpartition (n, 'Resubstitution') creates a cvpartition object C that does not partition the data: both the training and test sets contain all n observations.
C = cvpartition (X, 'KFold') creates a cvpartition object C, which defines a stratified random partition for k-fold cross-validation according to the class proportions in X. X can be a numeric, logical, categorical, or string vector, or a character array or a cell array of character vectors. Missing values in X are discarded. The default number of folds is 10 for numel (X) >= 10 or equal to numel (X) otherwise.
C = cvpartition (X, 'KFold', k) also creates a stratified random partition for k-fold cross-validation with the number of folds defined by k, which must be a positive integer scalar smaller than the number of observations in X.
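To see what stratification means in practice, the following sketch partitions an imbalanced label vector and prints the class counts in each test fold. It assumes the statistics package is installed; the labels are made up for illustration, and the exact counts per fold depend on the random partition, so none are asserted.

```octave
pkg load statistics

# Illustrative labels: 30 observations of class 1 and 10 of class 2
X = [ones(30, 1); 2 * ones(10, 1)];
C = cvpartition (X, 'KFold', 5);

for i = 1:C.NumTestSets
  te = test (C, i);                 # logical index of test fold i
  printf ("fold %d: class 1 = %d, class 2 = %d\n", ...
          i, sum (X(te) == 1), sum (X(te) == 2));
endfor
```

With a stratified partition, each fold is expected to reflect roughly the 3:1 class ratio of X.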
C = cvpartition (X, 'KFold', k, 'Stratify', opt) creates a random partition for k-fold cross-validation, which is stratified if opt is true, or nonstratified if opt is false.
C = cvpartition (X, 'Holdout') creates a cvpartition object C, which defines a stratified random partition for holdout validation while maintaining the class proportions in X. 90% of the observations are assigned to the training set and the remaining 10% to the test set.
C = cvpartition (X, 'Holdout', p) also creates a stratified random partition for holdout validation with the percentage of training and test sets defined by p, which can be a scalar value in the range (0, 1) or a positive integer scalar in the range [1, n), where n is the number of observations in X.
C = cvpartition (X, 'Holdout', p, 'Stratify', opt) creates a random partition for holdout validation, which is stratified if opt is true, or nonstratified if opt is false.
C = cvpartition ('CustomPartition', testSets) creates a custom partition according to testSets, which can be a positive integer vector, a logical vector, or a logical matrix according to the following options:
true elements correspond to the test set and the false elements correspond to the training set.
true elements correspond to the test set and the false elements correspond to the training set.
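A custom partition from a logical vector might look like the sketch below, assuming the statistics package is installed. The mask is hypothetical and is given as a column vector; true marks the test observations and false the training observations.

```octave
pkg load statistics

# Hypothetical mask: observations 2 and 5 form the test set
testSets = logical ([0; 1; 0; 0; 1; 0]);
C = cvpartition ('CustomPartition', testSets);

te = test (C);         # expected to reproduce the mask
tr = training (C);     # expected to be its complement
```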
See also: cvpartition, summary, test, training
Cnew = repartition (C)
Cnew = repartition (C, sval)
Cnew = repartition (C, 'legacy')
Repartition data for cross-validation.
Cnew = repartition (C) creates a cvpartition object Cnew that defines a new random partition of the same type as the cvpartition object C.
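As a small sketch of this behaviour, assuming the statistics package is installed, the example below repartitions a 4-fold partition and confirms that the new object keeps the same number of folds. Whether the fold assignment actually differs depends on the random draw, so that result is only printed.

```octave
pkg load statistics

C = cvpartition (20, 'KFold', 4);
Cnew = repartition (C);            # new random partition of the same type

same = isequal (test (C, "all"), test (Cnew, "all"));
printf ("identical fold assignment: %d\n", same);
```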
Cnew = repartition (C, sval) also uses the value of sval to set the state of the random generator used in repartitioning C. If sval is a vector, then the random generator is set using the "state" keyword as in rand ("state", sval). If sval is a scalar, then the "seed" keyword is used as in rand ("seed", sval) to specify that old generators should be used.
Cnew = repartition (C, 'legacy') only applies to cvpartition objects C that use k-fold partitioning and it will repartition C in the same non-random manner that was previously used by the old-style cvpartition class of the statistics package.
See also: cvpartition, summary, test, training
Summarize cross-validation partition.
tbl = summary (C) returns a summary table tbl of the cvpartition object C as long as its type is either k-fold or holdout and it is either stratified or grouped. This function requires support for the table class, which is provided by the datatypes package.
See also: cvpartition, repartition, test, training
idx = test (C)
idx = test (C, i)
idx = test (C, "all")
Test indices for cross-validation.
idx = test (C) returns a logical vector idx with true values indicating the elements corresponding to the test set defined in the cvpartition object C. For k-fold and leave-one-out partitions, the indices corresponding to the first test set are returned.
idx = test (C, i) returns a logical vector or matrix with the indices of the test set indicated by i. If i is a scalar, then idx is a logical vector with the indices of that set. If i is a vector, then idx is a logical matrix in which idx(:,j) specifies the observations in the test set i(j). The value(s) in i must not exceed the number of test sets in the cvpartition object C.
idx = test (C, "all") returns a logical vector or matrix for all test sets defined in the cvpartition object C. For holdout and resubstitution partition types, a vector is returned. For k-fold and leave-one-out, a matrix is returned.
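The scalar and vector forms of the index argument can be compared with a short sketch, assuming the statistics package is installed. The fold numbers chosen here are arbitrary.

```octave
pkg load statistics

C = cvpartition (12, 'KFold', 4);

v = test (C, 2);         # logical vector for test set 2
M = test (C, [1 3]);     # logical matrix, one column per requested test set
printf ("size of M: %d x %d\n", rows (M), columns (M));
```

Column j of M corresponds to the test set i(j), so here M(:,1) describes test set 1 and M(:,2) describes test set 3.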
See also: cvpartition, repartition, summary, training
idx = training (C)
idx = training (C, i)
idx = training (C, "all")
Training indices for cross-validation.
idx = training (C) returns a logical vector idx with true values indicating the elements corresponding to the training set defined in the cvpartition object C. For k-fold and leave-one-out partitions, the indices corresponding to the first training set are returned.
idx = training (C, i) returns a logical vector or matrix with the indices of the training set indicated by i. If i is a scalar, then idx is a logical vector with the indices of that set. If i is a vector, then idx is a logical matrix in which idx(:,j) specifies the observations in the training set i(j). The value(s) in i must not exceed the number of test sets in the cvpartition object C.
idx = training (C, "all") returns a logical vector or matrix for all training sets defined in the cvpartition object C. For holdout and resubstitution partition types, a vector is returned. For k-fold and leave-one-out, a matrix is returned.
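The training and test methods are typically used together in a loop over the folds. The following is a minimal sketch of such a loop, assuming the statistics package is installed; the data and the "model" (the mean of the training fold) are purely illustrative and not part of the package.

```octave
pkg load statistics

# Illustrative data: predict y by the training-fold mean and score on the test fold
y = (1:20)';
C = cvpartition (numel (y), 'KFold', 5);

mse = zeros (C.NumTestSets, 1);
for i = 1:C.NumTestSets
  tr = training (C, i);                 # logical index of training observations
  te = test (C, i);                     # logical index of test observations
  mu = mean (y(tr));                    # "fit" on the training fold
  mse(i) = mean ((y(te) - mu) .^ 2);    # squared error on the test fold
endfor
printf ("cross-validated MSE: %g\n", mean (mse));
```

Because the partition is random, the printed error varies from run to run unless the random generator state is fixed beforehand.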
See also: cvpartition, repartition, summary, test