Bootstrap: Resample with replacement to generate new samples and return the statistic(s) calculated by evaluating the specified function on each resample. -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D) -- Function File: BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1, ..., DN) -- Function File: BOOTSTAT = bootstrp (..., D1, ..., DN, 'match', MATCH) -- Function File: BOOTSTAT = bootstrp (..., 'Options', PAROPT) -- Function File: BOOTSTAT = bootstrp (..., 'Weights', WEIGHTS) -- Function File: BOOTSTAT = bootstrp (..., 'loo', LOO) -- Function File: BOOTSTAT = bootstrp (..., 'seed', SEED) -- Function File: [BOOTSTAT, BOOTSAM] = bootstrp (...) -- Function File: [BOOTSTAT, BOOTSAM, STATS] = bootstrp (...) 'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D)' draws NBOOT bootstrap resamples with replacement from the rows of the data D and returns the statistic computed by BOOTFUN in BOOTSTAT [1]. BOOTFUN is a function handle (e.g. specified with @) or name, a string indicating the function name, or a cell array, where the first cell is one of the above function definitions and the remaining cells are (additional) input arguments to that function (after the data argument(s)). The third input argument is the data (column vector, matrix or cell array), which is supplied to BOOTFUN. This function is the only function in the statistics-resampling package to also accept cell arrays for the data arguments. The simulation method used by default is bootstrap resampling with first order balance [2-3]; see help for the 'boot' function for more information. 'BOOTSTAT = bootstrp (NBOOT, BOOTFUN, D1,...,DN)' is as above except that the third and subsequent input arguments are multiple data objects, (column vectors, matrices or cell arrays,) which are used as input for BOOTFUN. 'BOOTSTAT = bootstrp (..., D1, ..., DN, 'match', MATCH)' controls the resampling strategy when multiple data arguments are provided. When MATCH is true, row indices of D1 to DN are the same (i.e. matched) for each resample. This is the default strategy when D1 to DN all have the same number of rows. If MATCH is set to false, then row indices are resampled independently for D1 to DN in each of the resamples. When any of the data D1 to DN, have a different number of rows, this input argument is ignored and MATCH is enforced to have a value of false. Note that the MATLAB bootstrp function only operates in a mode equivalent to MATCH = true. One application of setting MATCH to false is to perform stratified bootstrap resampling. 'BOOTSTAT = bootstrp (..., 'Options', PAROPT)' specifies options that govern if and how to perform bootstrap iterations using multiple processors (if the Parallel Computing Toolbox or Octave Parallel package). is available This argument is a structure with the following recognised fields: o 'UseParallel': If true, use parallel processes to accelerate bootstrap computations on multicore machines. Default is false for serial computation. In MATLAB, the default is true if a parallel pool has already been started. o 'nproc': nproc sets the number of parallel processes (optional) 'BOOTSTAT = bootstrp (..., D, 'weights', WEIGHTS)' sets the resampling weights. WEIGHTS must be a column vector with the same number of rows as the data, D. If WEIGHTS is empty or not provided, the default is a vector of length N with uniform weighting 1/N. 'BOOTSTAT = bootstrp (..., D1, ... DN, 'weights', WEIGHTS)' as above if MATCH is true. If MATCH is false, a 1-by-N cell array of column vectors can be provided to specify independent resampling weights for D1 to DN. 'BOOTSTAT = bootstrp (..., 'loo', LOO)' sets the simulation method. If LOO is false, the resampling method used is balanced bootstrap resampling. If LOO is true, the resampling method used is balanced bootknife resampling [4]. The latter involves creating leave-one-out (jackknife) samples of size N - 1, and then drawing resamples of size N with replacement from the jackknife samples, thereby incorporating Bessel's correction into the resampling procedure. LOO must be a scalar logical value. The default value of LOO is false. 'BOOTSTAT = bootstrp (..., 'seed', SEED)' initialises the Mersenne Twister random number generator using an integer SEED value so that bootci results are reproducible. '[BOOTSTAT, BOOTSAM] = bootstrp (...)' also returns indices used for bootstrap resampling. If MATCH is true or only one data argument is provided, BOOTSAM is a matrix. If multiple data arguments are provided and MATCH is false, BOOTSAM is returned in a 1-by-N cell array of matrices, where each cell corresponds to the respective data argument D1 to DN. To get the output samples BOOTSAM without applying a function, set BOOTFUN to empty (i.e. []). '[BOOTSTAT, BOOTSAM, STATS] = bootstrp (...)' also calculates and returns the following basic statistics relating to each column of BOOTSTAT: - original: the original estimate(s) calculated by BOOTFUN and the DATA - mean: the mean of the bootstrap distribution(s) - bias: bootstrap estimate of the bias of the sampling distribution(s) - bias_corrected: original estimate(s) after subtracting the bias - var: bootstrap variance of the original estimate(s) - std_error: bootstrap estimate(s) of the standard error(s) If BOOTSTAT is not numeric, STATS only returns the 'original' field. If BOOTFUN is empty, then the value of the 'original' field is also empty. Bibliography: [1] Efron, and Tibshirani (1993) An Introduction to the Bootstrap. New York, NY: Chapman & Hall [2] Davison et al. (1986) Efficient Bootstrap Simulation. Biometrika, 73: 555-66 [3] Booth, Hall and Wood (1993) Balanced Importance Resampling for the Bootstrap. The Annals of Statistics. 21(1):286-298 [4] Hesterberg T.C. (2004) Unbiasing the Bootstrap—Bootknife Sampling vs. Smoothing; Proceedings of the Section on Statistics & the Environment. Alexandria, VA: American Statistical Association. bootstrp (version 2024.05.24) Author: Andrew Charles Penn https://www.researchgate.net/profile/Andrew_Penn/ Copyright 2019 Andrew Charles Penn This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/
The following code
% Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41]'; % Compute 500 bootstrap statistics for the mean and calculate the bootstrap % standard error of the mean bootstat = bootstrp (500, @mean, data, 'seed', 1); % Or equivalently bootstat = bootstrp (500, @mean, data, 'seed', 1, 'loo', false); std (bootstat)
Produces the following output
ans = 2.5977
The following code
% Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41]'; % Compute 500 bootknife statistics for the mean and calculate the unbiased % bootstrap standard error of the mean bootstat = bootstrp (500, @mean, data, 'seed', 1, 'loo', true); std (bootstat)
Produces the following output
ans = 2.6441
The following code
% Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41]'; % Split data into consecutive blocks of two data observations per cell data_blocks = mat2cell (data, 2 * (ones (13, 1)), 1); % Compute 500 bootknife statistics for the mean and calculate the unbiased % bootstrap standard error of the mean bootstat = bootstrp (500, @(x) mean (cell2mat (x)), data_blocks, 'seed', 1, ... 'loo', true); std (bootstat)
Produces the following output
ans = 3.045
The following code
% Input univariate dataset data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ... 0 33 28 34 4 32 24 47 41 24 26 30 41]'; % Compute 500 bootknife statistics for the variance and calculate the % unbiased standard error of the variance bootstat = bootstrp (500, {@var, 1}, data, 'loo', true); std (bootstat)
Produces the following output
ans = 42.137
The following code
% Input two-sample dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 500 bootknife statistics for the mean difference between X and Y % and calculate the unbiased bootstrap standard error of the mean difference bootstat = bootstrp (500, @(x, y) mean (x - y), X, Y, 'loo', true); % Or equivalently bootstat = bootstrp (500, @(x, y) mean (x - y), X, Y, 'loo', true, ... 'match', true); std (bootstat)
Produces the following output
ans = 18.185
The following code
% Input two-sample dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 500 bootknife statistics for the difference in mean between % between independent samples X and Y and calculate the unbiased bootstrap % standard error of the difference in mean bootstat = bootstrp (500, @(x, y) mean (x) - mean(y), X, Y, 'loo', true, ... 'match', false); std (bootstat)
Produces the following output
ans = 31.797
The following code
% Input bivariate dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 500 bootstrap statistics for the correlation coefficient and % calculate the bootstrap standard error of the correlation coefficient bootstat = bootstrp (500, @cor, X, Y); std (bootstat)
Produces the following output
ans = 0.10017
The following code
% Input bivariate dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 500 bootstrap statistics for the coefficient of determination and % calculate it's bootstrap standard error bootstat = bootstrp (500, {@cor,'squared'}, X, Y); std (bootstat)
Produces the following output
ans = 0.12767
The following code
% Input bivariate dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 4999 bootstrap statistics for the coefficient of determination and % calculate 95% percentile confidence intervals bootstat = bootstrp (4999, {@cor,'squared'}, X, Y); bootint (bootstat)
Produces the following output
ans = 0.25642 0.743
The following code
% Input bivariate dataset X = [212 435 339 251 404 510 377 335 410 335 ... 415 356 339 188 256 296 249 303 266 300]'; Y = [247 461 526 302 636 593 393 409 488 381 ... 474 329 555 282 423 323 256 431 437 240]'; % Compute 500 bootstrap statistics for the slope and intercept of a linear % regression and calculate their bootstrap standard errors bootstat = bootstrp (500, @mldivide, cat (2, ones (20, 1), X), Y); std (bootstat)
Produces the following output
ans = 63.468 0.18955
Package: statistics-resampling