Performs one or two levels of bootknife resampling and calculates bootstrap bias, standard errors and confidence intervals.

 -- Function File: bootknife (DATA)
 -- Function File: bootknife (DATA, NBOOT)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN)
 -- Function File: bootknife ({D1, D2, ...}, NBOOT, BOOTFUN)
 -- Function File: bootknife (DATA, NBOOT, {BOOTFUN, ...})
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC)
 -- Function File: bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC, BOOTSAM)
 -- Function File: STATS = bootknife (...)
 -- Function File: [STATS, BOOTSTAT] = bootknife (...)
 -- Function File: [STATS, BOOTSTAT, BOOTSAM] = bootknife (...)

     'bootknife (DATA)' uses a variant of nonparametric bootstrap, called
     bootknife [1], to generate 1999 resamples from the rows of the DATA
     (column vector or matrix), computes their means and displays the
     following statistics:
        - original: the original estimate(s) calculated by BOOTFUN and the DATA
        - bias: bootstrap estimate of the bias of the sampling distribution(s)
        - std_error: bootstrap estimate(s) of the standard error(s)
        - CI_lower: lower bound(s) of the 95% bootstrap confidence interval(s)
        - CI_upper: upper bound(s) of the 95% bootstrap confidence interval(s)

     'bootknife (DATA, NBOOT)' specifies the number of bootstrap resamples,
     where NBOOT can be either:
        <> scalar: A positive integer specifying the number of bootstrap
                   resamples [2,3] for single bootstrap, or
        <> vector: A pair of positive integers defining the number of outer
                   and inner (nested) resamples for iterated (a.k.a. double)
                   bootstrap and coverage calibration [3-6].
     The default value of NBOOT is the scalar: 1999.

     'bootknife (DATA, NBOOT, BOOTFUN)' also specifies BOOTFUN: the function
     calculated on the original sample and the bootstrap resamples. BOOTFUN
     must be either a:
        <> function handle or anonymous function,
        <> string of a function name, or
        <> a cell array where the first cell is one of the above function
           definitions and the remaining cells are (additional) input
           arguments to that function (after the data arguments).
     In all cases BOOTFUN must take DATA for the initial input argument(s).
     BOOTFUN can return a scalar or any multidimensional numeric variable,
     but the output will be reshaped as a column vector. BOOTFUN must
     calculate a statistic representative of the finite data sample; it
     should NOT be an estimate of a population parameter (unless they are
     one and the same). If BOOTFUN is @mean or 'mean', the narrowness bias
     of the confidence intervals for single bootstrap is reduced by
     expanding the probabilities of the percentiles using Student's
     t-distribution [7]. To compute confidence intervals for the mean
     without this correction for narrowness bias, define the mean within an
     anonymous function instead (e.g. @(x) mean(x)). By default, BOOTFUN is
     @mean.

     'bootknife ({D1, D2, ...}, NBOOT, BOOTFUN)' resamples from the rows of
     D1, D2 etc. and the resamples are passed to BOOTFUN as multiple data
     input arguments. All data vectors and matrices (D1, D2 etc.) must have
     the same number of rows.

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA)' also specifies ALPHA, which
     is numeric and sets the lower and upper bounds of the confidence
     interval(s). The value(s) of ALPHA must be between 0 and 1. ALPHA can
     either be:
        <> scalar: To set the (nominal) central coverage of equal-tailed
                   percentile confidence intervals to 100*(1-ALPHA)%. The
                   intervals are either simple percentiles for single
                   bootstrap, or percentiles with calibrated central
                   coverage for double bootstrap.
        <> vector: A pair of probabilities defining the (nominal) lower and
                   upper percentiles of the confidence interval(s) as
                   100*(ALPHA(1))% and 100*(ALPHA(2))% respectively. The
                   percentiles are either bias-corrected and accelerated
                   (BCa) for single bootstrap, or calibrated for double
                   bootstrap.
     Note that the type of coverage calibration (i.e. equal-tailed or not)
     depends on whether ALPHA is a scalar or a vector. Confidence intervals
     are not calculated when the value(s) of ALPHA is/are NaN. The default
     value of ALPHA is the vector: [.025, .975], for a 95% confidence
     interval.

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA)' also sets STRATA,
     which are identifiers that define the grouping of the DATA rows for
     stratified* bootstrap resampling. STRATA should be a column vector or
     cell array with the same number of rows as the DATA.

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC)' also sets the
     number of parallel processes to use to accelerate computations of
     double bootstrap, jackknife and non-vectorized function evaluations on
     multicore machines. This feature requires the Parallel package (in
     Octave), or the Parallel Computing Toolbox (in Matlab).

     'bootknife (DATA, NBOOT, BOOTFUN, ALPHA, STRATA, NPROC, BOOTSAM)' uses
     the bootstrap resampling indices provided in BOOTSAM. BOOTSAM should be
     a matrix with the same number of rows as the data. When BOOTSAM is
     provided, the first element of NBOOT is ignored.

     'STATS = bootknife (...)' returns a structure with the following fields
     (defined above): original, bias, std_error, CI_lower, CI_upper.

     '[STATS, BOOTSTAT] = bootknife (...)' returns BOOTSTAT, a vector or
     matrix of bootstrap statistics calculated over the (first, or outer
     layer of) bootstrap resamples.

     '[STATS, BOOTSTAT, BOOTSAM] = bootknife (...)' also returns BOOTSAM,
     the matrix of indices (32-bit integers) used for the (first, or outer
     layer of) bootstrap resampling. Each column in BOOTSAM corresponds to
     one bootstrap resample and contains the row indices of the values drawn
     from the nonscalar DATA argument to create that sample.

     * For cluster resampling, use the 'bootclust' function instead.
       Clustered or serially dependent data can also be analysed by the
       'bootstrp', 'bootwild' and 'bootbayes' functions.

  DETAILS:
     For a DATA sample with n rows, bootknife resampling involves creating
     leave-one-out jackknife samples of size n - 1 and then drawing
     resamples of size n with replacement from the jackknife samples [1].
     In contrast to bootstrap, bootknife resampling produces unbiased
     estimates of the standard error of BOOTFUN when n is small. The
     resampling of DATA rows is balanced in order to reduce Monte Carlo
     error, particularly for estimating the bias of BOOTFUN [8,9].

     For single bootstrap, the confidence intervals are constructed from the
     quantiles of a kernel density estimate of the bootstrap statistics
     (with shrinkage correction).

     For double bootstrap, calibration is used to improve the accuracy of
     the bias and standard error, and the coverage of the confidence
     intervals [2-6]. Double bootstrap confidence intervals are constructed
     from the empirical distribution of the bootstrap statistics by linear
     interpolation.

     This function has no input arguments for specifying a random seed.
     However, one can reset the random number generator with a SEED value
     using the following command:

     boot (1, 1, false, SEED);

     Please see the help documentation for the function 'boot' for more
     information about setting the seed for parallel execution of bootknife.

  BIBLIOGRAPHY:
  [1] Hesterberg T.C. (2004) Unbiasing the Bootstrap - Bootknife Sampling
        vs. Smoothing; Proceedings of the Section on Statistics & the
        Environment. Alexandria, VA: American Statistical Association.
  [2] Davison A.C. and Hinkley D.V. (1997) Bootstrap Methods And Their
        Application. (Cambridge University Press)
  [3] Efron and Tibshirani (1993) An Introduction to the Bootstrap.
        New York, NY: Chapman & Hall
  [4] Booth J. and Presnell B. (1998) Allocation of Monte Carlo Resources
        for the Iterated Bootstrap. J. Comput. Graph. Stat. 7(1):92-112
  [5] Lee and Young (1999) The effect of Monte Carlo approximation on
        coverage error of double-bootstrap confidence intervals.
        J R Statist Soc B. 61:353-366.
  [6] Hall, Lee and Young (2000) Importance of interpolation when
        constructing double-bootstrap confidence intervals. Journal
        of the Royal Statistical Society. Series B. 62(3): 479-491
  [7] Hesterberg, Tim (2014), What Teachers Should Know about the
        Bootstrap: Resampling in the Undergraduate Statistics Curriculum,
        http://arxiv.org/abs/1411.5279
  [8] Davison et al. (1986) Efficient Bootstrap Simulation.
        Biometrika, 73: 555-66
  [9] Gleason, J.R. (1988) Algorithms for Balanced Bootstrap Simulations.
        The American Statistician. Vol. 42, No. 4 pp. 263-266

  bootknife (version 2024.05.15)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/
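As a rough illustration of the resampling scheme described under DETAILS, the following sketch draws a single bootknife resample in base Octave/Matlab. It is a simplification and not the package's implementation: the variable names (x, omit, pool, idx, xb, tb) are hypothetical, and the balancing of resamples and any stratification are left out.

```octave
% Minimal sketch of one bootknife resample (not the package's own code)
x = [48 36 20 29 42 42 20 42 22 41 45 14 6].';  % small example sample
n = numel (x);

% Step 1: create a leave-one-out jackknife sample of size n - 1
omit = randi (n);                 % row to leave out, chosen at random
pool = [1:omit-1, omit+1:n];      % indices of the remaining n - 1 rows

% Step 2: draw n rows with replacement from the jackknife sample
idx = pool(randi (n - 1, n, 1));  % resampling indices (n of them)
idx = idx(:);                     % force a column vector
xb = x(idx);                      % the bootknife resample

% Evaluate the statistic (here the mean) on the resample
tb = mean (xb);
```

Repeating these steps NBOOT times, with a new omitted row each time, would give a set of statistics comparable in spirit to BOOTSTAT.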
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 95% expanded BCa bootstrap confidence intervals for the mean
bootknife (data, 1999, @mean);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: mean
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Expanded bias-corrected and accelerated (BCa)
 Nominal coverage (and the percentiles used): 95% (1.2%, 96.9%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +29.65       +4.263e-14   2.652        +23.52       +34.45
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 95% calibrated percentile bootstrap confidence intervals for the mean
bootknife (data, [1999, 199], @mean);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: mean
 Resampling method: Iterated, balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 199
 Confidence interval (CI) type: Calibrated percentile
 Nominal coverage (and the percentiles used): 95% (1.2%, 97.4%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +29.65       -1.279e-13   2.610        +23.61       +34.50
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 95% calibrated percentile bootstrap confidence intervals for the median
% with smoothing.
bootknife (data, [1999, 199], @smoothmedian);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: smoothmedian
 Resampling method: Iterated, balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 199
 Confidence interval (CI) type: Calibrated percentile
 Nominal coverage (and the percentiles used): 95% (2.0%, 98.0%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +30.86       -0.002738    3.036        +24.45       +37.32
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 90% equal-tailed percentile bootstrap confidence intervals for
% the variance
bootknife (data, 1999, {@var, 1}, 0.1);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: var
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Percentile (equal-tailed)
 Nominal coverage (and the percentiles used): 90% (5.0%, 95.0%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +171.5       -6.596       42.33        +97.46       +235.6
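The equal-tailed percentile interval reported above can be approximated with a plain bootstrap in base Octave/Matlab. The sketch below rests on simplifying assumptions: it uses ordinary resampling (not balanced bootknife resampling) and empirical order statistics rather than the kernel density quantiles that bootknife uses, so its end points will not match the output above exactly. All variable names are illustrative.

```octave
% Sketch of an equal-tailed percentile interval (not the package's own code)
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';
n = numel (data);
nboot = 1999;
alpha = 0.1;

% Ordinary bootstrap resampling indices (n rows, nboot columns)
bootsam = randi (n, n, nboot);

% Variance of each resample, normalized by n as in var (x, 1)
bootstat = var (data(bootsam), 1, 1);

% Equal-tailed percentile interval from the sorted bootstrap statistics
sorted = sort (bootstat);
lo = sorted(max (1, round ((nboot + 1) * alpha / 2)));
hi = sorted(min (nboot, round ((nboot + 1) * (1 - alpha / 2))));
ci = [lo, hi];
```

With nboot = 1999 and alpha = 0.1, the interval end points are simply the 100th and 1900th ordered bootstrap statistics.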
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 90% BCa bootstrap confidence intervals for the variance
bootknife (data, 1999, {@var, 1}, [0.05 0.95]);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: var
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Bias-corrected and accelerated (BCa)
 Nominal coverage (and the percentiles used): 90% (12.5%, 98.8%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +171.5       -7.008       42.73        +115.8       +263.4
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 90% calibrated equal-tailed percentile bootstrap confidence intervals for
% the variance.
bootknife (data, [1999, 199], {@var, 1}, 0.1);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: var
 Resampling method: Iterated, balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 199
 Confidence interval (CI) type: Calibrated percentile (equal-tailed)
 Nominal coverage (and the percentiles used): 90% (2.2%, 97.8%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +171.5       -7.034       44.28        +83.34       +251.2
The following code
% Input univariate dataset
data = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
        0 33 28 34 4 32 24 47 41 24 26 30 41].';

% 90% calibrated percentile bootstrap confidence intervals for the variance
bootknife (data, [1999, 199], {@var, 1}, [0.05, 0.95]);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: var
 Resampling method: Iterated, balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 199
 Confidence interval (CI) type: Calibrated percentile
 Nominal coverage (and the percentiles used): 90% (12.3%, 99.5%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +171.5       -7.291       43.09        +115.5       +272.9
The following code
% Input dataset
y = randn (20,1); x = randn (20,1); X = [ones(20,1), x];

% 90% BCa confidence interval for regression coefficients
bootknife ({X,y}, 1999, @mldivide, [0.05 0.95]);

% N.B: Consider using either the 'bootwild', 'bootbayes' or 'bootlm'
% functions for bootstrapping linear regression problems instead

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: mldivide
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Bias-corrected and accelerated (BCa)
 Nominal coverage: 90%

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +0.09989     -0.007025    0.2859       -0.3832      +0.5533
 -0.3207      +0.003965    0.3694       -0.8856      +0.3101
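To make the handling of multiple data arguments more concrete, the sketch below shows, in base Octave/Matlab, how a single set of row indices could be applied to both X and y before the resamples are passed to the function. It uses ordinary resampling with replacement for brevity and is not taken from the package; the names idx, Xb, yb, bootfun and b are illustrative.

```octave
% Sketch of row resampling across several data arguments
% (an illustration only, not the package's own code)
y = randn (20, 1);
x = randn (20, 1);
X = [ones(20, 1), x];
n = size (X, 1);

% One set of resampled row indices, shared by all data arguments
idx = randi (n, n, 1);
Xb = X(idx, :);
yb = y(idx);

% The function receives the resamples as separate input arguments
bootfun = @mldivide;
b = bootfun (Xb, yb);   % [intercept; slope] for this resample
```

Because the same indices are used for X and y, each resampled row keeps its predictor values paired with its own response.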
The following code
% Input bivariate dataset
x = [576 635 558 578 666 580 555 661 651 605 653 575 545 572 594].';
y = [3.39 3.3 2.81 3.03 3.44 3.07 3 3.43 ...
     3.36 3.13 3.12 2.74 2.76 2.88 2.96].';

% 95% BCa bootstrap confidence intervals for the correlation coefficient
bootknife ({x, y}, 1999, @cor);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: cor
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Bias-corrected and accelerated (BCa)
 Nominal coverage (and the percentiles used): 95% (0.3%, 92.0%)

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 +0.7764      -0.005773    0.1423       +0.2298      +0.9392
The following code
% Calculating confidence intervals for the coefficients from logistic
% regression using an example with an ordinal response from:
% https://uk.mathworks.com/help/stats/mnrfit.html

%>>>>>>>>> This code block must be run first in Octave only >>>>>>>>>>>>
try
  pkg load statistics
  load carbig
  info = ver;
  if ( str2num ({info.Version}{strcmp({info.Name},'statistics')}(1:3)) < 1.5)
    error ('statistics package version must be >= 1.5')
  end
  if (~ exist ('mnrfit', 'file'))
    % Octave Statistics package does not currently have the mnrfit function,
    % so we will use its logistic_regression function for fitting ordinal
    % models instead.
    function [B, DEV] = mnrfit (X, Y, varargin)
      % Note that if the outcome has more than two levels, the
      % logistic_regression function is only suitable when the
      % outcome is ordinal, so we would need to append 'model',
      % 'ordinal' as a name-value pair in MATLAB when executing
      % its mnrfit function (see below)
      [INTERCEPT, SLOPE, DEV] = logistic_regression (Y - 1, X, false);
      B = cat (1, INTERCEPT, SLOPE);
    end
  end
  stats_pkg = true;
catch
  stats_pkg = false;
  fprintf ('\nSkipping this demo...')
  fprintf ('\nRequired features of the statistics package not found.\n\n');
end
%<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

if (stats_pkg)
  %>>>>>>>>>>>>>>>>>>> This code block is the demo >>>>>>>>>>>>>>>>>>>>>>

  % This demo requires the statistics package in Octave (equivalent to
  % the Statistics and Machine Learning Toolbox in Matlab)

  % Create the dataset
  load carbig
  X = [Acceleration Displacement Horsepower Weight];

  % The responses 1 - 4 correspond to the following classification:
  % 1:  9 - 19 miles per gallon
  % 2: 19 - 29 miles per gallon
  % 3: 29 - 39 miles per gallon
  % 4: 39 - 49 miles per gallon
  miles = [1,1,1,1,1,1,1,1,1,1,NaN,NaN,NaN,NaN,NaN,1,1,NaN,1,1,2,2,1,2, ...
           2,2,2,2,2,2,2,1,1,1,1,2,2,2,2,NaN,2,1,1,2,1,1,1,1,1,1,1,1,1, ...
           2,2,1,2,2,3,3,3,3,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,2,1,1,1,1, ...
           1,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,1,1,1, ...
           1,1,2,2,2,1,2,2,2,1,1,3,2,2,2,1,2,2,1,2,2,2,1,3,2,3,2,1,1,1, ...
           1,1,1,1,1,3,2,2,3,3,2,2,2,2,2,3,2,1,1,1,1,1,1,1,1,1,1,1,2,2, ...
           1,3,2,2,2,2,2,2,1,3,2,2,2,2,2,3,2,2,2,2,2,1,1,1,1,2,2,2,2,3, ...
           2,3,3,2,1,1,1,3,3,2,2,2,1,2,2,1,1,1,1,1,3,3,3,2,3,1,1,1,1,1, ...
           2,2,1,1,1,1,1,3,2,2,2,3,3,3,3,2,2,2,4,3,3,4,3,2,2,2,2,2,2,2, ...
           2,2,2,2,1,1,2,1,1,1,3,2,2,3,2,2,2,2,2,1,2,1,3,3,2,2,2,2,2,1, ...
           1,1,1,1,1,2,1,3,3,3,2,2,2,2,2,3,3,3,3,2,2,2,3,4,3,3,3,2,2,2, ...
           2,3,3,3,3,3,4,2,4,4,4,3,3,4,4,3,3,3,2,3,2,3,2,2,2,2,3,4,4,3, ...
           3,3,3,3,3,3,3,3,3,3,3,3,3,2,NaN,3,2,2,2,2,2,1,2,2,3,3,3,2,2, ...
           2,3,3,3,3,3,3,3,3,3,3,3,2,3,2,2,3,3,2,2,4,3,2,3]';

  % Bootstrap confidence intervals for each logistic regression coefficient
  bootknife ({X, miles}, 1999, ...
             @(X, miles) mnrfit (X, miles, 'model', 'ordinal'));

  % Where the first 3 rows are the intercept terms, and the last 4 rows
  % are the slope coefficients. For each predictor, the slope coefficient
  % corresponds to how a unit change in the predictor impacts on the odds,
  % which are proportional across the (ordered) categories, where the
  % log-odds in each case is:
  %
  %                  ln ( ( P[below] ) / ( P[above] ) )
  %
  % i.e. in mnrfit, the reference class is the higher of the two classes.
  % Therefore, a positive slope value indicates that a unit increase in the
  % predictor increases the odds of running at fewer miles per gallon.

  % Note that ordinal and multinomial logistic regression (appropriate
  % for ordinal and nominal responses respectively) would be equivalent
  % for any binary outcome

  %<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
end

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of nonparametric bootstrap estimates of bias and precision
******************************************************************************

Bootstrap settings:
 Function: @(X, miles) mnrfit (X, miles, 'model', 'ordinal')
 Resampling method: Balanced, bootknife resampling
 Number of resamples (outer): 1999
 Number of resamples (inner): 0
 Confidence interval (CI) type: Bias-corrected and accelerated (BCa)
 Nominal coverage: 95%

Bootstrap Statistics:
 original     bias         std_error    CI_lower     CI_upper
 -16.69       -0.5205      2.058        -20.17       -12.46
 -11.72       -0.3644      1.819        -14.92       -8.059
 -8.061       -0.2589      1.722        -11.13       -4.496
 +0.1048      +0.005361    0.09151      -0.07761     +0.2784
 +0.01034     +0.0006530   0.006532     -0.003006    +0.02331
 +0.06452     +0.002386    0.01657      +0.03328     +0.09535
 +0.001664    +8.603e-06   0.0007476    +0.0001654   +0.003172
The following code
% Air conditioning failure times (x) in Table 1.2 of Davison A.C. and
% Hinkley D.V (1997) Bootstrap Methods And Their Application. (Cambridge
% University Press)

% AIM: to construct 95% nonparametric bootstrap confidence intervals for
% the mean failure time from the sample x (n = 12). The mean(x,1) = 108.1
% and exact intervals based on an exponential model are [65.9, 209.2].

% Calculations using the 'bootstrap' and 'resample' packages in R
%
% x <- c(3, 5, 7, 18, 43, 85, 91, 98, 100, 130, 230, 487);
%
% library (bootstrap)  # Functions from Efron and Tibshirani (1993)
% set.seed(1);
% ci1 <- boott (x, mean, nboott=19999, nbootsd=499, perc=c(.025,.975))
% set.seed(1);
% ci2a <- bcanon (x, 19999, mean, alpha = c(0.025,0.975))
%
% library (resample)   # Functions from Hesterberg, Tim (2014)
% bootout <- bootstrap (x, mean, R=19999, seed=1)
% ci2b <- CI.bca (bootout, confidence=0.95, expand=FALSE)
% ci3 <- CI.bca (bootout, confidence=0.95, expand=TRUE)
% ci4 <- CI.percentile (bootout, confidence=0.95, expand=FALSE)
% ci5 <- CI.percentile (bootout, confidence=0.95, expand=TRUE)
%
% Confidence intervals from 'bootstrap' and 'resample' packages in R
%
% method                                |  0.025 |  0.975 | length | shape |
% --------------------------------------|--------|--------|--------|-------|
% ci1  - bootstrap-t (bootstrap)        |   45.2 |  301.6 |  256.4 |  3.08 |
% ci2a - BCa (bootstrap)                |   57.1 |  226.5 |  169.4 |  2.32 |
% ci2b - BCa (resample)                 |   57.5 |  223.4 |  165.9 |  2.27 |
% ci3  - expanded BCa (resample)        |   52.0 |  252.5 |  200.0 |  2.57 |
% ci4  - percentile (resample)          |   47.7 |  191.8 |  144.1 |  1.39 |
% ci5  - expanded percentile (resample) |   41.1 |  209.0 |  167.9 |  1.51 |

% Calculations using the 'statistics-resampling' package for Octave/Matlab
%
% x = [3 5 7 18 43 85 91 98 100 130 230 487]';
% boot (1,1,false,1); ci3 = bootknife (x, 19999, @mean, [.025,.975]);
% boot (1,1,false,1); ci5 = bootknife (x, 19999, @mean, 0.05);
% boot (1,1,false,1); ci6 = bootknife (x, [19999,499], @mean, [.025,.975]);
%
% Confidence intervals from 'statistics-resampling' package for Octave/Matlab
%
% method                                |  0.025 |  0.975 | length | shape |
% --------------------------------------|--------|--------|--------|-------|
% ci3  - expanded BCa                   |   51.4 |  255.6 |  204.2 |  2.60 |
% ci5  - expanded percentile            |   37.3 |  207.4 |  170.1 |  1.40 |
% ci6  - calibrated                     |   50.3 |  245.3 |  194.9 |  2.37 |
% --------------------------------------|--------|--------|--------|-------|
% parametric - exact                    |   65.9 |  209.2 |  143.3 |  3.40 |
%
% Simulation results for constructing 95% confidence intervals for the
% mean of populations with different distributions. The simulation was
% of 1000 random samples of size 12 (analogous to the situation above).
% Simulation performed using the bootsim script with nboot of 1999 (for
% single bootstrap) or [1999,199] (for double bootstrap).
%
% --------------------------------------------------------------------------
% expanded BCa
% --------------------------------------------------------------------------
% Population                 | coverage |  lower |  upper | length | shape |
% ---------------------------|----------|--------|--------|--------|-------|
% Normal N(0,1)              |    94.8% |   2.7% |   2.5% |   1.22 |  0.99 |
% Folded normal |N(0,1)|     |    94.9% |   1.8% |   3.3% |   0.75 |  1.34 |
% Laplace exp(1) - exp(1)    |    92.0% |   3.1% |   4.9% |   1.67 |  0.99 |
% Log-normal exp(N(0,1))     |    87.4% |   0.6% |  12.0% |   1.95 |  1.82 |
% ---------------------------|----------|--------|--------|--------|-------|
%
% --------------------------------------------------------------------------
% expanded percentile
% --------------------------------------------------------------------------
% Population                 | coverage |  lower |  upper | length | shape |
% ---------------------------|----------|--------|--------|--------|-------|
% Normal N(0,1)              |    94.8% |   2.2% |   3.0% |   1.22 |  1.00 |
% Folded normal |N(0,1)|     |    92.1% |   1.5% |   6.4% |   0.71 |  1.10 |
% Laplace exp(1) - exp(1)    |    94.7% |   1.9% |   3.4% |   1.61 |  1.00 |
% Log-normal exp(N(0,1))     |    86.4% |   0.1% |  13.5% |   1.74 |  1.24 |
% ---------------------------|----------|--------|--------|--------|-------|
%
% --------------------------------------------------------------------------
% calibrated percentile (equal-tailed)
% --------------------------------------------------------------------------
% Population                 | coverage |  lower |  upper | length | shape |
% ---------------------------|----------|--------|--------|--------|-------|
% Normal N(0,1)              |    95.5% |   2.9% |   1.6% |   1.30 |  1.00 |
% Folded normal |N(0,1)|     |    95.1% |   0.8% |   4.1% |   0.79 |  1.14 |
% Laplace exp(1) - exp(1)    |    94.7% |   2.3% |   3.0% |   1.76 |  0.99 |
% Log-normal exp(N(0,1))     |    88.8% |   0.3% |  10.9% |   1.99 |  1.39 |
% ---------------------------|----------|--------|--------|--------|-------|
%
% --------------------------------------------------------------------------
% calibrated percentile
% --------------------------------------------------------------------------
% Population                 | coverage |  lower |  upper | length | shape |
% ---------------------------|----------|--------|--------|--------|-------|
% Normal N(0,1)              |    95.5% |   3.1% |   1.4% |   1.28 |  1.01 |
% Folded normal |N(0,1)|     |    95.5% |   0.9% |   3.6% |   0.75 |  1.40 |
% Laplace exp(1) - exp(1)    |    93.3% |   3.4% |   3.3% |   1.74 |  1.02 |
% Log-normal exp(N(0,1))     |    89.4% |   0.9% |   9.7% |   1.97 |  1.78 |
% ---------------------------|----------|--------|--------|--------|-------|
gives an example of how 'bootknife' is used.
The following code
% Spatial Test Data (A) from Table 14.1 of Efron and Tibshirani (1993)
% An Introduction to the Bootstrap in Monographs on Statistics and Applied
% Probability 57 (Springer)

% AIM: to construct 90% nonparametric bootstrap confidence intervals for
% var(A,1), where var(A,1) = 171.5 and n = 26, and exact intervals based
% on Normal theory are [118.4, 305.2].
%
% (i.e. (n - 1) * var (A, 0) ./ chi2inv (1 - [0.05; 0.95], n - 1))

% Calculations using the 'boot' and 'bootstrap' packages in R
%
% library (boot)       # Functions from Davison and Hinkley (1997)
% A <- c(48,36,20,29,42,42,20,42,22,41,45,14,6,
%        0,33,28,34,4,32,24,47,41,24,26,30,41);
% n <- length(A)
% var.fun <- function (d, i) {
%               # Function to compute the population variance
%               n <- length (d);
%               return (var (d[i]) * (n - 1) / n) };
% boot.fun <- function (d, i) {
%               # Compute the estimate
%               t <- var.fun (d, i);
%               # Compute sampling variance of the estimate using Tukey's jackknife
%               n <- length (d);
%               U <- empinf (data=d[i], statistic=var.fun, type="jack", stype="i");
%               var.t <- sum (U^2 / (n * (n - 1)));
%               return ( c(t, var.t) ) };
% set.seed(1)
% var.boot <- boot (data=A, statistic=boot.fun, R=19999, sim='balanced')
% ci1 <- boot.ci (var.boot, conf=0.90, type="norm")
% ci2 <- boot.ci (var.boot, conf=0.90, type="perc")
% ci3 <- boot.ci (var.boot, conf=0.90, type="basic")
% ci4 <- boot.ci (var.boot, conf=0.90, type="bca")
% ci5 <- boot.ci (var.boot, conf=0.90, type="stud")
%
% library (bootstrap)  # Functions from Efron and Tibshirani (1993)
% set.seed(1);
% ci4a <- bcanon (A, 19999, var.fun, alpha=c(0.05,0.95))
% set.seed(1);
% ci5a <- boott (A, var.fun, nboott=19999, nbootsd=499, perc=c(.05,.95))
%
% Confidence intervals from 'boot' and 'bootstrap' packages in R
%
% method                                |   0.05 |   0.95 | length | shape |
% --------------------------------------|--------|--------|--------|-------|
% ci1  - normal                         |  109.6 |  246.7 |  137.1 |  1.21 |
% ci2  - percentile                     |   97.9 |  234.8 |  136.9 |  0.86 |
% ci3  - basic                          |  108.3 |  245.1 |  136.8 |  1.16 |
% ci4  - BCa                            |  116.0 |  260.7 |  144.7 |  1.60 |
% ci4a - BCa                            |  115.8 |  260.6 |  147.8 |  1.59 |
% ci5  - bootstrap-t                    |  112.0 |  291.8 |  179.8 |  2.02 |
% ci5a - bootstrap-t                    |  116.1 |  290.9 |  174.7 |  2.16 |
% --------------------------------------|--------|--------|--------|-------|
% parametric - exact                    |  118.4 |  305.2 |  186.8 |  2.52 |
%
% Summary of bias statistics from 'boot' package in R
%
% method                             | original |    bias | bias-corrected |
% -----------------------------------|----------|---------|----------------|
% single bootstrap                   |   171.53 |   -6.58 |         178.11 |
% -----------------------------------|----------|---------|----------------|
% parametric - exact                 |   171.53 |   -6.86 |         178.40 |

% Calculations using the 'statistics-resampling' package for Octave/Matlab
%
% A = [48 36 20 29 42 42 20 42 22 41 45 14 6 ...
%      0 33 28 34 4 32 24 47 41 24 26 30 41].';
% boot (1,1,false,1); ci2 = bootknife (A,19999,{@var,1},0.1);
% boot (1,1,false,1); ci4 = bootknife (A,19999,{@var,1},[0.05,0.95]);
% boot (1,1,false,1); ci6a = bootknife (A,[19999,499],{@var,1},0.1);
% boot (1,1,false,1); ci6b = bootknife (A,[19999,499],{@var,1},[0.05,0.95]);
%
% Confidence intervals from 'statistics-resampling' package for Octave/Matlab
%
% method                                |   0.05 |   0.95 | length | shape |
% --------------------------------------|--------|--------|--------|-------|
% ci2  - percentile (equal-tailed)      |   96.2 |  237.2 |  141.0 |  0.87 |
% ci4  - BCa                            |  116.1 |  265.1 |  149.0 |  1.69 |
% ci6a - calibrated (equal-tailed)      |   83.0 |  252.9 |  169.9 |  0.92 |
% ci6b - calibrated                     |  113.2 |  282.4 |  169.2 |  1.90 |
% --------------------------------------|--------|--------|--------|-------|
% parametric - exact                    |  118.4 |  305.2 |  186.8 |  2.52 |
%
% Simulation results for constructing 90% confidence intervals for the
% variance of a population N(0,1) from 1000 random samples of size 26
% (analogous to the situation above). Simulation performed using the
% bootsim script with nboot of 1999 (for single bootstrap) or [1999,499]
% (for double bootstrap).
%
% method                     | coverage |  lower |  upper | length | shape |
% ---------------------------|----------|--------|--------|--------|-------|
% percentile (equal-tailed)  |    81.9% |   1.3% |  16.8% |   0.78 |  0.91 |
% BCa                        |    85.0% |   4.6% |  10.4% |   0.85 |  1.82 |
% calibrated (equal-tailed)  |    90.0% |   0.1% |   9.9% |   1.01 |  1.04 |
% calibrated                 |    90.3% |   4.5% |   5.2% |   0.99 |  2.21 |
% ---------------------------|----------|--------|--------|--------|-------|
% parametric - exact         |    90.8% |   3.7% |   5.5% |   0.99 |  2.52 |

% Summary of bias statistics computed using the 'statistics-resampling'
% package for Octave/Matlab
%
% method                             | original |    bias | bias-corrected |
% -----------------------------------|----------|---------|----------------|
% single bootstrap                   |   171.53 |   -6.70 |         178.24 |
% double bootstrap                   |   171.53 |   -7.12 |         178.65 |
% -----------------------------------|----------|---------|----------------|
% parametric - exact                 |   171.53 |   -6.86 |         178.40 |

% The equivalent methods for constructing bootstrap intervals in the 'boot'
% and 'bootstrap' packages (in R) and the statistics-resampling package (in
% Octave/Matlab) produce intervals with very similar end points, length and
% shape. However, all intervals calculated using the 'statistics-resampling'
% package are slightly longer than the equivalent intervals calculated in
% R because the 'statistics-resampling' package uses bootknife resampling.
% The scale of the sampling distribution for small samples is approximated
% better by bootknife (rather than bootstrap) resampling.
gives an example of how 'bootknife' is used.
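The 'length' and 'shape' columns in the tables above appear consistent with the usual definitions, length = upper - lower and shape = (upper - estimate) / (estimate - lower). That reading is an assumption, checked only against the tabulated values. The short sketch below, with illustrative variable names, reproduces the 'parametric - exact' row for the variance example under that assumption.

```octave
% Assumed definitions of interval length and shape (not stated in the source)
estimate = 171.53;          % var (A, 1), taken from the bias table
ci = [118.4, 305.2];        % exact interval based on Normal theory

ci_length = ci(2) - ci(1);
ci_shape = (ci(2) - estimate) / (estimate - ci(1));

fprintf ('length = %.1f, shape = %.2f\n', ci_length, ci_shape);
% prints: length = 186.8, shape = 2.52
```

A shape above 1 indicates that the interval extends further above the estimate than below it, as expected for a right-skewed sampling distribution such as that of the variance.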
Package: statistics-resampling