Performs wild bootstrap and calculates bootstrap-t confidence intervals and
p-values for the mean, or the regression coefficients from a linear model.

 -- Function File: bootwild (y)
 -- Function File: bootwild (y, X)
 -- Function File: bootwild (y, X, CLUSTID)
 -- Function File: bootwild (y, X, BLOCKSZ)
 -- Function File: bootwild (y, X, ..., NBOOT)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA, SEED)
 -- Function File: bootwild (y, X, ..., NBOOT, ALPHA, SEED, L)
 -- Function File: STATS = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT] = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT, BOOTSSE] = bootwild (y, ...)
 -- Function File: [STATS, BOOTSTAT, BOOTSSE, BOOTFIT] = bootwild (y, ...)

     'bootwild (y)' performs a null hypothesis significance test for the
     mean of y being equal to 0. This function implements wild bootstrap-t
     resampling of the residuals using Webb's 6-point distribution, and
     computes confidence intervals and p-values [1-4]. The following
     statistics are printed to the standard output:
        - original: the mean of the data vector y
        - std_err: heteroscedasticity-consistent standard error(s) (HC1)
        - CI_lower: lower bound(s) of the 95% bootstrap-t confidence interval
        - CI_upper: upper bound(s) of the 95% bootstrap-t confidence interval
        - tstat: Student's t-statistic
        - pval: two-tailed p-value(s) for the parameter(s) being equal to 0
        - fpr: minimum false positive risk for the corresponding p-value
     By default, the confidence intervals are symmetric bootstrap-t
     confidence intervals. The p-values are computed following both of the
     guidelines by Hall and Wilson [5]. The minimum false positive risk
     (FPR) is computed according to the Sellke-Berger approach as described
     in [6,7].

     'bootwild (y, X)' also specifies the design matrix (X) for least
     squares regression of y on X. X should be a column vector or matrix
     with the same number of rows as y.
     If the X input argument is empty, the default for X is a column of ones
     (i.e. intercept only) and thus the statistic computed reduces to the
     mean (as above). The statistics calculated and returned in the output
     then relate to the coefficients from the regression of y on X.

     'bootwild (y, X, CLUSTID)' specifies a vector or cell array of numbers
     or strings respectively to be used as cluster labels or identifiers.
     Rows in y (and X) with the same CLUSTID value are treated as clusters
     with dependent errors. Rows of y (and X) assigned to a particular
     cluster will have identical resampling during wild bootstrap. If empty
     (default), no clustered resampling is performed and all errors are
     treated as independent. The standard errors computed are cluster
     robust.

     'bootwild (y, X, BLOCKSZ)' specifies a scalar, which sets the block
     size for bootstrapping when the residuals have serial dependence.
     Identical resampling occurs within each (consecutive) block of length
     BLOCKSZ during wild bootstrap. Rows of y (and X) within the same block
     are treated as having dependent errors. If empty (default), no block
     resampling is performed and all errors are treated as independent. The
     standard errors computed are cluster robust.

     'bootwild (y, X, ..., NBOOT)' specifies the number of bootstrap
     resamples, where NBOOT must be a positive integer. If empty, the
     default value of NBOOT is 1999.

     'bootwild (y, X, ..., NBOOT, ALPHA)' specifies ALPHA, which is numeric
     and sets the lower and upper bounds of the confidence interval(s). The
     value(s) of ALPHA must be between 0 and 1. ALPHA can either be:
        o scalar: To set the (nominal) central coverage of SYMMETRIC
                  bootstrap-t confidence interval(s) to 100*(1-ALPHA)%.
                  For example, 0.05 for a 95% confidence interval.
        o vector: A pair of probabilities defining the (nominal) lower and
                  upper bounds of ASYMMETRIC bootstrap-t confidence
                  interval(s) as 100*(ALPHA(1))% and 100*(ALPHA(2))%
                  respectively. For example, [.025, .975] for a 95%
                  confidence interval.
     The default value of ALPHA is the scalar 0.05, for symmetric 95%
     bootstrap-t confidence interval(s).

     'bootwild (y, X, ..., NBOOT, ALPHA, SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     'bootwild' results are reproducible.

     'bootwild (y, X, ..., NBOOT, ALPHA, SEED, L)' multiplies the regression
     coefficients by the hypothesis matrix L. If L is not provided or is
     empty, it will assume the default value of 1 (i.e. no change to the
     design).

     'STATS = bootwild (...)' returns a structure with the following fields:
     original, std_err, CI_lower, CI_upper, tstat, pval, fpr and the
     sum-of-squared error (sse).

     '[STATS, BOOTSTAT] = bootwild (...)' also returns a vector (or matrix)
     of bootstrap statistics (BOOTSTAT) calculated over the bootstrap
     resamples (before studentization).

     '[STATS, BOOTSTAT, BOOTSSE] = bootwild (...)' also returns a vector
     containing the sum-of-squared error for the fit on each bootstrap
     resample.

     '[STATS, BOOTSTAT, BOOTSSE, BOOTFIT] = bootwild (...)' also returns an
     N-by-NBOOT matrix containing the N fitted values for each of the NBOOT
     bootstrap resamples.

  Bibliography:
  [1] Wu (1986). Jackknife, bootstrap and other resampling methods in
      regression analysis (with discussions). Ann Stat. 14: 1261–1350.
  [2] Cameron, Gelbach and Miller (2008) Bootstrap-based Improvements for
      Inference with Clustered Errors. Rev Econ Stat. 90(3), 414-427
  [3] Webb (2023) Reworking wild bootstrap-based inference for clustered
      errors. Can J Econ. https://doi.org/10.1111/caje.12661
  [4] Cameron and Miller (2015) A Practitioner’s Guide to Cluster-Robust
      Inference. J Hum Resour. 50(2):317-372
  [5] Hall and Wilson (1991) Two Guidelines for Bootstrap Hypothesis
      Testing. Biometrics, 47(2), 757-762
  [6] Colquhoun (2019) The False Positive Risk: A Proposal Concerning What
      to Do About p-Values, Am Stat. 73:sup1, 192-201
  [7] Sellke, Bayarri and Berger (2001) Calibration of p-values for Testing
      Precise Null Hypotheses. Am Stat.
      55(1), 62-71

  bootwild (version 2024.05.23)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program. If not, see http://www.gnu.org/licenses/
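
To make the description in the help text more concrete, the following is a minimal sketch of a wild bootstrap-t for the intercept-only case (the mean), using Webb's 6-point weights and a symmetric interval. It is an illustration under stated assumptions, not the bootwild source: it resamples around the unrestricted fit, it ignores the CLUSTID, BLOCKSZ and L arguments, and all variable names (webb, tstar, crit and so on) are made up for this example. Results from bootwild itself may therefore differ.

```octave
% Illustrative wild bootstrap-t for the mean; NOT the bootwild implementation
y = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185].';
n = numel (y);
nboot = 1999;              % same value as the documented NBOOT default
alpha = 0.05;              % same value as the documented ALPHA default
rand ('twister', 1);       % seed the Mersenne Twister so the run is repeatable

% Webb's 6-point distribution: six equiprobable points, mean 0 and variance 1
webb = [-sqrt(3/2), -1, -sqrt(1/2), sqrt(1/2), 1, sqrt(3/2)];

% Original fit (intercept only) and its HC1-type standard error
b  = mean (y);
e  = y - b;
se = sqrt (sum (e.^2) / (n * (n - 1)));

% Wild bootstrap: multiply each residual by a random Webb weight and refit
tstar = zeros (nboot, 1);
for i = 1:nboot
  idx    = randi (6, n, 1);               % one weight index per observation
  w      = reshape (webb(idx), n, 1);     % force a column vector of weights
  ystar  = b + e .* w;                    % bootstrap response
  bstar  = mean (ystar);
  estar  = ystar - bstar;
  sestar = sqrt (sum (estar.^2) / (n * (n - 1)));
  tstar(i) = (bstar - b) / sestar;        % studentized bootstrap statistic
end

% Symmetric bootstrap-t interval from the (1 - alpha) quantile of |t*|
abs_t = sort (abs (tstar));
crit  = abs_t(ceil ((1 - alpha) * (nboot + 1)));
ci    = [b - crit * se, b + crit * se];
printf ('mean = %.1f, symmetric CI = [%.1f, %.1f]\n', b, ci(1), ci(2));
```

Because the interval is built from the quantile of the absolute studentized statistics, it is centred on the estimate by construction, which is what "symmetric" means in the ALPHA description. Clustered or block resampling would instead draw a single weight per cluster or block and reuse it for every row in that group.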
The following code
% Input univariate dataset
heights = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185].';

% Compute test statistics, confidence intervals and p-values (H0 = 0)
bootwild (heights);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of wild bootstrap null hypothesis significance tests for linear models
*******************************************************************************

Bootstrap settings:
 Function: pinv (X) * y
 Resampling method: Wild bootstrap-t
 Number of resamples: 1999
 Standard error calculations: Heteroscedasticity-Consistent (HC1)
 Confidence interval (CI) type: Symmetric bootstrap-t interval
 Nominal central coverage: 95%
 Null value (H0) used for hypothesis testing (p-values): 0

Test Statistics:
 original     std_err      CI_lower     CI_upper     t-stat      p-val     FPR
 +184.5       1.310        +181.6       +187.4       +141.       <.001     .010
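
The original, std_err and t-stat columns above do not depend on the resampling, so they can be checked by hand. The sketch below assumes the textbook HC1 sandwich estimator (the HC0 covariance scaled by n/(n-k)); that choice and the variable names are assumptions made for illustration, not code taken from bootwild.

```octave
% Hand check of the estimate, HC1 standard error and t-statistic (sketch)
y = [183, 192, 182, 183, 177, 185, 188, 188, 182, 185].';
n = numel (y);
X = ones (n, 1);                       % intercept-only design (the default)
k = size (X, 2);                       % number of coefficients

b = pinv (X) * y;                      % least squares fit, as in 'pinv (X) * y'
e = y - X * b;                         % residuals

% HC1 sandwich covariance: (n/(n-k)) * inv(X'X) * X'*diag(e.^2)*X * inv(X'X)
bread = inv (X' * X);
meat  = X' * diag (e.^2) * X;
V     = (n / (n - k)) * bread * meat * bread;

se = sqrt (diag (V));                  % standard error(s)
t  = b ./ se;                          % t-statistic(s) for H0: coefficient = 0
printf ('original = %.1f, std_err = %.3f, t-stat = %.0f\n', b, se, t);
```

For this dataset the sketch gives a mean of 184.5, a standard error of about 1.310 and a t-statistic of about 141, in line with the table. For an intercept-only model the HC1 standard error reduces to the usual standard error of the mean, std (y) / sqrt (n).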
The following code
% Input bivariate dataset
X = [ones(43,1),...
    [01,02,03,04,05,06,07,08,09,10,11,...
     12,13,14,15,16,17,18,19,20,21,22,...
     23,25,26,27,28,29,30,31,32,33,34,...
     35,36,37,38,39,40,41,42,43,44]'];
y = [188.0,170.0,189.0,163.0,183.0,171.0,185.0,168.0,173.0,183.0,173.0,...
     173.0,175.0,178.0,183.0,192.4,178.0,173.0,174.0,183.0,188.0,180.0,...
     168.0,170.0,178.0,182.0,180.0,183.0,178.0,182.0,188.0,175.0,179.0,...
     183.0,192.0,182.0,183.0,177.0,185.0,188.0,188.0,182.0,185.0]';

% Compute test statistics, confidence intervals and p-values (H0 = 0)
bootwild (y, X);

% Please be patient, the calculations will be completed soon...
Produces the following output
Summary of wild bootstrap null hypothesis significance tests for linear models
*******************************************************************************

Bootstrap settings:
 Function: pinv (X) * y
 Resampling method: Wild bootstrap-t
 Number of resamples: 1999
 Standard error calculations: Heteroscedasticity-Consistent (HC1)
 Confidence interval (CI) type: Symmetric bootstrap-t interval
 Nominal central coverage: 95%
 Null value (H0) used for hypothesis testing (p-values): 0

Test Statistics:
 original     std_err      CI_lower     CI_upper     t-stat      p-val     FPR
 +175.5       2.563        +169.8       +181.2       +68.5       <.001     .010
 +0.1904      0.08460      +0.003534    +0.3773      +2.25       .047      .280
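
The FPR column can be related to the p-values with a short calculation. The sketch below assumes the Sellke-Berger calibration in its usual form, where -e*p*log(p) bounds the Bayes factor in favour of the null for p < 1/e, combined with prior odds of 1. The two p-values are illustrative stand-ins for the rounded values printed above (0.0005 is the smallest two-tailed p-value one would expect from 1999 resamples, which is itself an assumption), so agreement with the table is only approximate.

```octave
% Minimum false positive risk from a p-value (Sellke-Berger bound, sketch)
p = [0.0005; 0.047];                  % illustrative p-values, not bootwild output

minBF = -exp (1) .* p .* log (p);     % lower bound on the Bayes factor for H0
                                      % (only valid for p < 1/e)
fpr   = minBF ./ (1 + minBF);         % minimum FPR, assuming prior odds of 1

printf ('p = %.4f -> minimum FPR = %.3f\n', [p, fpr].');
```

With these inputs the minimum FPR is roughly 0.010 and 0.28 respectively, close to the values in the FPR column. A p-value just under 0.05 therefore still carries a minimum false positive risk of more than one in four under these assumptions.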
Package: statistics-resampling