randtest

 Performs permutation or randomization tests for regression coefficients.

 -- Function File: PVAL = randtest (X, Y)
 -- Function File: PVAL = randtest (X, Y, NREPS)
 -- Function File: PVAL = randtest (X, Y, NREPS, FUNC)
 -- Function File: PVAL = randtest (X, Y, NREPS, FUNC, SEED)
 -- Function File: [PVAL, STAT] = randtest (...)
 -- Function File: [PVAL, STAT, FPR] = randtest (...)
 -- Function File: [PVAL, STAT, FPR, PERMSTAT] = randtest (...)

     'PVAL = randtest (X, Y)' uses the approach of Manly [1,2] to perform
     a randomization (or permutation) test of the null hypothesis that
     coefficients from the regression of Y on X are significantly different
     from 0. The value returned is a 2-tailed p-value. Note that the Y values
     are centered before randomization or permutation to also provide valid
     null hypothesis tests of the intercept. To include an intercept term in
     the regression, X must contain a column of ones.

     Hint: For one-sample or two-sample randomization/permutation tests,
     please use the 'randtest1' or 'randtest2' functions respectively.

     'PVAL = randtest (X, Y, NREPS)' specifies the number of resamples without
     replacement to take in the randomization test. By default, NREPS is 5000.
     If the number of possible permutations is smaller than NREPS, the test
     becomes exact. For example, if the number of sampling units (i.e. rows
     in Y) is 6, then the number of possible permutations is factorial (6) =
     720, so NREPS will be truncated at 720 and sampling will systematically
     evaluate all possible permutations. 

     'PVAL = randtest (X, Y, NREPS, FUNC)' also specifies a custom function
     calculated on the original samples, and the permuted or randomized
     resamples. Note that FUNC must compute statistics related to regression,
     and should either be a:
        o function handle or anonymous function,
        o string of function name, or
        o a cell array where the first cell is one of the above function
          definitions and the remaining cells are (additional) input arguments 
          to that function (other than the data arguments).
        See the built-in demos for example usage with @mldivide for linear
        regression coefficients, or with @cor for the correlation coefficient.
        The default value of FUNC is @mldivide.

     'PVAL = randtest (X, Y, NREPS, FUNC, SEED)' initialises the Mersenne
     Twister random number generator using an integer SEED value so that
     the results of 'randtest' results are reproducible when the
     test is approximate (i.e. when using randomization if not all permutations
     can be evaluated systematically).

     '[PVAL, STAT] = randtest (...)' also returns the test statistic.

     '[PVAL, STAT, FPR] = randtest (...)' also returns the minimum false
     positive risk (FPR) calculated for the p-value, computed using the
     Sellke-Berger approach.

     '[PVAL, STAT, FPR, PERMSTAT] = randtest (...)' also returns the
     statistics of the permutation distribution.

  Bibliography:
  [1] Manly (1997) Randomization, Bootstrap and Monte Carlo Method in Biology.
       2nd Edition. London: Chapman & Hall.
  [2] Hesterberg, Moore, Monaghan, Clipson, and Epstein (2011) Bootstrap
       Methods and Permutation Tests (BMPT) by in Introduction to the Practice
       of Statistics, 7th Edition by Moore, McCabe and Craig.

  randtest (version 2024.04.17)
  Author: Andrew Charles Penn
  https://www.researchgate.net/profile/Andrew_Penn/

  Copyright 2019 Andrew Charles Penn
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see http://www.gnu.org/licenses/

Demonstration 1

The following code


 % Randomization or permutation test for linear regression (without intercept)
 % cd4 data in DiCiccio and Efron (1996) Statistical Science
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Randomization test to assess the statistical significance of the
 % regression coefficient being different from 0. (Model: y ~ x or y = 0 + x,
 % i.e. linear regression through the origin)
 [pval, stat]  = randtest (X, Y, 5000) % Default value of FUNC is @mldivide

Produces the following output

pval = 0.0002
stat = 1.2334

Demonstration 2

The following code


 % Randomization or permutation test for linear regression (with intercept)
 % cd4 data in DiCiccio and Efron (1996) Statistical Science
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';
 N = numel (Y);

 % Randomization test to assess the statistical significance of the
 % regression coefficients (intercept and slope) being different from 0.
 % (Model: y ~ 1 + x, i.e. linear regression with intercept)
 X1 = cat (2, ones (N, 1), X);
 [pval, stat]  = randtest (X1, Y, 5000) % Default value of FUNC is @mldivide

Produces the following output

pval =

      0.53905
   0.00035035

stat =

       69.038
       1.0349

Demonstration 3

The following code


 % Randomization or permutation test for the correlation coefficient
 % cd4 data in DiCiccio and Efron (1996) Statistical Science
 X = [212 435 339 251 404 510 377 335 410 335 ...
      415 356 339 188 256 296 249 303 266 300]';
 Y = [247 461 526 302 636 593 393 409 488 381 ...
      474 329 555 282 423 323 256 431 437 240]';

 % Randomization test to assess the statistical significance of the
 % correlation coefficient being different from 0. This is equivalent to 
 % the slope regression coefficient for a linear regression (with intercept)
 % of standardized x and y values.
 [pval, stat] = randtest (X, Y, 5000, @cor)

Produces the following output

pval = 0.0002
stat = 0.72317

Package: statistics-resampling