Statistics: jackknife

Function Reference: `jackknife`

statistics: jackstat = jackknife (E, x)
statistics: jackstat = jackknife (E, x, …)

Compute jackknife estimates of a parameter from one or more given samples.

In particular, E is the estimator to be jackknifed as a function name, handle, or inline function, and x is the sample for which the estimate is to be taken. The i-th entry of jackstat will contain the value of the estimator on the sample x with its i-th row omitted.

 jackstat (i) = E(x([1 : i - 1, i + 1 : rows(x)], :))
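The leave-one-out rule above can be sketched by hand. This is a minimal illustration in base Octave, not the package's implementation; the sample values and the choice of `mean` as the estimator are made up for the example.

```octave
x = [2; 4; 4; 5; 7];            # illustrative sample (made-up values)
n = rows (x);
jackstat = zeros (n, 1);
for i = 1:n
  keep = [1:i-1, i+1:n];        # every row index except i
  jackstat(i) = mean (x(keep, :));   # estimator on the reduced sample
endfor
# jackstat now holds 5, 4.5, 4.5, 4.25 and 3.75
disp (jackstat')
```

Each entry is the estimator evaluated on the four remaining observations, which is what the i-th entry of jackstat is described as containing.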

Depending on the number of samples to be used, the estimator must have the appropriate form:

If only one sample is used, the estimator need not be concerned with cell arrays. For example, jackknifing the standard deviation of a sample can be performed with jackstat = jackknife (@std, rand (100, 1)).
If more than one sample is used, the samples must all be of equal size, and the estimator must address them as elements of a cell array, in which they are aggregated in their order of appearance:

 jackstat = jackknife (@(x) std(x{1})/var(x{2}),
 rand (100, 1), randn (100, 1))
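To make the cell-array convention concrete, the following sketch mimics the described behavior by hand: for each i, row i is dropped from every sample and the reduced samples are handed to the estimator as one cell array. It is an assumption-based illustration in base Octave, not the package's code; the names `samples` and `loo` and the data are hypothetical.

```octave
E = @(s) mean (s{1}) / mean (s{2});   # estimator indexing its samples as cells
a = [1; 2; 3; 6];                     # made-up first sample
b = [2; 2; 4; 4];                     # made-up second sample, same size
samples = {a, b};                     # aggregated in order of appearance
n = rows (a);
jackstat = zeros (n, 1);
for i = 1:n
  keep = [1:i-1, i+1:n];              # drop row i from every sample
  loo = cellfun (@(p) p(keep, :), samples, "UniformOutput", false);
  jackstat(i) = E (loo);
endfor
# jackstat is approximately 1.1, 1, 1.125 and 0.75
disp (jackstat')
```

Because the same row is removed from every sample, the samples have to share one size, which matches the equal-size requirement stated above.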

Suppose a theoretical value P for the parameter is already known and n is the sample size. Define the bias-corrected estimate

t = n * E(x) - (n - 1) * mean(jackstat)

and its variance estimate

v = sumsq(n * E(x) - (n - 1) * jackstat - t) / (n * (n - 1))

If all goes well, (t-P)/sqrt(v) should then follow a t-distribution with n-1 degrees of freedom.
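As a worked check of the formulas for t and v, the sketch below evaluates them for the sample mean, where the result is known in closed form: t equals the sample mean and v equals var (x) / n. The data, the value of P, and the variable name `pseudo` are assumptions for illustration; jackstat is computed by hand here, although for a single sample it could presumably also come from jackknife (@mean, x).

```octave
x = [2; 4; 4; 5; 7];                 # made-up sample
n = rows (x);
E = @mean;                           # estimator under study
jackstat = zeros (n, 1);
for i = 1:n
  jackstat(i) = E (x([1:i-1, i+1:n], :));   # leave-one-out estimates
endfor
pseudo = n * E (x) - (n - 1) * jackstat;    # pseudo-values
t = mean (pseudo);                   # same as n*E(x) - (n-1)*mean(jackstat)
v = sumsq (pseudo - t) / (n * (n - 1));
P = 4;                               # hypothetical theoretical value
stat = (t - P) / sqrt (v);           # compare against t-distribution, n-1 dof
printf ("t = %g, v = %g\n", t, v);   # t is about 4.4, v about 0.66
```

For the mean, the pseudo-values reduce to the observations themselves, so the jackknife changes nothing; for biased estimators such as std, t and v differ from the plain estimate.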

Jackknifing is a well-known method to reduce bias. Further details can be found in:

References

Rupert G. Miller. The jackknife - a review. Biometrika (1974), 61(1):1-15. doi:10.1093/biomet/61.1.1
Rupert G. Miller. Jackknifing variances. Ann. Math. Statist. (1968), 39(2):567-582. doi:10.1214/aoms/1177698418


Example: 1


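 for k = 1:1000
   rand ("seed", k);  # for reproducibility
   x = rand (10, 1);
   s(k) = std (x);                             # plain estimate
   jackstat = jackknife (@std, x);
   j(k) = 10 * std (x) - 9 * mean (jackstat);  # jackknifed estimate, n = 10
 endfor
 figure();
 # compare both estimators; the true value for uniform samples is sqrt (1/12)
 hist ([s', j'], 0:sqrt(1/12)/10:2*sqrt(1/12))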

Example: 2


 for k = 1:1000
   randn ("seed", k); # for reproducibility
   x = randn (1, 50);
   rand ("seed", k);  # for reproducibility
   y = rand (1, 50);
   jackstat = jackknife (@(x) std(x{1})/std(x{2}), y, x);
   j(k) = 50 * std (y) / std (x) - 49 * mean (jackstat);
   v(k) = sumsq ((50 * std (y) / std (x) - 49 * jackstat) - j(k)) / (50 * 49);
 endfor
 t = (j - sqrt (1 / 12)) ./ sqrt (v);  # P = sqrt (1/12): std of U(0,1) over std of N(0,1)
 figure();
 plot (sort (tcdf (t, 49)), ...
       "-;Almost linear mapping indicates good fit with t-distribution.;")
