1 Estimating the Error of the Error

1.1 Recap of the Bootstrap

The bootstrap is very general because, if used properly, it requires only minimal assumptions about the random sample. In particular, it is applicable for normal and non-normal distributions. It works for primary and derived quantities, and automatically takes correlation into account. The bootstrap can be applied in an automated way. However, this generality comes at the price of being computationally demanding, though with today's computing power this is no longer a serious issue. For a very good introduction to the bootstrap see Efron and Tibshirani (1994).

It is based on sampling from the empirical distribution function \(\hat f\) instead of the true distribution function \(f\). Let \(\hat\theta = t(x_1, ..., x_N)\) be an estimator of some quantity to be determined, including its uncertainty. We generate \(R\) bootstrap samples \(x^{\star1}, ..., x^{\star R}\) from \(\hat f\), compute \(R\) bootstrap estimates \(\hat\theta^{\star 1}, ..., \hat\theta^{\star R}\) and estimate the standard error from the standard deviation over these \(R\) bootstrap estimates. In single steps this amounts to

  1. draw \(R\) random samples from \(\hat f\) by randomly sampling from \(x_1, ..., x_N\) with replacement \[\begin{equation} \hat f\quad\to\quad (x_1^{\star b}, ..., x_N^{\star b}) = (x_{\sigma(1)}, ..., x_{\sigma(N)})\,. \end{equation}\] Here \(\sigma(i)\), \(i=1,...,N\), are random indices drawn independently from the discrete uniform distribution \(L_N\) on \(\{1,...,N\}\); since the drawing is with replacement, \(\sigma\) is in general not a permutation. For instance, this could be implemented by setting, for all \(b=1,...,R\) and all \(i=1,...,N\), \[\begin{equation} \begin{split} u\ \sim\ \mathcal{U}_{\left[0,1\right]}\,,\qquad k = \lfloor u \cdot N\rfloor +1\,,\\ \Rightarrow x_{i}^{\star b}\ =\ x_{k}\,,\\ \end{split} \end{equation}\] where \(\mathcal{U}_{\left[0,1\right]}\) is the continuous uniform distribution over the interval \([0,1]\). Each \(x^{\star b}\) for \(b=1,...,R\) we call a bootstrap sample.

  2. evaluate \[ \hat \theta^{\star b} = t(x^{\star b})\,,\qquad b=1, ..., R\,. \] Each \(\hat\theta^{\star b}\) we call a bootstrap replication of \(\hat\theta\).

  3. the estimate of the standard error is then given by the standard deviation over the bootstrap replications \[\begin{equation} \label{eq:bootse} \Delta^\star_R\ =\ \mathrm{sd}_R(\hat\theta^{\star b})\ =\ \sqrt{\frac{\sum_{b=1}^R(\hat\theta^{\star b} - \bar\theta^{\star})^2}{R-1}}\,, \end{equation}\] with the bootstrap mean \[\begin{equation} \bar\theta^{\star}\ =\ \frac{1}{R}\ \sum_{b=1}^R\ \hat\theta^{\star b}\,. \end{equation}\]
  4. the bootstrap bias is given by \[ \mathrm{bias}^\star_\theta\ =\ \bar\theta^\star - \hat\theta\,, \] i.e. the deviation of the bootstrap mean from the estimate on the original sample, which should fulfil \(\mathrm{bias}^\star_\theta\ll \Delta^\star_R\) in practice. A code sketch of these four steps is given below.
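The notes do not prescribe a particular implementation, so the following is a minimal sketch of steps 1 to 4 in Python with NumPy; the function name `bootstrap_error` and the use of `numpy.random.default_rng` are choices made for this sketch, not part of the original text. The statistic \(t\) is passed in as a function acting on a sample.

```python
import numpy as np

def bootstrap_error(x, t, R=1000, rng=None):
    """Bootstrap standard error and bias of the statistic t on the sample x:
    resample with replacement, replicate t, take the standard deviation over
    the replications and compare the bootstrap mean to t(x)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    N = len(x)
    # steps 1 and 2: R bootstrap samples and their replications t(x*b)
    theta_star = np.array([t(x[rng.integers(0, N, size=N)]) for _ in range(R)])
    # step 3: standard error = standard deviation over the replications
    delta_star = theta_star.std(ddof=1)
    # step 4: bootstrap bias = bootstrap mean minus the original estimate
    bias = theta_star.mean() - t(x)
    return delta_star, bias
```

As a sanity check, for the mean of \(N=500\) standard normal variates, `bootstrap_error(x, np.mean)` should return an error close to \(1/\sqrt{N}\approx 0.045\) and a bias compatible with zero.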

So far we have implicitly assumed that all errors are normally distributed. However, this need not be the case. If the errors are not normally distributed, or the distribution is not known, the standard deviation over the bootstrap samples need only be replaced by the \(68\%\) confidence interval of the empirical distribution of \(\hat\theta\), i.e. \(\hat f_{\hat\theta}\). The \(\alpha\)–quantile of \(\hat f_{\hat\theta}\) can be estimated from \[\begin{equation} x_\alpha\ =\ \sup_{x_1, ..., x_N}\{x_i: \hat F_{\hat\theta}(x_i) < \alpha\}\,, \end{equation}\] with \(\hat F_{\hat\theta}\) the corresponding empirical cumulative distribution function. An alternative would be to interpolate between the supremum of the set \(\{x_i: \hat F(x_i) < \alpha\}\) and the infimum of the set \(\{x_i: \hat F(x_i) > \alpha\}\) over \(x_1, ..., x_N\). In particular for small \(N\) the estimate of \(x_\alpha\) is not very precise. Of course, \(x_\alpha\) as a function of \(\alpha\) is discontinuous by definition.
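As an illustration, here is a small sketch of the sup-based quantile estimator and the resulting central \(68\%\) interval applied to the bootstrap replications; the function names and the convention \(\hat F(x_{(k)}) = k/N\) for the empirical CDF are choices made for this sketch.

```python
import numpy as np

def empirical_quantile(sample, alpha):
    """alpha-quantile following the sup-definition above: the largest
    sample value whose empirical CDF value is still below alpha."""
    xs = np.sort(np.asarray(sample))
    F = np.arange(1, len(xs) + 1) / len(xs)  # empirical CDF at sorted points
    below = xs[F < alpha]
    return below[-1] if below.size else xs[0]

def central_68_interval(theta_star):
    """Central 68% confidence interval of the bootstrap replications."""
    return (empirical_quantile(theta_star, 0.16),
            empirical_quantile(theta_star, 0.84))
```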

Having introduced this, it is important to always inspect the distribution of \(\hat\theta^\star\). Whether the distribution is normal is most conveniently checked by producing a so-called QQ-plot: the empirical quantiles of \(\hat\theta^\star\), standardised by the bootstrap mean and error, are plotted versus the theoretical quantiles of the standard normal distribution \(\mathcal{N}_{0,1}\). If the distribution is indeed normal, the resulting points lie close to the bisecting line. Depending on \(N\), the tails of the empirical distribution are often not sampled very well, but around the mean the agreement with the theoretical quantiles should be good. See section~ for an example.
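Such a QQ-plot can be produced, for instance, with `scipy.stats.probplot`; the following sketch assumes SciPy and Matplotlib are available, and the function name `qq_plot` is a choice made here.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

def qq_plot(theta_star):
    """QQ-plot of the standardised bootstrap replications against N(0,1)."""
    theta_star = np.asarray(theta_star)
    z = (theta_star - theta_star.mean()) / theta_star.std(ddof=1)
    stats.probplot(z, dist="norm", plot=plt)  # empirical vs. theoretical quantiles
    plt.plot([-3, 3], [-3, 3], "k--")         # bisecting line for reference
    plt.show()
```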

We have discussed so far how to estimate the statistical uncertainty of some statistic \(t(x_1, ..., x_N)\). Since this statistical error is itself estimated from the data, it is useful to also estimate the error of the error. Given the bootstrap method, there is a natural way to do so, namely by performing a bootstrap analysis on the bootstrap samples themselves. This is called the double bootstrap. However, the double bootstrap can be rather computer time intensive, because it involves a full bootstrap analysis on each of the \(R\) bootstrap samples, as we will see next.

1.2 Double Bootstrap

In order to estimate the error of the bootstrap error one needs to estimate the standard deviation on each bootstrap sample separately. This can be achieved by a double bootstrap procedure: for every bootstrap sample \(x^{\star b}\), \(b=1,\dotsc,R\) one generates \(R^\prime\) bootstrap samples \(x^{\star\star bb^\prime}\), \(b^\prime=1,\dotsc,R^\prime\) by uniformly sampling from \(x^{\star b}\) with replacement.

For each \(b=1,\dotsc,R\) one can now estimate \(\Delta^{\star\star b}\) from the inner replications \(\hat\theta^{\star\star bb^\prime} = t(x^{\star\star bb^\prime})\) as \[ \Delta^{\star\star b}_{R^\prime}\ = \ \mathrm{sd}_{R^\prime}(\hat\theta^{\star\star bb^\prime})\,. \] The double bootstrap estimate of the standard error on \(\Delta^\star\) is then given as \[\begin{equation} \Delta^{\star\star}_{RR^\prime}(\Delta^\star)\ =\ \mathrm{sd}_R(\Delta^{\star\star b}_{R^\prime})\ =\ \sqrt{\frac{\sum_{b=1}^R(\Delta_{R^\prime}^{\star\star b}-\bar\Delta_{R^\prime}^{\star\star})^2}{R-1}}\,, \end{equation}\] with \(\bar\Delta_{R^\prime}^{\star\star}\) the mean of the \(\Delta^{\star\star b}_{R^\prime}\) over \(b\). This procedure generalises directly to functions other than the standard deviation. One obvious example is the covariance matrix needed for \(\chi^2\)–fits as discussed previously: it can be computed using the double bootstrap procedure, and the frozen covariance matrix can be replaced by the covariance matrix computed on each bootstrap sample.
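Under the same assumptions as the sketches above (NumPy, a statistic \(t\) passed as a function; the name `double_bootstrap` is ours), a minimal implementation of the double bootstrap could look as follows.

```python
import numpy as np

def double_bootstrap(x, t, R=500, Rp=500, rng=None):
    """Error of the bootstrap error: an inner bootstrap of size Rp is run on
    each of the R outer bootstrap samples, and the spread of the inner error
    estimates Delta**b gives the uncertainty of Delta*."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    N = len(x)
    delta_bb = np.empty(R)
    for b in range(R):
        xb = x[rng.integers(0, N, size=N)]  # outer bootstrap sample x*b
        theta_bb = np.array([t(xb[rng.integers(0, N, size=N)])
                             for _ in range(Rp)])  # inner replications
        delta_bb[b] = theta_bb.std(ddof=1)  # Delta**b
    # mean inner error estimate and its spread, i.e. the error of the error
    return delta_bb.mean(), delta_bb.std(ddof=1)
```

With the mean as the statistic, `double_bootstrap(x, np.mean)` is essentially what exercise 1 below asks for; note the \(R\cdot R^\prime\) resampling cost discussed next.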

Clearly, the computing resources required for the double bootstrap scale like \(R\cdot R^\prime\), which is the major downside of this method. This is even more true when a triple bootstrap is attempted. However, it is certainly the most direct way to estimate statistical uncertainties of error estimates.

2 Exercise

  1. Implement the double bootstrap procedure for the standard estimator of the mean
  2. Study the error of the error of the mean on the plaquette history data set assuming no autocorrelation

References

Efron, B., and R. J. Tibshirani. 1994. An Introduction to the Bootstrap. Chapman & Hall/CRC.