In regression analysis, bootstrapping is an efficient tool for statistical
inference. It builds an approximate sampling distribution of a statistic by
resampling the originally observed data with replacement [1]. The term
bootstrapping, introduced by Bradley Efron in his 1979 paper “Bootstrap methods:
another look at the jackknife”, is taken from the cliché of ‘pulling oneself up
by one’s bootstraps’ [2]. In keeping with that meaning, the sample data are
treated as a population, and samples are repeatedly drawn from them with
replacement in order to make statistical inferences from the original sample.
The essential bootstrap analogy states that “the population is to the sample as
the sample is to the bootstrap samples” [2].
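The resampling idea described above can be illustrated with a minimal sketch. It assumes NumPy is available; the function name `bootstrap_replicates`, the example data, and the default of 1000 resamples are illustrative assumptions and do not come from the cited sources.

```python
import numpy as np

def bootstrap_replicates(sample, statistic, n_resamples=1000, seed=0):
    """Recompute `statistic` on resamples drawn from `sample` with replacement."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample)
    k = sample.size
    replicates = np.empty(n_resamples)
    for b in range(n_resamples):
        # Treat the observed sample as the population:
        # draw k values from it with replacement.
        resample = rng.choice(sample, size=k, replace=True)
        replicates[b] = statistic(resample)
    return replicates

# Hypothetical data, invented purely for illustration.
observed = np.array([2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7])
reps = bootstrap_replicates(observed, np.mean, n_resamples=1000)
print(reps.shape)
```

The array `reps` holds one value of the statistic per bootstrap sample, so its spread gives an empirical picture of how the statistic would vary from sample to sample.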
The bootstrap falls into two types, parametric and nonparametric. Parametric
bootstrapping assumes that the original data set is drawn from a specific
distribution, e.g. a normal distribution [2]. The bootstrap samples are generally
drawn with the same size as the original data set. Nonparametric
bootstrapping is the approach described at the start of this summary: it
repeatedly and randomly draws bootstrap samples of a given size from the
original data. According to our regression analysis lecture, bootstrapping is
quite useful in non-linear regression and generalized linear models. For small
samples, the parametric bootstrap is highly preferred [2]. For large samples,
the nonparametric bootstrap is preferred. For
a more detailed illustration of nonparametric bootstrapping, suppose a sample
data set A = {x1, x2, …, xk} is randomly drawn from a population B = {X1, X2,
…, XK}, where K is much larger than k. The statistic T = t(A) is considered
an estimate of the corresponding population parameter P = t(B) [2]. Nonparametric
bootstrapping estimates the sampling distribution of a statistic empirically.
No assumptions about the form of the population are necessary. Next, a sample of size k
is drawn from the elements of A with replacement and is denoted A*1 = {x*11, x*12, …, x*1k}. In the resampling,
an asterisk (*) is added to distinguish resampled data from the original data.
Sampling with replacement is mandatory, since otherwise only the original
sample A would be reproduced, and the resampling is typically repeated one
thousand or ten thousand times, a number that keeps growing as computing power
increases [1]. The statistic T* is computed for each of these bootstrap samples,
and the mean of these bootstrap estimates is calculated to estimate the
expectation of the bootstrapped statistics. This mean minus T is
the estimate of T’s bias. The bootstrap variance estimate, that is, the variance of the T* values, estimates the sampling variance of T. Then bootstrap confidence
intervals can be calculated using either the bootstrap percentile interval approach
or the normal-theory interval approach. The bootstrap percentile method uses the
empirical quantiles of the bootstrap estimates as the confidence limits, which is
written as
T*(lower)