How to Perform Bootstrapping in Excel
Bootstrapping is a statistical technique used to estimate the accuracy of a sample by resampling the data with replacement. In Excel, you can perform bootstrapping using a combination of formulas and random sampling. Here's a step-by-step guide on how to perform bootstrapping in Excel:
- Organize your data: Make sure your data is organized in a single column. For example, let's assume you have a data set in column A, from A2 to A101 (100 data points).
- Calculate the statistic of interest: Calculate the statistic you want to estimate using bootstrapping, such as the mean, median, or standard deviation, for the original data set. For example, let's calculate the mean in cell B2 using the formula
=AVERAGE(A2:A101)
.
Example 1: Bootstrapping Mean
- Generate a random sample with replacement: In column C, create a random sample with replacement from the original data. In cell C2, enter the formula
=INDEX($A$2:$A$101, RANDBETWEEN(1, 100))
. Copy this formula down to C101. - Calculate the statistic for the random sample: Calculate the same statistic for the random sample that you calculated for the original data set. In our example, we'll calculate the mean again, so in cell D2, enter the formula
=AVERAGE(C2:C101)
. - Repeat steps 3 and 4 for multiple iterations: To get a more accurate estimate, you need to repeat the resampling process multiple times. For example, let's perform 1,000 iterations. Copy the formulas in columns C and D to columns E through IV. Each pair of columns represents one iteration.
- Calculate the average and standard deviation of bootstrapped statistics: In cells IW2 and IW3, calculate the average and standard deviation of the 1,000 bootstrapped means using the formulas
=AVERAGE(D2:IV2)
and=STDEV.S(D2:IV2)
. - Interpret the results: Compare the mean and standard deviation of the bootstrapped statistics to the original data set's statistic. This will give you an idea of the accuracy and variability of your original data set's statistic.
In our example, the bootstrapped mean and standard deviation give us an estimate of the accuracy and variability of the mean calculated from the original data set. You can perform similar bootstrapping for other statistics like median or standard deviation by replacing the AVERAGE function with the respective statistical function in step 2 and step 4.
Note: Since the RANDBETWEEN function generates random numbers, the results may change every time you recalculate the spreadsheet. To prevent this, you can copy the entire bootstrapped data (columns C through IV), and use "Paste Values" to replace the formulas with fixed numbers.