IN THE CLASSROOM
RICK HESSE, Feature Editor, Mercer University
Random Samples and Confidence Intervals
by Rick Hesse, Mercer University

A quick exercise for a statistics lab, homework assignment, or even a class demonstration is to draw random samples from a finite population to illustrate confidence intervals. Enter the data in one column of a spreadsheet and, in the column to its left (for convenience's sake), the Lotus @RAND function (RAND() in Excel). The two columns can then be sorted on the random number, which randomly reorders the data set. Taking the top few rows gives a random sample, drawn without replacement, from a finite population with a known mean (μ) and variance (σ²), and the sample average can be computed. That average can be compared to various confidence intervals to see whether it falls within them. Recalculating the spreadsheet generates new random numbers, and sorting the data field again produces a new sample of size n, and so on.

In its simplest form, the spreadsheet is shown in Figure 1, with rows 12-27 hidden. The original data are in E2..E29 and consist of the prices of 28 brands of compact disk players, from Berenson and Levine (Mark L. Berenson and David M. Levine, Basic Business Statistics: Concepts and Applications, Prentice Hall, Englewood Cliffs, NJ, 1992, pp. 149-150).

Each student can be instructed to "draw" 50 or 100 random samples of size 4 (or 5, 6, etc.) and count the number of times the sample average is inside the confidence interval.

Several additions could be made to the spreadsheet. First, 90% and 99% confidence intervals could be added, and a tally kept of how many times the sample average fell within each interval. Second, one way to allow variable sample sizes is to add two columns next to the data, in columns F and G. Column F simply numbers the rows of the reordered data (columns F and G are not sorted), and the formulas for cell C5 and column G are shown in Figure 2. Column G contains the n sample values, with all the others being 0.
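For readers who want to check the procedure outside a spreadsheet, the sort-on-a-random-number step and the column F and G device can be sketched in Python. This is only an illustration under stated assumptions, not the formulas from Figure 2 (which are not reproduced here): it assumes column G holds the data value when the row number in column F is at most n, and 0 otherwise. The names `draw_sample`, `population`, and `column_g` are made up for the sketch, and the `population` values are placeholders, not the Berenson and Levine prices.

```python
import random


def draw_sample(population, n, rng=random):
    """Reorder the data by a random key and average the top n values."""
    # Column D analogue: one random number per data value (@RAND / RAND()).
    keyed = [(rng.random(), value) for value in population]
    # Sort the (random number, value) pairs on the random number only.
    keyed.sort(key=lambda pair: pair[0])
    reordered = [value for _, value in keyed]
    # Column F analogue: row numbers 1..N.
    # Column G analogue (assumed): keep the value if row <= n, else 0.
    column_g = [value if row <= n else 0
                for row, value in enumerate(reordered, start=1)]
    # Sum column G and divide by the sample size to get the average.
    sample_mean = sum(column_g) / n
    return reordered[:n], sample_mean


if __name__ == "__main__":
    # Placeholder data: 28 made-up values, NOT the published prices.
    population = [100 + 15 * i for i in range(28)]
    sample, mean = draw_sample(population, 4)
    print("sample:", sample)
    print("sample average:", mean)
```

Because the zeros in the masked column add nothing to the sum, changing `n` changes the sample size without touching the rest of the calculation, which is the point of the two extra columns.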
By using the @SUM function on column G and dividing by the number in the sample, the average is computed in a way that still allows cell C5 to be changed.

A third addition might be a macro to sort columns D and E, then recalculate the sample average and determine whether it is within the confidence interval. You may have noticed when trying this that the sort itself triggers a recalculation, so the random numbers in column D change and are no longer in ascending order. Incidentally, the sorting order can be either ascending or descending; it really shouldn't make a difference.

The idea behind this simple spreadsheet exercise is to give students a hands-on feel for drawing random samples and understanding confidence intervals. The larger the finite data set, the longer it will take to calculate (and to enter the original data), but then larger samples can also be taken. Some spreadsheets have functions that will determine the critical values of z given α.
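The counting exercise, including the 90% and 99% intervals, can also be simulated in a few lines of Python. The sketch below rests on assumptions the column does not spell out: the interval is taken to be centered on the known population mean μ with half-width z·σ/√n, and a finite population correction √((N−n)/(N−1)) is offered only as an optional switch. The function names `z_critical` and `coverage`, the `use_fpc` flag, and the placeholder `population` values are all hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def z_critical(confidence):
    """Two-sided critical value of z for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def coverage(population, n, confidence, trials, rng, use_fpc=False):
    """Fraction of sample averages that fall inside the interval."""
    size = len(population)
    mu = fmean(population)       # known population mean
    sigma = pstdev(population)   # known population standard deviation
    std_err = sigma / math.sqrt(n)
    if use_fpc:
        # Optional finite population correction (an assumption here).
        std_err *= math.sqrt((size - n) / (size - 1))
    half_width = z_critical(confidence) * std_err
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, n)  # drawn without replacement
        if abs(fmean(sample) - mu) <= half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Placeholder data: 28 made-up values, NOT the published prices.
    population = [100 + 15 * i for i in range(28)]
    for level in (0.90, 0.95, 0.99):
        share = coverage(population, 4, level, 100, random.Random())
        print(f"{level:.0%} interval: {share:.0%} of 100 sample averages inside")
```

Without the correction the interval is somewhat wider than sampling without replacement requires, so the counts can be expected to run a little above the nominal level; setting `use_fpc=True` shows the difference.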