= mean downtime for digital tachometers
The standard pooled t-test may be performed in Lotus 4+ using the
@function:
@TTEST(DATA1,DATA2,0,1)
where @TTEST performs a t-test on DATA1, the observations on the
analog tachs, and DATA2, the observations on the digital tachs, 0
is the type of test (pooled t), and 1 is the number of tails in the
test. This function returns a p-value, in this case 3.154%, so
that for an alpha of 5% we would reject the null hypothesis and
conclude that the digital tachs produced a lower mean downtime.
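For readers without Lotus, the statistic behind the pooled test can be sketched in a few lines of Python. This is only an illustration: the function name pooled_t and the two short data lists are hypothetical and are not the tachometer data. The sketch stops at the t statistic and its degrees of freedom; the one-tailed p-value is the upper-tail area of the t distribution with those degrees of freedom, which a statistics library would supply.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, df) for the equal-variance (pooled) two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se, df


# Hypothetical downtimes for illustration only, NOT the article's data.
analog = [5.0, 7.0, 6.0, 8.0]
digital = [4.0, 5.0, 6.0, 5.0]

t, df = pooled_t(analog, digital)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A large positive t favors the conclusion that the first sample (here, the analog tachs) has the higher mean downtime.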
Classroom "Hands-on" Resampling Analysis
To begin the analysis, have the students think about the
implications of believing that the null hypothesis is true, that
is, that the average downtime using the analog tachs is equal to
(or less than) the average downtime using the digital tachs.
Hopefully, they will quickly pick up on the fact that it implies
that the individual values could have been observed from either
population. Furthermore, the class can be led to understand that
the population of observations is infinite and thus resampling
should be with replacement. This leads to a bootstrap-type
resampling test, which can be carried out using the procedure given in
Table 1 and which builds understanding through seeing and reasoning
together. Plus, the involvement keeps the students interested!
Using the Computer and Lotus to Replicate
At least one software package has already been developed for
resampling. It is Resampling Stats, developed by Resampling Stats
Software (Rosa-Hatko, 1995). Most students, however, are already
familiar with spreadsheets, so a spreadsheet-based approach spares
them from learning a new software package. Since a spreadsheet
approach also offers certain other advantages, we have developed a
series of macros in Lotus 4+ to resample for several of the more
common applications of hypothesis testing and confidence intervals.
The data for our example is entered in the input worksheet shown in
Figure 1. The number of resamples desired is set at 1,000, and H0,
the value for the null hypothesis, is set at zero. We have also
selected U for the upper-tailed option and 95% for the confidence
level.
After completing the input, the user clicks the RESAMPLE macro
button. The macro will first automatically
clear the results of any previous resampling. Then it will combine
the observations from the two samples into one set of 45
observations, and from this will construct a probability
distribution. The Lotus @VLOOKUP function is then used to randomly
select two samples from this distribution. The macro computes the
mean of each sample, subtracts the second from the first, and
stores the differences, in our case, for 1,000 pairs of samples.
This collection of differences is converted into a frequency
distribution using the Range-Analyze-Distribution command and then
graphed as the distribution of sample differences.
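The resampling step described above can be sketched outside the spreadsheet as follows. This is a minimal Python illustration under stated assumptions, not a translation of the Lotus macro: the function name resample_differences, the seed, and the short data lists are all hypothetical, and random.choices stands in for the @VLOOKUP-based random selection.

```python
import random
from statistics import mean


def resample_differences(sample1, sample2, n_resamples=1000, seed=None):
    """Draw pairs of samples, with replacement, from the combined data
    and return the difference in means for each pair."""
    rng = random.Random(seed)
    # Under the null hypothesis every value could have come from either
    # population, so the two samples are combined into one pool.
    pooled = list(sample1) + list(sample2)
    diffs = []
    for _ in range(n_resamples):
        new1 = rng.choices(pooled, k=len(sample1))  # sampling with replacement
        new2 = rng.choices(pooled, k=len(sample2))
        diffs.append(mean(new1) - mean(new2))       # first minus second
    return diffs


# Hypothetical downtimes for illustration only, NOT the article's data.
analog = [5.0, 7.0, 6.0, 8.0]
digital = [4.0, 5.0, 6.0, 5.0]

diffs = resample_differences(analog, digital, n_resamples=1000, seed=1)
print(len(diffs), min(diffs), max(diffs))
```

Each resampled sample keeps the size of the original it replaces, so the spread of the stored differences reflects the original sample sizes.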
To obtain p-values, the @PRANK function is used to find the
percentile of the difference between the two original sample means
in the resampled differences. The @PRANK function is also used to
compute the lower and upper limits of the confidence interval.
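As a rough illustration of reading a p-value and percentile cutoffs from a collection of resampled differences, consider the Python sketch below. It is not a reimplementation of @PRANK, whose exact interpolation rules are not described here; the function names, the stand-in list of differences, and the observed value are hypothetical. How the macro positions the confidence limits relative to the observed difference is also not specified in the text, so the sketch only extracts the 2.5th and 97.5th percentiles of whatever values it is given.

```python
from statistics import quantiles


def upper_tail_p_value(diffs, observed):
    """Fraction of resampled differences at least as large as the observed one."""
    return sum(d >= observed for d in diffs) / len(diffs)


def percentile_limits_95(diffs):
    """Return the 2.5th and 97.5th percentiles of the values in diffs."""
    # n=40 gives cut points every 2.5%; the first and last are the 95% limits.
    cuts = quantiles(diffs, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Stand-in values for illustration only, NOT actual resampling output.
diffs = [d / 100 for d in range(-500, 501)]
observed = 4.0  # hypothetical observed difference in sample means

print("upper-tailed p-value:", upper_tail_p_value(diffs, observed))
print("2.5th and 97.5th percentiles:", percentile_limits_95(diffs))
```

For an upper-tailed test, a small fraction means the observed difference sits far out in the right tail of the resampled differences.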
All of the output will appear in a separate RESULTS worksheet, in
three sections. The first section to appear is shown in Figure 2,
which lists the p-values and confidence intervals from resampling.
As checks, the theoretical p-values and confidence intervals are
also shown. As can be seen, the resampling p-value of 3.505% agrees
fairly well with the theoretical p-value of 3.154%. Likewise, the
resampled confidence interval is close to the theoretical one.
To view the graph, the user should scroll down one screen and the
graph will appear as in Figure 3. The graph approximates the
normal distribution fairly well, which is not too surprising.
When finished, the user may run another set of resamples by
clicking the REPEAT macro button. This is faster than the RESAMPLE
macro because the probability distribution and graph have already
been set up and don't have to be rebuilt from scratch.
At present, we have developed resampling macros for hypothesis
testing and confidence intervals for one-mean, two-means and paired
difference situations. We plan to do one or two more special cases
such as these, and then to develop a more generalized macro in
which the user (who should be reasonably proficient in Lotus) can
write his or her own formulas and lookup functions to adapt to any
application encountered.
Conclusion
We think the macro package discussed here, or one like it, offers
a good way of introducing students to resampling. It is
spreadsheet-based, which means no time is spent learning a new
package; it is free (we would be happy to share it); and it allows
the student and instructor to see what is going on by looking at
the macros, the resamples themselves and the graphs. The
spreadsheet, in other words, is transparent. We hope it allows
interested persons to be introduced to this exciting new topic of
resampling.
References
Jones, R.W. (1991) "Digital Technology vs. Analog Technology."
Unpublished class project.
Peterson, I. (1991) "Pick a Sample." Science News, v 140, pp 56-58.
Ricketts, C. and Berry, J. (1994) "Teaching Statistics Through
Resampling." Teaching Statistics, v 16, no 2, pp 41-44.
Rosa-Hatko, W. (1995) "Resampling Stats." ORMS Today, v 22, no 2, pp
72-74.
Simon, J. L. (1993) Resampling: The New Statistics. Duxbury.
Simon, J. L. (1994) "The Resampling Method for Statistical Inference."
Basics chapter of Philosophy document of Internet manuscript.
TABLE 1: Bootstrap procedure for pooled t-test.

Step 1:  Put disks with the observed values into an urn (or bag or
         box), mix them up, and select one. Record the value as
         observation 1 for the first sample. Replace the disk and
         repeat the process until both samples have been selected.

Step 2:  Compute the sample means and their difference.

Step 3:  Replicate Steps 1 and 2 at least 10 times.

Step 4:  List the values of the difference in sample means obtained
         from the replications, and discuss why they vary and the
         concept of sampling error.

Step 5:  Compare the value of the difference in sample means for the
         actual data to the list obtained in Step 4. How does it
         compare?

Step 6:  Discuss the need to perform many more replications,
         produce a histogram of the observed differences, and
         compare the original sample difference to the values
         observed via resampling.

For copies of figures
mentioned in this article,
contact the Managing Editor
at hjacobs@gsu.edu.
RICHARD L. MORRIS is Professor of Quantitative Methods in the
School of Business Administration at Winthrop University. He
received a B.S. in mechanical engineering from West Virginia
University, an M.B.A. from the College of William and Mary, and a
Ph.D. in management science from Virginia Tech. His published work
has been in the areas of multiobjective decision analysis and
finance.
BARBARA A. PRICE is a Professor of Quantitative Methods in the
School of Business at Winthrop University. Her degrees are from
Grove City College (B.S. - mathematics) and Virginia Tech (M.S. and
Ph.D. - statistics). Price is a member of the Decision Sciences
Institute, INFORMS and IIF; regularly participates in meetings as
a presenter, reviewer, discussant and session chair; and has been
an officer in SEDSI since 1990.
Dr. Rick Hesse
Industrial and Systems Engineering Department
Mercer University
Macon, GA 31207