|
IN THE CLASSROOM
Rick Hesse, Feature
Editor, Industrial and Systems Engineering Department,
Box and Whiskers Plotsby Rick Hesse, Feature Editor
It has been some time since I
have taught undergraduate statistics, and my first challenge was to
see if the new spreadsheets have added a box and whiskers plot to
its repertoire. Unfortunately, it has not, even in Excel 97, which
I got to beta-test this summer. I would have thought that this
alone was worth more than a smiling paper clip (named Bob) but
Microsoft felt otherwise. The closest Excel comes is a new graph
called Stocks, which uses four data points, not the five needed for
the box and whiskers plot. As a result I decided that for this
quarter I would make up a simple template so that my students could
"draw" a box and whiskers chart, which is shown in Figure 1 for
data that was generated using random numbers that were N(0,1) with
a mean of 45.00 and standard deviation of 10.00.
The data for the x-axis is determined from the data and includes
the minimum, maximums, and the three quartiles. The spreadsheet
functions are MIN, MAX, and QUARTILE, shown in Figure 1 next to
each first occurrence. The data for the y-axis is simply 0.5 for
the bottom line, 1.5 for the middle lines (whiskers) and 2.5 for
the top line. The simple commands in Excel that are used are shown
on the spreadsheet in Figure 1.
The data for this chart is found in cells A3:J11 and is named
"block." The data points for the chart are determined in D14:E30,
with blank rows 16, 22, 25, and 28. These blank rows serve as
separators in an XY plot and mark the beginning and ending of
lines, which in this case are indicated with small circles. The XY
plot of D14:E30 is shown in Figure 2.
Since XY plots show the y-axis by default, you must double-click on
the graph and click on the y-axis and ask to not show it. If you
have more than 90 data points, or your data is in a differently
shaped rectangle, simple bring up Insert Name Define from the menu
and change the location of the data range called "block."
The box and whiskers chart in Figure 2 shows the whiskers as the
distance between the minimum and the first quartile (Q1) and
between the third quartile (Q3) and the maximum value. For this
normal data, the whiskers are approximately equal and longer than
the two boxes. The two boxes are also approximately equal, and the
median (46.25) is extremely close to the average (46.15).
Contrast this with 90 data points randomly generated with the
exponential distribution, with the same average of 45.00. The
spreadsheet formula for this is -LN(1-RAND())*45 (see Figure 3).
Each quartile grows successively larger, the median (32.99) is
nowhere near the mean (46.15) and the whiskers are not even close
to equal length. This plot exhibits classic exponential
characteristics.
Although not an elegant template, this one can be very handy for
those of us trying to keep from using a plethora of PC programs and
keep life simple on a spreadsheet.
|