SOFTWARE REVIEW

JACK YURKIEWICZ, Feature Editor, Lubin School of Business, Pace University

I have seen JMP, a statistics program available only for the Apple Macintosh, several times on colleagues' computers, and it evoked a strong sense of Mac-envy. In this issue, Dr. Lalit Aggarwal of Drexel University, who has used it for some time, tells us about it.

JMP
by Lalit K. Aggarwal, Department of Quantitative Methods

JMP is a statistics and graphics analysis program first marketed by the SAS Institute, Inc. in 1989 for the Mac platform. The most recent version, 3.1, released in December 1994, runs on the Mac Plus and later machines that use a 68XXX processor, and on the Power Macintosh family of computers based on the PowerPC chip, which uses a RISC architecture. SAS released a Windows version for PCs in May 1995. The manufacturer recommends at least 4 MB of RAM and a hard disk drive to run the program.

Since many Macs with 68XXX processors are still in use even as more powerful Macs based on the PowerPC chip are now being sold, I evaluated JMP on both 68XXX and PowerPC processors. I installed the 68XXX version of JMP on a Mac SE running System 6.0.8 with 4 MB of RAM and about 120 MB available on the hard disk. I also tested the software on a PowerMac 7100/66 running System 7 with 32 MB of RAM and around 100 MB of available hard disk space. I tested each procedure in the Introductory Guide on these two systems and evaluated the data input, manipulation, and transformation capabilities of the software.

JMP is a menu-driven program. You will use the mouse extensively to work with menus, dialog boxes, and tools. SAS claims ease of learning and ease of use, along with strong statistical visualization capabilities, for this product. Compared with earlier releases, version 3.1 has an improved layout of dialog boxes, shows the type of each variable (nominal, ordinal, or continuous) in all variable lists and dialog boxes, and has improved functionality under the Tables and Analyze menus.

Installation and Learning

The software comes on four 800K disks. A demo and a tutorial are supplied on an HD disk. Three installation choices are available: 68XXX processor, PowerMac, and the Universal set.
I selected the appropriate versions for the SE and the PowerMac from a dialog box that opened after the installer was launched, and followed the rest of the instructions on the screen. Installation was easy on both the SE and the PowerMac. JMP can be installed on a network as well, but I did not evaluate this capability.

To learn JMP, one must be familiar with the Mac user interface. Still, even for an experienced Mac user, learning JMP on the fly, that is, by experimenting or playing with different menus and dialog boxes, may be difficult. Each menu, submenu, and dialog box carries a great deal of information. Some descriptors for the options on menus and in dialog boxes do not convey exactly what functionality is associated with them, or how a particular selection will affect a graph or a statistical analysis in conjunction with other options. However, once I gave up learning the software on the fly and used the Introductory Guide instead, things became much easier and more efficient. Perhaps the only occasional stumbling block was the minor disparity between some dialog boxes on the screen and in the guide. Yet, considering the extensive capabilities packed into JMP, learning the software with the help of the Introductory Guide was easy overall.

The online help and balloon help worked well, except for the online statistics guide, which did not work on either the SE or the PowerMac. I found balloon help, available on the PowerMac only, useful when starting to learn JMP. Once I became familiar with the software, I could turn the balloon help off, a nice feature.

Performance

I evaluated JMP for its data handling, graphics, and statistical analysis capabilities.

There are several ways to enter data. Each time JMP is launched, an untitled formatted spreadsheet opens. Data can be typed into cells. You can specify a variable's name, its type, and the role it plays in a model by highlighting the column heading area.
More detailed specifications for each variable can be entered in a dialog box that opens when you double-click the upper part of a column header. Data can also be entered into a JMP file from a text file or a SAS output file. I created two separate files, one in MacWrite Pro and the other in Excel 5.0. I saved each as a text file and was able to import the data (using the Import option under File) into a JMP data file without difficulty. Existing JMP data files can be opened by selecting Open from the File menu.

JMP has a useful Calculator function, which can be used to develop a formula to specify a model, extract a cross-tab table from a data table, or transform a variable. Data can also be created using the random functions in the Calculator. There are three choices: Uniform, Normal, and Shuffle. These functions worked easily and without any problems. You can also create JMP data files by saving certain outputs of a statistical analysis, such as descriptive statistics and residuals.

One of the most impressive features of JMP is the variety of ways in which new data tables can be created. You can transform variables by choosing from a number of options (available under the Tables menu), such as concatenating, appending, sorting, and transposing. These options worked well and give JMP unsurpassed data massaging capability with the click of a mouse. Figure 1 shows the Tables menu and its options. The program does not restrict the size of a data table; rather, the amount of available memory and hard disk space determines it.

Graphics capabilities include specific graphics options such as Bar/Pie charts, Overlay plots, Spinning Plots, and others. These are available from the Graph menu and also as part of the statistical tools. There are many powerful yet easy-to-use features for doing exploratory analysis. Data points can be marked with special symbols or colors to identify outliers or a group of observations.
The special identifiers show up in a JMP table and on all graphs constructed from that table. You can assign labels (names or describing characteristics) to observations. Highlighting an observation in a plot displays its label for as long as the mouse button is held down. This feature makes it easy to read a graph, identify groups within a heterogeneous population, and display characteristics of outliers. Figure 2 shows a scatterplot matrix with a particular observation labeled.

With the hand tool you can easily change the level of aggregation in a Bar chart. It works much like changing the scale on a plot. I found this tool quite useful for graphically examining observations for outliers, central tendency, mode, skewness, and kurtosis, and for finding subgroups. The hand tool is easy to use and worked without a problem on the SE and the PowerMac. Double-clicking on a plot opens a dialog box that lets you easily modify the scale and other characteristics of the graph. Graphs can be resized in a number of different ways, offering considerable flexibility in analyzing plots. The magnifying tool can enlarge an area on a scatter diagram to give you a detailed look at a crowded region of observations.

There are many other features and options, too numerous to go into here, that help in understanding the data and in preparing graphs. One especially desirable feature is that the various options can be toggled on or off, giving you control over the appearance of a graph and how much information it displays, so that you can customize the look and feel of a display. I found the mosaic and ternary plots visually confusing. A leverage plot, perhaps not so well known, is useful in evaluating the statistical significance of an effect once you learn to interpret it. Of all the graphical tools, the display from the Spinning tool is the most impressive to look at.
This tool creates a three-dimensional display that can then be set into continuous rotation with the hand tool. Figure 3 shows an example of this plot. Students, lay persons, and experts can fully appreciate the form of data in a three-dimensional space; the spinning graph makes a powerful presentation on an LCD projector.

Statistical tools are selected from the Analyze menu. There are seven choices: Distribution of Y, Fit Y by X, Fit Model, Nonlinear Fit, Correlation of Y's, Cluster, and Survival. Each of these choices offers several different analytical procedures. Altogether, there are at least two dozen statistical tests and procedures, ranging from univariate distribution analysis, means tests, bivariate fitting, and analysis of variance to multivariate techniques such as principal components and factor analysis. Once an analysis platform is selected and the data types and roles of the variables in a model are defined, JMP automatically selects a statistical model. For about a dozen different problems tested, the automatic mode of selecting an analytical procedure worked well. All but one of the statistical procedures tested worked flawlessly on both the SE and the PowerMac. Principal components analysis crashed on the SE with a memory ID error message, but I encountered no problems with this or any other procedure tested on the PowerMac.

The results of a statistical analysis appear in a report. Depending upon the selected options, a report typically shows one or more graphs and text data organized as tables, which can be opened or closed by toggling the reveal/conceal text report buttons. Figure 4 shows part of such a report. Some reports are extensive and may require both horizontal and vertical scrolling just to see the text report buttons. I found scrolling along two dimensions inconvenient and at times confusing. Printed reports, however, are well organized and easy to read.
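
For readers who want a feel for the Calculator's three random functions and two of the Tables menu operations described above, the following is a minimal Python sketch. It is not JMP's implementation: the function names are hypothetical, Uniform is assumed to mean draws on [0, 1), Normal is assumed to mean standard normal draws, and Shuffle is assumed to mean a random permutation of existing values.

```python
import random


def uniform_column(n, rng):
    """Return n draws from a uniform distribution on [0, 1) (assumed range)."""
    return [rng.random() for _ in range(n)]


def normal_column(n, rng, mean=0.0, sd=1.0):
    """Return n draws from a normal distribution (standard normal by default)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def shuffle_column(values, rng):
    """Return a randomly permuted copy of values; the input is left unchanged."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def sort_rows(rows, column):
    """Return the rows of a table sorted in ascending order on one column index."""
    return sorted(rows, key=lambda row: row[column])


def transpose(rows):
    """Swap the rows and columns of a rectangular table."""
    return [list(column) for column in zip(*rows)]


if __name__ == "__main__":
    rng = random.Random(1995)  # fixed seed so that a run can be repeated
    ids = shuffle_column(range(1, 6), rng)
    table = [list(row) for row in zip(ids, uniform_column(5, rng), normal_column(5, rng))]
    for row in sort_rows(table, 0):
        print(row)
    print(transpose(table))
```

Passing the random number generator in explicitly keeps each generated column reproducible, which is convenient when a simulated data table has to be rebuilt for a class demonstration.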
Documentation

JMP 3.1 comes with an Introductory Guide, a User's Guide, a Statistics and Graphics Guide, and a Changes and Enhancements booklet. There is also extensive online and balloon help on dialogs, statistical procedures, and analyses.

The Introductory Guide describes, step by step, how to enter, summarize, transform, and analyze data. The instructions are easy to follow and, in most cases, I was able to reproduce the results illustrated in the guide. A few dialog boxes shown in the guide, such as the "Fit Model" box, did not exactly match their screen images. I used this guide extensively to learn about JMP's menus, data entry, and statistical and graphical capabilities, and would recommend it for tutorial learning.

The User's Guide explains the menu and submenu items (FILE, EDIT, etc.) and dialog boxes in detail. There is extensive discussion of the Calculator. The last chapter describes different ways of saving, editing, pasting, and printing JMP reports and journals. Information in the User's Guide is similar to that found in the online help, except for the instructions on using the Calculator and managing reports. Both are good reference resources. However, for a quick look at a topic, the online help is definitely more accessible.

The Statistics and Graphics Guide, 580 pages long, describes the purpose of each graphical and statistical tool, how to launch an analysis, the functionality of commands, the meaning of individual items in the output (report), and how to interpret graphs and statistics. There is a long and fine list of references for further reading on statistical procedures. For anyone who wants to appreciate fully the power and extensive functionality of the tools in JMP, the Statistics and Graphics Guide is an invaluable reference source.

Discussion of the use of color is sparse in the three guides. Overall, the three guides are well written, amply illustrated, and easy to follow.

Price and Support

JMP 3.1 for the Mac retails for $695.
The academic price for both students and faculty is $229.35. Discounts are available for multiple copies, site licenses, and networked installations.

Conclusions

JMP 3.1 can be easily installed and run on both 68XXX and PowerPC Macintoshes. There was, however, a big difference in performance between the two machines. For example, running the "Compare All Pairs" procedure on a data set took more than three minutes on the SE and about a second on the PowerMac. This difference in performance may be due, in part, to the huge difference in memory installed on the two machines (4 MB of RAM on the SE vs. 32 MB on the PowerMac).

There is a good set of graphics and statistical tools for descriptive statistics, exploratory analysis, and advanced regression and analysis of variance modeling, although I missed having a goodness-of-fit procedure in the package. Labeling each variable by data type is useful, but I would have preferred interval and ratio scales in place of the continuous scale.

The strengths of JMP 3.1 are simplicity of use and data visualization. Two notable features are the rotation plots and the complete connection between the observations in a data table and the observations in plots. The ability to mark observations with special symbols and color is particularly useful in exploratory analysis and for presentations.

Of course, there are other statistical packages with different mixes of graphical and statistical tools, some perhaps offering more of certain types of tools and less of others. However, JMP offers great graphics and flexibility in data management. Students, teachers, and professionals who do any degree of sophisticated data manipulation will find this product highly desirable.
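
As a back-of-the-envelope check on the figures quoted above, the short Python sketch below works out what fraction of the retail price the academic price represents and a rough lower bound on the PowerMac's speed advantage. The timing inputs are assumptions: "more than three minutes" is treated as 180 seconds and "about a second" as 1 second, so the resulting ratio is only indicative.

```python
# Figures quoted in the review.
RETAIL_PRICE = 695.00     # dollars
ACADEMIC_PRICE = 229.35   # dollars

# Assumed readings of the reported "Compare All Pairs" timings.
SE_SECONDS = 3 * 60       # "more than three minutes", used as a lower bound
POWERMAC_SECONDS = 1      # "about a second"


def price_fraction(academic, retail):
    """Return the academic price as a fraction of the retail price."""
    return academic / retail


def speedup(slow_seconds, fast_seconds):
    """Return how many times faster the quicker machine ran."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    fraction = price_fraction(ACADEMIC_PRICE, RETAIL_PRICE)
    print(f"Academic price is {fraction:.0%} of retail")
    print(f"Academic discount is {1 - fraction:.0%}")
    ratio = speedup(SE_SECONDS, POWERMAC_SECONDS)
    print(f"PowerMac was roughly {ratio:.0f} times faster or more")
```

On these assumptions the academic price is about one third of the retail price, and the PowerMac ran the procedure on the order of 180 times faster than the SE.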
Lalit K. Aggarwal teaches in the Department of Quantitative Methods, College of Business Administration, Drexel University, in Philadelphia. His interests include software development. He is President of Hands-On Computer, and developed X.AMS, a software program that helps instructors make tests.
mentioned in this article, contact the Managing Editor at hjacobs@gsu.edu. If you are interested in writing a software review for a future issue of Decision Line, please contact Professor Jack Yurkiewicz at the address below.
Professor Jack Yurkiewicz