SOFTWARE REVIEW
JACK YURKIEWICZ, Feature Editor, Lubin School of Business, Pace University

This review of a new and unconventional statistics program, Optimal Data Analysis, is from Dr. Fred Bryant of Loyola University in Chicago. Dr. Bryant was one of the beta testers of the program and has been using it for some time in his research and teaching.
Analyze Your Data Optimally Using ODA 1.0
by Fred B. Bryant, Loyola University

When you make a prediction, would you rather be correct or incorrect? If your answer is "correct," then ODA is the appropriate analytic methodology. (Soltysik & Yarnold, ODA Manual, 1993, p. 1)

This bold opening statement from the manual for Optimal Data Analysis (ODA) aptly conveys the wide-ranging scope and versatility of this innovative statistical package. ODA is a statistical paradigm for identifying an optimal discriminant function for assigning observations to categories with theoretically maximum possible accuracy. Although the idea of obtaining an optimal cutting score for classifying observations into categories has been around since the 1950s, the computational ability to solve large-scale problems has evolved only recently.

How does ODA compare to other, more traditional statistical packages? There are currently two basic types of general-purpose statistical software packages on the market. The first type corresponds to what can be termed "traditional statistics." These procedures include the commonly used chi-square, F, and t tests. They are heavily assumption laden (e.g., use of the latter two tests assumes that data are distributed normally and that there are equal variances across independent groups). Examples of this type of software include SAS, SPSS, BMDP, SYSTAT, and others.

The second type of statistical software package offers so-called "exact probability" statistics. These procedures allow one to compute (or estimate for larger problems) permutation probabilities that provide exact probabilities for traditional statistical procedures. Examples of this type of statistical package include RESAMPLING STATS, MATRIX, and SAS/STAT MULTTEST.

These first two approaches to data analysis share some common problems.
First, each of them forces you to fit your data (more or less poorly) into a statistical model that makes assumptions about underlying distributions. That is, none of the traditional procedures is specifically tailored to a given application; rather, a given application is analyzed using the procedure whose assumptions it violates least. Second, none of the traditional procedures is specifically optimized for accurate forecasting. That is, no traditional procedure specifically identifies a model that explicitly maximizes the percentage accuracy of classification (PAC) that the model achieves. Some traditional procedures, such as the F test or chi-square, do not provide models that can be used for forecasting purposes. Others, such as multiple regression or discriminant analysis, provide classification models, but these models do not explicitly maximize the criterion of achieving theoretically maximum PAC.

ODA does not fall into either of the above categories. It claims to solve the above problems by performing "optimal data analysis." The ODA paradigm is as follows:
ODA's Statistical Capabilities

ODA's elegantly simple paradigm gives rise to a wide array of statistical analyses. Many of these provide what the authors refer to as "optimal analogs" to traditional statistical procedures. For example, consider the following empirical example (5.4) from the ODA manual. Imagine that you wish to test the hypothesis that higher scores on a standardized medical aptitude test predict higher clinical performance ratings (as assessed on a six-point rating scale). Traditionally, one might wish to use one-way analysis of variance (ANOVA) or an F test to evaluate this directional hypothesis. In this particular case, however, the number of subjects in each level of performance rating is highly imbalanced and their test scores are heterogeneous. Such conditions violate the underlying assumptions of ANOVA. Alternatively, one can use ODA to test the hypothesis of interest, weighting by prior odds to adjust for the imbalanced sample sizes.

Figure 1 shows an annotated version of the ODA input statements (taken from the manual) used to test the hypothesis that test scores (NBMESCOR) and performance ratings (CLINCLAS) increase together. Whereas ANOVA yielded only a few weak effects, ODA revealed a highly significant model (with a confidence for p<.05 of 99.99%). Figure 2 shows an annotated version of the output (taken from the manual) for this particular optimal analysis. Note that the model yields an explicit classification rule specifying the relationship between the attribute and the class variable, thus eliminating the need for follow-up contrasts to interpret main effects (as is necessary with ANOVA).

The manual illustrates many other popular types of traditional analyses for which ODA provides an optimal analog, including t test, correlation, chi-square, phi, kappa, randomized block (and other experimental) designs, cluster analysis, Markov analysis, autocorrelation, item analysis, and the log-linear model.
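
To make the idea of an optimal cutting score concrete, here is a minimal Python sketch. It is not ODA's own command syntax or algorithm. It assumes a simplified two-class version of the problem (the manual's example has six rating levels), tries every midpoint between adjacent distinct scores, and keeps the cutpoint that maximizes PAC under the directional rule "a higher score predicts class 1." It then estimates a one-tailed p by Monte Carlo reshuffling of the class labels. The function names and the toy data are hypothetical.

```python
import random


def best_cutpoint(scores, labels):
    """Return (cutpoint, pac) for the rule 'predict class 1 if score > cutpoint'.

    pac is the percentage accuracy of classification on the supplied data.
    Returns (None, 0.0) if all scores are identical.
    """
    values = sorted(set(scores))
    # Candidate cutpoints: midpoints between adjacent distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_cut, best_pac = None, 0.0
    for cut in candidates:
        correct = sum((s > cut) == (y == 1) for s, y in zip(scores, labels))
        pac = 100.0 * correct / len(labels)
        if pac > best_pac:  # strict '>' keeps the lowest cutpoint on ties
            best_cut, best_pac = cut, pac
    return best_cut, best_pac


def permutation_p(scores, labels, n_iter=2000, seed=0):
    """Monte Carlo estimate of the one-tailed p for the observed maximum PAC.

    Labels are shuffled n_iter times; each time the cutpoint is re-optimized,
    so the null distribution reflects the same search as the real analysis.
    """
    rng = random.Random(seed)
    _, observed = best_cutpoint(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        _, pac = best_cutpoint(scores, shuffled)
        if pac >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Hypothetical toy data: test scores and a two-level performance class.
    toy_scores = [1, 2, 3, 4, 5, 6, 7, 8]
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    cut, pac = best_cutpoint(toy_scores, toy_labels)
    print("cutpoint:", cut, "PAC:", pac)
    print("estimated p:", permutation_p(toy_scores, toy_labels))
```

Re-optimizing the cutpoint inside every shuffle is the important design choice here: comparing the observed PAC against shuffles scored at a fixed cutpoint would overstate significance, because the observed value has already benefited from the search.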
However, ODA also enables new types of analyses that are not possible using any other software system. For example, ODA can be used to maximize the PAC achieved by commonly used multi-attribute procedures such as multiple regression (e.g., Yarnold & Soltysik, 1991: Refining two-group multivariable classification models using univariate optimal discriminant analysis, DECISION SCIENCES, 22, 1158-1164).

Consider the following empirical example from the ODA manual. Medical patients were randomly assigned either to an experimental group (in which their physician advised them to quit smoking and offered them a nicotine substitute) or to a control group (in which the physician did not mention smoking). Patients were also asked whether or not they were willing to make a commitment to stop smoking. These two variables (experimental condition and willingness to make a commitment) were then used as independent variables in a traditional logistic regression to find a model for predicting whether or not people actually quit smoking one week later. The logistic regression model achieved overall classification accuracy = 76.9%; sensitivity for predicting quitters = 76.2%; sensitivity for predicting continuing smokers = 81.9%; predictive value for quitters = 73.8%; and predictive value for continuing smokers = 78.8%. ODA was then used to identify an adjusted intercept term (changed from .5 to .13) for the logistic model, which dramatically improved classification accuracy, particularly when predicting people who quit smoking: overall classification accuracy = 91.1%; sensitivity for predicting quitters = 100%; sensitivity for predicting continuing smokers = 61.1%; predictive value for quitters = 89.6%; and predictive value for continuing smokers = 94.8%.

ODA can also be used to optimize the PAC achieved by other commonly used multi-attribute models such as Fisher's linear discriminant function analysis, Smith's quadratic discriminant analysis, and probit analysis, to name a few.
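
The effect of moving a classification cutoff on the five accuracy figures reported above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure ODA uses: the predicted probabilities and outcomes are made up (they are not the smoking study's data), the function names are hypothetical, and the search assumes at least two distinct predicted probabilities.

```python
def classification_summary(probs, actual, cutoff):
    """Classify as 1 ('quit') when prob > cutoff; return metrics in percent."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, actual):
        predicted = 1 if p > cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def pct(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return 100.0 * num / den if den else float("nan")

    return {
        "overall": pct(tp + tn, len(actual)),
        "sensitivity_1": pct(tp, tp + fn),       # actual 1s classified as 1
        "sensitivity_0": pct(tn, tn + fp),       # actual 0s classified as 0
        "predictive_value_1": pct(tp, tp + fp),  # predicted 1s that are 1
        "predictive_value_0": pct(tn, tn + fn),  # predicted 0s that are 0
    }


def best_cutoff(probs, actual):
    """Return the midpoint cutoff that maximizes overall accuracy."""
    values = sorted(set(probs))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(
        candidates,
        key=lambda c: classification_summary(probs, actual, c)["overall"],
    )


if __name__ == "__main__":
    # Hypothetical predicted probabilities of quitting, with actual outcomes.
    probs = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
    actual = [0, 0, 1, 1, 1, 1]
    print("default cutoff .5:", classification_summary(probs, actual, 0.5))
    tuned = best_cutoff(probs, actual)
    print("tuned cutoff:", tuned)
    print("tuned summary:", classification_summary(probs, actual, tuned))
```

Because the cutoff is tuned on the same observations it is evaluated on, the improved accuracy is an in-sample figure; a hold-out or leave-one-out check would be needed to judge how well it generalizes.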
In addition, ODA offers multiple-sample analyses, hold-out (cross-generalizability) and/or leave-one-out (jackknife) validity analyses, weighting by prior odds and/or by cost or return, and one- or two-tailed hypothesis testing via Monte Carlo simulation, all for any data configuration. Furthermore, ODA offers optimal parallel forms, split-half, inter-rater, test-retest, and intraclass reliability analyses, and optimal discriminant, convergent, and construct validation, and much more. Yet, for all its versatility, there are still some types of traditional statistical analyses that ODA cannot as yet handle (e.g., repeated measures MANOVAs, conjoint analysis, MDS).

Compared to the traditional statistical paradigm, I found the ODA approach to have many advantages. First is conceptual clarity. In the ODA paradigm, for every data configuration there is one precise optimal analysis. In traditional statistics, for a given application, several different analyses are often feasible, and all are usually "suboptimal" in terms of PAC. Second is ease of interpretation. Every ODA analysis provides the same intuitive goodness-of-fit index: PAC. Different traditional statistical procedures, however, provide different goodness-of-fit indices that are both nonintuitive and noncomparable across procedures. Third is ease of use. Most ODA analyses require the same basic set of 6-10 commands. Fourth is maximum accuracy. Every ODA analysis provides a model that guarantees maximum possible PAC. In contrast, no traditional analysis provides a model that guarantees maximum PAC. Fifth is "valid Type I error." ODA provides permutation probabilities and requires no simplifying assumptions: p is always valid. Traditional analyses require simplifying assumptions, and p is valid only if the assumptions are true for one's data.

Documentation

The ODA manual (hard-bound; 200 pages) is written like a textbook.
Using a minimum of formulas, the manual discusses everything you need to know to use ODA. Comprehensive and well-organized, it includes a wealth of references to published empirical examples in a host of literatures and a collection of 30 hypothetical applications in different fields, ranging from astronomy, credit screening, epidemiology, and farming, to personnel selection, target recognition, weather forecasting, and zoology. Both students and educators can use the manual to provide interesting data-driven examples that clearly illustrate how to use ODA. The ODA software also includes a collection of over 60 actual raw-data sets that are analyzed and interpreted in the manual, many including well-annotated input statements and printouts (see Figures 1 and 2). This carefully crafted package is an excellent teaching tool.

Technical Information

ODA is a command-driven software program that may be run in batch mode or interactively. Although ODA has no graphics capabilities, this in no way impairs one's ability to understand the results of analyses (see sample output in Figure 2). The program is very compact (it comes on one diskette), easy to install, and extremely fast. It requires 640K of RAM and 500K of disk storage, and uses a math co-processor if available. The program currently allows a maximum of approximately 8200 observations for applications involving ordered (e.g., ordinal, interval, or ratio-scale) attributes, and an unlimited number of observations if the data are categorical (e.g., qualitative). Run times for average problems on a 386DX 40-MHz IBM-class PC with math co-processor range from 1 to 2000 seconds.

Summary

ODA's simplicity lends it an appealing conceptual elegance and makes it exceptionally easy to use. Unlike any other existing statistical system, ODA provides a unifying paradigm for analyzing the full spectrum of data configurations encountered in scientific research.
Pricing Information

A variety of purchase options exist for Optimal Data Analysis 1.0. The regular single copy price is $499. For orders of two or more copies, there is a $100 discount per copy. In addition, academic faculty may subtract $100 from the above prices, and students may subtract $200. Site and network licenses are also available. Finally, a "classroom" price, available to educational institutions, is $99. This price includes the manual, diskette, and technical support, and requires a minimum order of six copies.
Optimal Data Analysis, Inc.
FRED B. BRYANT is Professor of Psychology at Loyola University Chicago, where he teaches graduate and undergraduate courses in statistics, research methods, and social psychology. After receiving his Ph.D. in social psychology from Northwestern University in 1980, he completed a three-year post-doctoral fellowship in survey research at the University of Michigan's Institute for Social Research. His research interests include structural equation modeling, meta-analysis, and the measurement of emotion.

If you are interested in writing a software review for a future issue of Decision Line, please call me at Pace University (212) 346-1908, or e-mail: yurk@pacevm.dac.pace.edu.