RESEARCH ISSUESLORI FRANZ, Feature Editor, Department of Management
It's All Random Search to Me!by Victor Cabot, Indiana University In one of his papers the great George Dantzig recalled that the original name for linear programming was Programming in a Linear Structure. While walking on the Santa Monica beach with T.C. Koopmans one day, Koopmans suggested that he shorten the name to linear programming. Now, linear programming is clearly a simpler name for this revolutionary technique and it actually does describe the area, but it is definitely not in the same class as Genetic Algorithms. My next door neighbor is a lawyer and he actually stopped mowing his lawn one time to ask me if I knew anything about GA's. I actually like talking to lawyers, but on that day I told him that is was just random search to me. Random SearchThe one thing we have a lot of these days is optimizers. We have LINDO and GINO and we can even optimize LOTUS and IFPS spreadsheets with various add ins. All these methods assume that we know our model in closed algebraic form and that the model has the mathematical properties that will ensure that the optimizer will give us the correct solution. What if that is not the case? Real mathematicians, the guys who do industrial mathematics, or even those weird stock options guys who live in fancy houses on cliffs, work on these kinds of problems every day. It's a touch-feely kind of thing. You pick some values for the decision variables, you see if the solution is feasible and then you evaluate your objective. You try some more points and then say, "There must be a better way to do this." Well, there is a better way and over the past thirty years this procedure for doing research has developed into a subarea of mathematical programming called random search. This is not to be confused with the kind of problem where we look for submarines or bags of gold, but rather where we investigate some space of feasible solutions and attempt to use information gained from evaluating those solutions to determine an optimal solution at minimum search effort. Up until around ten years ago random search methods were a subarea of nonlinear programming, but the recent use of the principles of random search in solving combinatorial problems has greatly increased its usefulness and popularity. In discussing research issues and possible applications of random search it is necessary to limit this discussion to mathematical programming and combinatorial problems and to set aside the areas where applied statisticians, meteorologists, physicists and oncologists would say they were doing random search. A Toy ProblemTo best describe the random search philosophy, suppose we wanted to search the unit cube in 3-space randomly to determine the optimal value of some function defined over the cube. To be practical we will attempt to discover the cubelette with sides of length .001 which contains the optimal point. Let's call this cubelette a and generate random points in the cube in the hope of finding it. If we assume that each such cubelette is equally probable then the probability of a success on any given trial is the volume of a which is (p=(.001) sup 3) or one billionth. If we sample n times the probability that all n trials fail is ((1-p) sup n). The probability of at least one success is (S=1-(1-p) sup n). We consider (S) as the confidence level that we put on our method. Then solving for n we get n=log(1-S)/log(1-p). Since p is small we can approximate n>s by the first terms of this functions' expansion to get (n approx -p sup -1 log (1-S)). If we take (S=.9) as our confidence level, then we get (n approx 2.3 p sup -1). This tells us that for a 90% confidence level we need to take well over two billion samples. Incidentally, I have an IBM 486 PC with a math coprocessor in my office. My doctoral student estimates that it would take over five hours for me to generate one billion points in the above cube. One can understand the gleem in the eyes of Spaans and Luus [1992] when they reported that they had solved a five variable problem with twelve constraints to within .02% of their estimated optimum after using only nine million random guesses. Nonlinear ProgrammingOne could ask if a simple algorithm like the one above can be used to solve problems given any function. The answer is yes. There are many proofs that such a method will yield a point in an arbitrary neighborhood of the optimal point with probability one for some large sample size N. A good one, with pictures, is Solis and Wets [1981]. No one, to my knowledge, has presented estimates of the sample size needed. The first real attempts at analyzing random search came in the late fifties with the papers by Box [1957] and Brooks [1958]. They both recognized the difficulties in generating so many points and independently suggested that random search be conducted using cycles of points generated sequentially with an information gathering stage between cycles. This, of course, is the manner in which most optimization methods proceed and seemed to make sense at the time. Brooks called the concept "creeping randomness" and made the prophesy, "There are rather intriguing analogies that can be made between the creeping random method and evolution." The major question is how to use the information given by random samples at each cycle or whether they should be random at all. Box took the latter approach and concentrated on experimental design. His contributions led to the development of Response Surface Methodology. Mathematical programmers were spurred by the fact that optimizers had been developed for determining local optimums. For problems with smooth functions and the presence of many local optima the search could be simplified. In this case the optimizer could be employed at sample points to determine the best reachable point. The question here is which of the generated points are best suitable as initial points for unleashing the optimizer. This multi-start technique is the subject of a great amount of study, see Rinnooy Kan and Timmer [1987]. One result of this research is that many sample points may lead to the same local optimum point. Why do optimization from each point? Recent research in this area centers around preprocessing sample points to determine the most promising starting points for the optimizer. One idea is to use cluster analysis to group points first so that one may need only the centroid of the cluster as a representative for many points, Rinnooy and Timmer [1987]. Another idea is to use these clusters to construct an approximation to the objective function over the entire box. In this way, one tries to guess the number of local optima present and reduce the number of times the optimizer is used. A student of mine attempted this procedure using mixtures of normal density functions to construct contours like the clusters. If one does this sequentially it is often possible to gain information about the form of the actual function as a process proceeds. Over a set of test problems, there were always a few surprises, but in general, points in the same clusters went to the same local optima and there were usually more clusters than local optima, Keller, Cabot and Flury [1992]. These types of methods were often successful in characterizing a complete solution space using only several hundred random points for problems with up to ten variables. At this point I would conclude that the progress being made on nonlinear programs using random search is proceeding slowly but steadily. We can't solve giant problems, but given a problem and enough work, advances can be made. Combinatorial OptimizationCombinatorial optimization problems are appealing to all of us. The fact that some of them are impossible to solve and have been proved impossible to solve does not faze us. We think we can still solve them. Given an assignment problem with 70 men and 70 jobs we can find the most efficient assignment in several seconds using LINDO, for example. The total number of possible solutions to the problem is 70! What does Dantzig say about 70!
This kind of statement should have two effects on us. First, it should fill us with awe at the power of the simplex method and the notions of continuity and convexity. Second, it should fill us with fear at the notion of having to schedule 70 jobs with early and tardiness penalties and changeover costs. Let's assume that we know that we will never find the optimal solution to a combinatorial problem. For lots of these problems it is possible to use heuristics to obtain what we can term a "good" solution. Sometimes not. Given that the heuristic solution will cost more than the optimal solution we should be content to settle for a result like: prob {heuristic solution value/optimal solution value less than or equal to 1+epsilon} less than or equal to 1-delta>. We are a long way from any such result except for the very simplest heuristics. In robust combinatorial problems like Traveling Salesman problems or job sequencing problems there are often several improving heuristics that can be applied to a trial point. In this case the actual heuristic used may be generated randomly from some list and score can be kept as to what seems to be working good or should not be used. Good examples are Glovers' [1992] TABU search and the work of Maffiolli [1987]. In this sense the algorithm becomes a learning machine as its experience is increased by solving more problems. The set of points at each cycle provides another heuristic source. The concept of genetic search, Goldberg [1989] can be used. In this method one takes "elite" solutions at one stage and randomly interchanges their components "crossover" in a step of the method called "reproduction." Of course, one usually throws in a few randomly generated "mutants" to keep a global philosophy to the population of points at that cycle. Once again these "breeding rules" can be randomly chosen and join the list of heuristic rules in any search method. The advantage of genetic search is, of course, that it can be used when no heuristics are available or none have shown to have an improving impact. One of our students has successfully solved many p-median like problems using a genetic approach. She solves one 25 node problem using a population of 20 "good individuals" for over 200 generations (cycles). Out of a total of 2 (sup 25) points in search space she examines only .012% to obtain near optimal solutions in most cases, Helm and Venkataramanan [1992]. Another method growing in popularity is simulated annealing, Davis [1987]. This method, borrowed from statistical mechanics, attempts to overcome the curse of local optima by periodically (according to a probabalistic decision rule) pruning the elite set and letting the quality of the overall solution set "cool off." We then restart the optimization process in some different part of the solution space. This seems like a good idea to me and, as opposed to genetic algorithms, there is a very good deal of mathematical and empirical research to back up the efficacy of this approach. Looking AheadOne of the truly great potentials that random search gives us is its inherent parallelism. Most optimization procedures are sequential in nature and move from point to point, processing all the information needed to get the next point at each stage. In random search this is not so important. We can have several populations being analyzed simultaneously and information passed back and forth among them. A new journal, the SIAM Journal of Optimization recently had a special issue on this topic [1991]. Another area of interest is dynamic scheduling. Most scheduling problems do not appear in a static state. New jobs enter the system all of the time and the scheduler (possibly a computer) must make decisions as where these jobs fit in. Fox and McMahon [1991] report that they can currently schedule and evaluate dynamically a sequence of 1000 scheduling operations in about 30 minutes using a genetic algorithm. They report that this is a conservative estimate of the number of activities required by the Onboard Short Term Plan For Space Station Freedom. A good example of how all these concepts can be brought together in a learning environment is the paper by Aytung, Koehler and Snowden [1992]. References Aytung, H., Koehler, G. J., & Snowden, J. L. Genetic learning of dynamic scheduling within a simulation environment. To appear in Computers in Operations Research, (1992). Box, G. E. P. Evolutionary operation: A method for increasing industrial productivity. Journal of the Royal Statistical Society, 1957, 6(2), 81-101. Brooks, S. H. A discussion of random methods for seeking maxima. ORSA, 1957, 6, 244-251. Dantzig, G. B. Reminiscences about the origins of linear programming. Mathematical Programming, 1984, 6, 105-112. Davis, L. Genetic algorithms and simulated annealing. San Marco, CA: Morgan Kaufmann Publishers, 1987. Fox, B. R., & McMahon, M. B. Genetic operators for sequencing problems. In G.J.E. Raolins, Editor, Foundations of genetic algorithms. Morgan Kaufmann Publishers, San Marco, CA: 1991. Glover, F. Tabu search for nonlinear and parametric optimization (with links to genetic algorithms). To appear in Discrete Applied Mathematics, (1992). Goldberg, D. E. Genetic algorithms in search, optimization and machine learning. Addison-Wesley Publishing Company, 1989. Helm, S., & Venkataramanan, M. Solving the hub location problem using genetic search. Submitted for publication, 1992. Keller, C., Cabot, V., & Flury, B. Two approaches to global optimization. Submitted for publication, 1992. Maffiolli, F. Randomized algorithms in combinational optimization: A survey. Discrete Applied Mathematics, 1986, 14, 157-170. Rinnooy Kan and Timmer. Stochastic global optimization methods Part I: Clustering methods. Mathematical Programming, 1987, 39, 27-56. Rinnooy Kan and Timmer. Stochastic global optimization methods Part III: Multi-level methods. Mathematical Programming, 1987, 39, 57-78. SIAM Journal on Optimization, 1991, 2(2). Solis, R., & Wets, T. Minimization by random search techniques. Mathematics of Operations Research, 1981, 6, 19-30. Spaans, R., & Luus, R. Importance of search-domain reduction in random optimization. JOTA, 1992, 75, 635-638. VICTOR CABOT is a Professor of Decision and Information Systems in the Business School at Indiana University. He has held positions in Computer Science and Operations Research departments at The University of Grenoble and The Naval Postgraduate School. |