We consider the problem of placing a limited number of sensors in a municipal water distribution network to minimize the impact over a given suite of contamination incidents. In its simplest form, the sensor placement problem is a p-median problem whose structure is extremely amenable to exact and heuristic solution methods. We describe the solution of real-world instances using integer programming, local search, or a Lagrangian method. The Lagrangian method is necessary for solution of large problems on small PCs. We summarize a number of other heuristic methods for effectively addressing issues such as sensor failures, tuning sensors based on local water quality variability, and problem size/approximation quality tradeoffs. These algorithms are incorporated into the TEVA-SPOT toolkit, a software suite that the US Environmental Protection Agency has used and is using to design contamination warning systems for US municipal water systems.
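The p-median view of sensor placement described above can be illustrated with a small sketch. This is not TEVA-SPOT code. The impact matrix, the undetected-impact values, and the function names are hypothetical, and the swap-based local search is only one simple heuristic of the kind the abstract mentions. Each incident is charged the impact of the best chosen sensor location, or its undetected impact if no chosen sensor does better.

```python
from itertools import combinations

def mean_impact(impact, undetected, sensors):
    """Average impact over incidents for a given set of sensor locations."""
    total = 0.0
    for a, row in enumerate(impact):
        # An incident is witnessed by the chosen location with the lowest impact,
        # or goes undetected if that is cheaper (a "dummy" location).
        total += min([undetected[a]] + [row[i] for i in sensors])
    return total / len(impact)

def brute_force(impact, undetected, p):
    """Exact answer by enumeration; only viable for tiny instances."""
    n = len(impact[0])
    return min(combinations(range(n), p),
               key=lambda s: mean_impact(impact, undetected, s))

def swap_local_search(impact, undetected, p):
    """Heuristic: swap one chosen location for an unchosen one while it helps."""
    n = len(impact[0])
    current = list(range(p))
    best = mean_impact(impact, undetected, current)
    improved = True
    while improved:
        improved = False
        for pos in range(p):
            for cand in range(n):
                if cand in current:
                    continue
                trial = current[:pos] + [cand] + current[pos + 1:]
                value = mean_impact(impact, undetected, trial)
                if value < best - 1e-12:
                    current, best, improved = trial, value, True
    return sorted(current), best

# Hypothetical data: impact[a][i] is the impact of incident a when it is
# first detected at location i (4 incidents, 5 candidate locations).
impact = [
    [10, 40, 90, 90, 90],
    [80, 20, 30, 90, 90],
    [90, 90, 15, 25, 90],
    [90, 90, 90, 35, 5],
]
undetected = [100, 100, 100, 100]

exact = brute_force(impact, undetected, p=2)
heuristic, heuristic_value = swap_local_search(impact, undetected, p=2)
print("exact:", exact, mean_impact(impact, undetected, exact))
print("local search:", heuristic, heuristic_value)
```

Enumeration is used here only to give the heuristic a reference point on a toy instance; real networks need the integer programming or Lagrangian approaches the abstract describes.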
A range of core operations and planning problems for the national electrical grid are naturally formulated and solved as stochastic programming problems, which minimize expected costs subject to a range of uncertain outcomes relating to, for example, uncertain demands or generator output. A critical decision issue relating to such stochastic programs is: How many scenarios are required to ensure a specific error bound on the solution cost? Scenarios are the key mechanism used to sample from the uncertainty space, and the number of scenarios drives computational difficulty. We explore this question in the context of a long-term grid generation expansion problem, using a bounding procedure introduced by Mak, Morton, and Wood. We discuss experimental results using problem formulations independently minimizing expected cost and down-side risk. Our results indicate that we can use a surprisingly small number of scenarios to yield tight error bounds in the case of expected cost minimization, which has key practical implications. In contrast, error bounds in the case of risk minimization are significantly larger, suggesting more research is required in this area in order to achieve rigorous solutions for decision makers.
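The idea behind a sampling-based bound of the kind attributed above to Mak, Morton, and Wood can be sketched on a toy problem. The sketch below is an assumption-laden illustration, not the grid expansion model. It uses a hypothetical newsvendor cost, uniform demands, and a normal quantile in place of the t-quantile normally used for a small number of batches. For a fixed candidate solution, each batch draws fresh scenarios and compares the candidate's sample-average cost with the best sample-average cost on the same scenarios; the mean of these differences estimates the optimality gap, biased upward.

```python
import random
import statistics

def cost(x, demand, unit_cost=1.0, price=3.0):
    """Hypothetical newsvendor cost: pay for x units, earn revenue on units sold."""
    return unit_cost * x - price * min(x, demand)

def saa_value(x, demands):
    """Sample-average cost of decision x over a list of demand scenarios."""
    return sum(cost(x, d) for d in demands) / len(demands)

def saa_optimum(demands):
    # The sample-average cost is piecewise linear and convex in x, with
    # breakpoints at the sampled demands, so a minimizer lies at one of them.
    return min(saa_value(x, demands) for x in demands)

def gap_estimate(x_hat, n_batches, n_scenarios, rng):
    """Per-batch gap observations, their mean, and the standard error of the mean."""
    gaps = []
    for _ in range(n_batches):
        demands = [rng.uniform(50, 150) for _ in range(n_scenarios)]
        # The same scenarios are used for both terms, so each gap is nonnegative.
        gaps.append(saa_value(x_hat, demands) - saa_optimum(demands))
    mean_gap = statistics.mean(gaps)
    std_err = statistics.stdev(gaps) / n_batches ** 0.5
    return gaps, mean_gap, std_err

rng = random.Random(1)
# Candidate solution from a small pilot sample (10 scenarios).
pilot = [rng.uniform(50, 150) for _ in range(10)]
x_hat = min(pilot, key=lambda x: saa_value(x, pilot))

gaps, mean_gap, std_err = gap_estimate(x_hat, n_batches=20, n_scenarios=50, rng=rng)
# One-sided 95% bound; the normal quantile is a simplifying assumption here.
z = statistics.NormalDist().inv_cdf(0.95)
print("estimated gap:", mean_gap, "upper bound:", mean_gap + z * std_err)
```

Rerunning with different values of `n_scenarios` gives a rough feel for how the bound tightens as the scenario count grows, which is the question the abstract studies.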
The Python Optimization Modeling Objects (Pyomo) package [1] is an open source tool for modeling optimization applications within Python. Pyomo provides an object-oriented approach to optimization modeling, and it can be used to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. While Pyomo provides a capability that is commonly associated with algebraic modeling languages such as AMPL, AIMMS, and GAMS, Pyomo's modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries. Pyomo leverages the capabilities of the Coopr software library [2], which integrates Python packages (including Pyomo) for defining optimizers, modeling optimization applications, and managing computational experiments. A central design principle within Pyomo is extensibility. Pyomo is built upon a flexible component architecture [3] that allows users and developers to readily extend the core Pyomo functionality. Through these interface points, extensions and applications can have direct access to an optimization model's expression objects. This facilitates the rapid development and implementation of new modeling constructs as well as high-level solution strategies (e.g., using decomposition- and reformulation-based techniques). In this presentation, we will give an overview of the Pyomo modeling environment and model syntax, and present several extensions to the core Pyomo environment, including support for Generalized Disjunctive Programming (Coopr GDP), Stochastic Programming (PySP), a generic Progressive Hedging solver [4], and a tailored implementation of Benders decomposition.
Although stochastic programming is a powerful tool for modeling decision-making under uncertainty, various impediments have historically prevented its widespread use. One key factor involves the ability of non-specialists to easily express stochastic programming problems as extensions of deterministic models, which are often formulated first. A second key factor relates to the difficulty of solving stochastic programming models, particularly the general mixed-integer, multi-stage case. Intricate, configurable, and parallel decomposition strategies are frequently required to achieve tractable run-times. We simultaneously address both of these factors in our PySP software package, which is part of the COIN-OR Coopr open-source Python project for optimization. To formulate a stochastic program in PySP, the user specifies both the deterministic base model and the scenario tree with associated uncertain parameters in the Pyomo open-source algebraic modeling language. Given these two models, PySP provides two paths for solution of the corresponding stochastic program. The first alternative involves writing the extensive form and invoking a standard deterministic (mixed-integer) solver. For more complex stochastic programs, we provide an implementation of Rockafellar and Wets' Progressive Hedging algorithm. Our particular focus is on the use of Progressive Hedging as an effective heuristic for approximating general multi-stage, mixed-integer stochastic programs. By leveraging the combination of a high-level programming language (Python) and the embedding of the base deterministic model in that language (Pyomo), we are able to provide completely generic and highly configurable solver implementations. PySP has been used by a number of research groups, including our own, to rapidly prototype and solve difficult stochastic programming problems.
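The Progressive Hedging iteration mentioned above can be sketched on a deliberately tiny problem. The code below does not use PySP or Pyomo. It assumes a hypothetical single first-stage variable x and scenario costs of the form (x - d_s)^2, chosen so that every scenario subproblem has a closed-form solution; the function name, the penalty value `rho`, and the data are all illustrative. Each pass solves the scenario subproblems with a multiplier term and a proximal penalty, averages the solutions, and updates the multipliers toward agreement.

```python
def progressive_hedging(targets, probs, rho=1.0, tol=1e-8, max_iter=200):
    """Progressive Hedging for min over x of sum_s probs[s] * (x - targets[s])**2."""
    # Iteration 0: solve each scenario on its own (the minimizer is its target).
    x = list(targets)
    xbar = sum(p * xs for p, xs in zip(probs, x))
    w = [rho * (xs - xbar) for xs in x]

    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Scenario subproblem: min (x - d)^2 + w*x + (rho/2)*(x - xbar)^2.
        # Setting the derivative to zero gives the closed form below.
        x = [(2.0 * d - ws + rho * xbar) / (2.0 + rho)
             for d, ws in zip(targets, w)]
        # Aggregation: probability-weighted average of scenario solutions.
        xbar = sum(p * xs for p, xs in zip(probs, x))
        # Multiplier update: penalize deviation from the average.
        w = [ws + rho * (xs - xbar) for ws, xs in zip(w, x)]
        # Stop once all scenario solutions agree with the average.
        if max(abs(xs - xbar) for xs in x) < tol:
            break
    return xbar, x, iteration

# Hypothetical scenario data: three scenarios with probabilities summing to one.
targets = [80.0, 100.0, 140.0]
probs = [0.3, 0.5, 0.2]

xbar, x_scen, iters = progressive_hedging(targets, probs)
print("consensus x:", xbar, "after", iters, "iterations")
```

For this convex toy problem the consensus value is the probability-weighted mean of the targets. In the mixed-integer setting the abstract targets, the same loop is used as a heuristic, and convergence is not guaranteed.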
Discrete models of large, complex systems like national infrastructures and complex logistics frameworks naturally incorporate many modeling uncertainties. Consequently, there is a clear need for optimization techniques that can robustly account for risks associated with modeling uncertainties. This report summarizes the progress of the Late-Start LDRD 'Robust Analysis of Largescale Combinatorial Applications'. This project developed new heuristics for solving robust optimization models, and developed new robust optimization models for describing uncertainty scenarios.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a developers manual for the DAKOTA software and describes the DAKOTA class hierarchies and their interrelationships. It derives directly from annotation of the actual source code and provides detailed class documentation, including all member functions and attributes.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a user's manual for the DAKOTA software and provides capability overviews and procedures for software execution, as well as a variety of example studies.
The DAKOTA (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. DAKOTA contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic finite element methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the DAKOTA toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a reference manual for the commands specification for the DAKOTA software, providing input overviews, option descriptions, and example specifications.
Over the last decade and a half, tabu search algorithms for machine scheduling have gained a near-mythical reputation by consistently equaling or establishing state-of-the-art performance levels on a range of academic and real-world problems. Yet, despite these successes, remarkably little research has been devoted to developing an understanding of why tabu search is so effective on this problem class. In this paper, we report results that provide significant progress in this direction. We consider Nowicki and Smutnicki's i-TSAB tabu search algorithm, which represents the current state-of-the-art for the makespan-minimization form of the classical job-shop scheduling problem. Via a series of controlled experiments, we identify those components of i-TSAB that enable it to achieve state-of-the-art performance levels. In doing so, we expose a number of misconceptions regarding the behavior and/or benefits of tabu search and other local search metaheuristics for the job-shop problem. Our results also serve to focus future research, by identifying those specific directions that are most likely to yield further improvements in performance.
In this paper, we analyze the relationship between pool maintenance schemes, long-term memory mechanisms, and search space structure, with the goal of placing metaheuristic design on a more concrete foundation.
We have found that developing a computational framework for reconstructing error control codes for engineered data and ultimately for deciphering genetic regulatory coding sequences is a challenging and uncharted area that will require advances in computational technology for exact solutions. Although exact solutions are desired, computational approaches that yield plausible solutions would be considered sufficient as a proof of concept for the feasibility of reverse engineering error control codes and the possibility of developing a quantitative model for understanding and engineering genetic regulation. Such evidence would help move the idea of reconstructing error control codes for engineered and biological systems from the high-risk, high-payoff realm into the highly probable, high-payoff domain. Additionally, this work will impact biological sensor development and the ability to model and ultimately develop defense mechanisms against bioagents that can be engineered to cause catastrophic damage. Understanding how biological organisms are able to communicate their genetic message efficiently in the presence of noise can improve our current communication protocols, a continuing research interest. Towards this end, project goals include: (1) Develop parameter estimation methods for n for block codes and for n, k, and m for convolutional codes. Use these methods to determine error control (EC) code parameters for gene regulatory sequences. (2) Develop an evolutionary computing framework for near-optimal solutions to the algebraic code reconstruction problem. The method will be tested on engineered and biological sequences.
We consider the accuracy of predictions made by integer programming (IP) models of sensor placement for water security applications. We have recently shown that IP models can be used to find optimal sensor placements for a variety of different performance criteria (e.g. minimize health impacts and minimize time to detection). However, these models make a variety of simplifying assumptions that might bias the final solution. We show that our IP modeling assumptions are similar to models developed for other sensor placement methodologies, and thus IP models should give similar predictions. However, this discussion highlights that there are significant differences in how temporal effects are modeled for sensor placement. We describe how these modeling assumptions can impact sensor placements.
Tabu search is one of the most effective heuristics for locating high-quality solutions to a diverse array of NP-hard combinatorial optimization problems. Despite the widespread success of tabu search, researchers have a poor understanding of many key theoretical aspects of this algorithm, including models of the high-level run-time dynamics and identification of those search space features that influence problem difficulty. We consider these questions in the context of the job-shop scheduling problem (JSP), a domain where tabu search algorithms have been shown to be remarkably effective. Previously, we demonstrated that the mean distance between random local optima and the nearest optimal solution is highly correlated with problem difficulty for a well-known tabu search algorithm for the JSP introduced by Taillard. In this paper, we discuss various shortcomings of this measure and develop a new model of problem difficulty that corrects these deficiencies. We show that Taillard's algorithm can be modeled with high fidelity as a simple variant of a straightforward random walk. The random walk model accounts for nearly all of the variability in the cost required to locate both optimal and sub-optimal solutions to random JSPs, and provides an explanation for differences in the difficulty of random versus structured JSPs. Finally, we discuss and empirically substantiate two novel predictions regarding tabu search algorithm behavior. First, the method for constructing the initial solution is highly unlikely to impact the performance of tabu search. Second, tabu tenure should be selected to be as small as possible while simultaneously avoiding search stagnation; values larger than necessary lead to significant degradations in performance.
We present a model for optimizing the placement of sensors in municipal water networks to detect maliciously injected contaminants. An optimal sensor configuration minimizes the expected fraction of the population at risk. We formulate this problem as a mixed-integer program, which can be solved with generally available solvers. We find optimal sensor placements for three test networks with synthetic risk and population data. Our experiments illustrate that this formulation can be solved relatively quickly and that the predicted sensor configuration is relatively insensitive to uncertainties in the data used for prediction.
A fundamental challenge for all communication systems, engineered or living, is the problem of achieving efficient, secure, and error-free communication over noisy channels. Information theoretic principles have been used to develop effective coding theory algorithms to successfully transmit information in engineering systems. Living systems also successfully transmit biological information through genetic processes such as replication, transcription, and translation, where the genome of an organism is the contents of the transmission. Decoding of received bit streams is fairly straightforward when the channel encoding algorithms are efficient and known. If the encoding scheme is unknown or part of the data is missing or intercepted, how would one design a viable decoder for the received transmission? For such systems, blind reconstruction of the encoding/decoding system would be a vital step in recovering the original message. Communication engineers may not frequently encounter this situation, but for computational biologists and biotechnologists this is an immediate challenge. The goal of this work is to develop methods for detecting and reconstructing the encoder/decoder system for engineered and biological data. Building on Sandia's strengths in discrete mathematics, algorithms, and communication theory, we use linear programming and will use evolutionary computing techniques to construct efficient algorithms for modeling the coding system for minimally errored engineered data streams and genomic regulatory DNA and RNA sequences. The objective for the initial phase of this project is to construct solid parallels between biological literature and fundamental elements of communication theory. In this light, the milestones for FY2003 were focused on defining genetic channel characteristics and providing an initial approximation for key parameters, including coding rate, memory length, and minimum distance values.
A secondary objective addressed the question of determining similar parameters for a received, noisy, error-control encoded data set. In addition to these goals, we initiated exploration of algorithmic approaches to determine if a data set could be approximated with an error-control code and performed initial investigations into optimization based methodologies for extracting the encoding algorithm given the coding rate of an encoded noise-free and noisy data stream.
Iterated local search, or ILS, is among the most straightforward meta-heuristics for local search. ILS employs both small-step and large-step move operators. Search proceeds via iterative modifications to a single solution, in distinct alternating phases. In the first phase, local neighborhood search (typically greedy descent) is used in conjunction with the small-step operator to transform solutions into local optima. In the second phase, the large-step operator is applied to generate perturbations to the local optima obtained in the first phase. Ideally, when local neighborhood search is applied to the resulting solution, search will terminate at a different local optimum, i.e., the large-step perturbations should be sufficiently large to enable escape from the attractor basins of local optima. ILS has proven capable of delivering excellent performance on numerous NP-hard optimization problems [LMS03]. However, despite its simplicity, very little is known about why ILS can be so effective, and under what conditions. The goal of this paper is to advance the state-of-the-art in the analysis of meta-heuristics, by providing answers to this research question. They focus on characterizing both the relationship between the structure of the underlying search space and ILS performance, and the dynamic behavior of ILS. The analysis proceeds in the context of the job-shop scheduling problem (JSP) [Tai94]. They begin by demonstrating that the attractor basins of local optima in the JSP are surprisingly weak, and can be escaped with high probability by accepting a short random sequence of less-fit neighbors. This result is used to develop a new ILS algorithm for the JSP, I-JAR, whose performance is competitive with tabu search on difficult benchmark instances. They conclude by developing a very accurate behavioral model of I-JAR, which yields significant insights into the dynamics of search.
The analysis is based on a set of 100 random 10 x 10 problem instances, in addition to some widely used benchmark instances. Both I-JAR and the tabu search algorithm they consider are based on the N1 move operator introduced by van Laarhoven et al. [vLAL92]. The N1 operator induces a connected search space, such that it is always possible to move from an arbitrary solution to an optimal solution; this property is integral to the development of a behavioral model of I-JAR. However, much of the analysis generalizes to other move operators, including that of Nowicki and Smutnicki [NS96]. Finally, the models are based on the distance between two solutions, which they take as the well-known disjunctive graph distance [MBK99].
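The two alternating phases of ILS described in this abstract can be sketched on a toy problem. The code below is not I-JAR and does not model the JSP or the N1 operator. It assumes a hypothetical single-machine weighted-tardiness problem, uses adjacent swaps as the small-step operator, and uses a short random sequence of adjacent swaps as the large-step perturbation. All names, data, and parameter values are illustrative.

```python
import random

def weighted_tardiness(order, jobs):
    """Objective: sum of weight * lateness over jobs processed in the given order."""
    t, total = 0, 0
    for j in order:
        proc, due, weight = jobs[j]
        t += proc
        total += weight * max(0, t - due)
    return total

def descend(order, jobs):
    """Small-step phase: accept improving adjacent swaps until none remains."""
    order = list(order)
    best = weighted_tardiness(order, jobs)
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            order[i], order[i + 1] = order[i + 1], order[i]
            value = weighted_tardiness(order, jobs)
            if value < best:
                best, improved = value, True
            else:
                order[i], order[i + 1] = order[i + 1], order[i]  # undo the swap
    return order, best

def perturb(order, steps, rng):
    """Large-step phase: a short random walk of adjacent swaps, ignoring cost."""
    order = list(order)
    for _ in range(steps):
        i = rng.randrange(len(order) - 1)
        order[i], order[i + 1] = order[i + 1], order[i]
    return order

def iterated_local_search(jobs, iterations=200, steps=4, seed=0):
    rng = random.Random(seed)
    current, current_value = descend(list(range(len(jobs))), jobs)
    for _ in range(iterations):
        candidate, candidate_value = descend(perturb(current, steps, rng), jobs)
        # Acceptance rule (an assumption): keep the new local optimum if not worse.
        if candidate_value <= current_value:
            current, current_value = candidate, candidate_value
    return current, current_value

# Hypothetical jobs as (processing time, due date, weight).
jobs = [(4, 10, 2), (2, 6, 5), (7, 20, 1), (3, 9, 4),
        (5, 14, 3), (1, 4, 2), (6, 25, 1), (2, 12, 3)]

order, value = iterated_local_search(jobs)
print("order:", order, "weighted tardiness:", value)
```

The `steps` parameter plays the role of the perturbation strength: too small and descent falls back into the same local optimum, too large and the search degenerates toward random restarts.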