Method Commands

Method Description

The method section in a DAKOTA input file specifies the name and controls of an iterator. The terms "method" and "iterator" can be used interchangeably, although method often refers to an input specification whereas iterator usually refers to an object within the Iterator hierarchy. A method specification, then, is used to select an iterator from the iterator hierarchy, which includes optimization, uncertainty quantification, least squares, design of experiments, and parameter study iterators (see the Users Manual [Adams et al., 2010] for more information on these iterator branches). This iterator may be used alone or in combination with other iterators as dictated by the strategy specification (refer to Strategy Commands for strategy command syntax and to the Users Manual [Adams et al., 2010] for strategy algorithm descriptions).

Several examples follow. The first example shows a minimal specification for an optimization method.

method,
	dot_sqp
This example uses all of the defaults for this method.

A more sophisticated example would be

method,
	id_method = 'NLP1'
	model_pointer = 'M1'
	dot_sqp
	  max_iterations = 50
	  convergence_tolerance = 1e-4
	  output verbose
	  optimization_type minimize
This example demonstrates the use of identifiers and pointers (see Method Independent Controls) as well as some method independent and method dependent controls for the sequential quadratic programming (SQP) algorithm from the DOT library. The max_iterations, convergence_tolerance, and output settings are method independent controls, in that they are defined for a variety of methods (see DOT method independent controls for DOT usage of these controls). The optimization_type control is a method dependent control, in that it is only meaningful for DOT methods (see DOT method dependent controls).

The next example shows a specification for a least squares method.

method,
	optpp_g_newton
	  max_iterations = 10
	  convergence_tolerance = 1.e-8
	  search_method trust_region
	  gradient_tolerance = 1.e-6
Some of the same method independent controls are present along with a new set of method dependent controls (search_method and gradient_tolerance) which are only meaningful for OPT++ methods (see OPT++ method dependent controls).

The next example shows a specification for a nondeterministic method with several method dependent controls (refer to Nondeterministic sampling method).

method,
	nond_sampling
	  samples = 100	seed = 12345
	  sample_type lhs
	  response_levels = 1000. 500.

The last example shows a specification for a parameter study method where, again, each of the controls is method dependent (refer to Vector parameter study).

method,
	vector_parameter_study
	  step_vector = 1. 1. 1.
	  num_steps = 10

Method Specification

As alluded to in the examples above, the method specification has the following structure:
method,
	<method independent controls>
	<method selection>
	  <method dependent controls>

where <method selection> is for example one of the following: dot_frcg, dot_mmfd, dot_bfgs, dot_slp, dot_sqp, conmin_frcg, conmin_mfd, npsol_sqp, nlssol_sqp, nlpql_sqp, nl2sol, nonlinear_cg, optpp_cg, optpp_q_newton, optpp_fd_newton, optpp_g_newton, optpp_newton, optpp_pds, asynch_pattern_search, coliny_cobyla, coliny_direct, coliny_pattern_search, coliny_solis_wets, coliny_ea, moga, soga, ncsu_direct, dl_solver, surrogate_based_local, surrogate_based_global, efficient_global, nond_polynomial_chaos, nond_stoch_collocation, nond_sampling, nond_importance, nond_local_reliability, nond_global_reliability, nond_local_evidence, nond_global_evidence, nond_local_interval_est, nond_global_interval_est, nond_bayes_calib, dace, fsu_quasi_mc, fsu_cvt, psuade_moat, vector_parameter_study, list_parameter_study, centered_parameter_study, or multidim_parameter_study.

The <method independent controls> are those controls which are valid for a variety of methods. In some cases, these controls are abstractions which may have slightly different implementations from one method to the next. The <method dependent controls> are those controls which are only meaningful for a specific method or library. Referring to dakota.input.summary, the method independent controls are those controls defined externally from and prior to the method selection blocks. They are all optional. The method selection blocks are all required group specifications separated by logical OR's. The method dependent controls are those controls defined within the method selection blocks. Defaults for method independent and method dependent controls are defined in DataMethod. The following sections provide additional detail on the method independent controls followed by the method selections and their corresponding method dependent controls.

Method Independent Controls

The method independent controls include a method identifier string, a model type specification with pointers to variables, interface, and responses specifications, a speculative gradient selection, an output verbosity control, maximum iteration and function evaluation limits, constraint and convergence tolerance specifications, a scaling selection, and a set of linear inequality and equality constraint specifications. While not every one of these controls is valid for every method, the controls are valid for enough methods that it was reasonable to pull them out of the method dependent blocks and consolidate the specifications.

The method identifier string is supplied with id_method and is used to provide a unique identifier string for use with strategy specifications (refer to Strategy Description). It is appropriate to omit a method identifier string if only one method is included in the input file and single_method is the selected strategy (all other strategies require one or more method pointers), since the single method to use is unambiguous in this case.

The model pointer string is specified with model_pointer and is used to identify the model used to perform function evaluations for the method. If a model pointer string is specified and no corresponding id is available, DAKOTA will exit with an error message. If no model pointer string is specified, then the last model specification parsed will be used. If no model pointer string is specified and no model specification is provided by the user, then a default model specification is used (similar to the default strategy specification, see Strategy Description). This default model specification is of type single with no variables_pointer, interface_pointer, or responses_pointer (see Single Model Controls). It is appropriate to omit a model specification whenever the relationships are unambiguous due to the presence of single variables, interface, and responses specifications.
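
For example, the following sketch (the identifier strings and the npsol_sqp selection are purely illustrative, not defaults) shows a method that could be referenced by name from a strategy and that evaluates functions through a specific model; the model_pointer in the method block must match the id_model in the model block:

# Sketch only: identifiers and method selection are illustrative
method,
	id_method = 'NLP1'
	model_pointer = 'M1'
	npsol_sqp

model,
	id_model = 'M1'
	single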

When performing gradient-based optimization in parallel, speculative gradients can be selected to address the load imbalance that can occur between gradient evaluation and line search phases. In a typical gradient-based optimization, the line search phase consists primarily of evaluating the objective function and any constraints at a trial point, and then testing the trial point for a sufficient decrease in the objective function value and/or constraint violation. If a sufficient decrease is not observed, then one or more additional trial points may be attempted sequentially. However, if the trial point is accepted then the line search phase is complete and the gradient evaluation phase begins. By speculating that the gradient information associated with a given line search trial point will be used later, additional coarse grained parallelism can be introduced by computing the gradient information (either by finite difference or analytically) in parallel, at the same time as the line search phase trial-point function values. This balances the total amount of computation to be performed at each design point and allows for efficient utilization of multiple processors. While the total amount of work performed will generally increase (since some speculative gradients will not be used when a trial point is rejected in the line search phase), the run time will usually decrease (since gradient evaluations needed at the start of each new optimization cycle were already performed in parallel during the line search phase). Refer to [Byrd et al., 1998] for additional details. The speculative specification is implemented for the gradient-based optimizers in the DOT, CONMIN, and OPT++ libraries, and it can be used with dakota numerical or analytic gradient selections in the responses specification (refer to Gradient Specification for information on these specifications). It should not be selected with vendor numerical gradients since vendor internal finite difference algorithms have not been modified for this purpose. In full-Newton approaches, the Hessian is also computed speculatively. NPSOL and NLSSOL do not support speculative gradients, as their gradient-based line search in user-supplied gradient mode (dakota numerical or analytic gradients) is a superior approach for load-balanced parallel execution.
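
As an illustrative sketch (the dot_sqp selection and responses settings are arbitrary choices, not requirements), speculative gradients are enabled in the method block and paired with dakota-source numerical gradients in the responses block:

# Sketch only: method and responses fragments for speculative gradients
method,
	dot_sqp
	  speculative

responses,
	num_objective_functions = 1
	numerical_gradients
	  method_source dakota
	no_hessians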

Output verbosity control is specified with output followed by silent, quiet, verbose, or debug. If there is no user specification for output verbosity, then the default setting is normal. This gives a total of five output levels to manage the volume of data that is returned to the user during the course of a study, ranging from full run annotation plus internal debug diagnostics (debug) to the bare minimum of output containing little more than the total number of simulations performed and the final solution (silent). Output verbosity is observed within the Iterator (algorithm verbosity), Model (synchronize/fd_gradients verbosity), Interface (map/synch verbosity), Approximation (global data fit coefficient reporting), and AnalysisCode (file operation reporting) class hierarchies; however, not all of these software components observe the full granularity of verbosity settings. Specific mappings are described below.

Note that iterators and interfaces utilize the full granularity in verbosity, whereas models, approximations, and file operations do not. With respect to iterator verbosity, different iterators implement this control in slightly different ways (as described below in the method independent controls descriptions for each iterator); however, the meaning is consistent. For models, interfaces, approximations, and file operations, quiet suppresses parameter and response set reporting and silent further suppresses function evaluation headers and scheduling output. Similarly, verbose adds file management, approximation evaluation, and global approximation coefficient details, and debug further adds diagnostics from nonblocking schedulers.

The constraint_tolerance specification determines the maximum allowable value of infeasibility that any constraint in an optimization problem may possess and still be considered to be satisfied. It is specified as a positive real value. If a constraint function is greater than this value then it is considered to be violated by the optimization algorithm. This specification gives some control over how tightly the constraints will be satisfied at convergence of the algorithm. However, if the value is set too small the algorithm may terminate with one or more constraints being violated. This specification is currently meaningful for the NPSOL, NLSSOL, DOT and CONMIN constrained optimizers (refer to DOT method independent controls and NPSOL method independent controls).

The convergence_tolerance specification provides a real value for controlling the termination of iteration. In most cases, it is a relative convergence tolerance for the objective function; i.e., if the change in the objective function between successive iterations divided by the previous objective function is less than the amount specified by convergence_tolerance, then this convergence criterion is satisfied on the current iteration. Since no progress may be made on one iteration followed by significant progress on a subsequent iteration, some libraries require that the convergence tolerance be satisfied on two or more consecutive iterations prior to termination of iteration. This control is used with optimization and least squares iterators (DOT, CONMIN, NPSOL, NLSSOL, OPT++, and Coliny) and is not used within the uncertainty quantification, design of experiments, or parameter study iterator branches. Refer to DOT method independent controls, NPSOL method independent controls, OPT++ method independent controls, and Coliny method independent controls for specific interpretations of the convergence_tolerance specification.

The max_iterations and max_function_evaluations controls provide integer limits for the maximum number of iterations and maximum number of function evaluations, respectively. The difference between an iteration and a function evaluation is that a function evaluation involves a single parameter to response mapping through an interface, whereas an iteration involves a complete cycle of computation within the iterator. Thus, an iteration generally involves multiple function evaluations (e.g., an iteration contains descent direction and line search computations in gradient-based optimization, population and multiple offset evaluations in nongradient-based optimization, etc.). The max_function_evaluations control is not currently used within the uncertainty quantification, design of experiments, and parameter study iterator branches, and in the case of gradient-based methods, does not currently capture function evaluations that occur as part of the method_source dakota finite difference routine (since these additional evaluations are intentionally isolated from the iterators).

Continuous design variable, function, and constraint scaling can be turned on for optimizers and least squares minimizers by providing the scaling keyword. Discrete variable scaling is not supported. When scaling is enabled, variables, functions, gradients, Hessians, etc., are transformed such that the optimizer iterates in scaled variable space, whereas evaluations of the computational model as specified in the interface are performed on the original problem scale. Therefore, using scaling does not require rewriting the interface to the simulation code. The user may specify no, one, or a vector of scaling type strings through each of the scale_types (see Variables Commands); objective_function_scale_types, least_squares_term_scale_types, nonlinear_inequality_scale_types, nonlinear_equality_scale_types (see Function Specification); linear_inequality_scale_types, and linear_equality_scale_types (see Method Independent Controls below) specifications. Valid options for types include 'none' (default), 'value', 'auto', or 'log', for no, characteristic value, automatic, or logarithmic scaling, respectively, although not all types are valid for scaling all entities (see the references for details). If a single string is specified using any of these keywords it will apply to each component of the relevant vector, e.g., scale_types = 'value' will enable characteristic value scaling for each continuous design variable. The user may specify no, one, or a vector of nonzero characteristic scale values through each of the scales (see Variables Commands); objective_function_scales, least_squares_term_scales, nonlinear_inequality_scales, nonlinear_equality_scales (see Function Specification); linear_inequality_scales, and linear_equality_scales (see Method Independent Controls below) specifications. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto' and 'log'. If a single value is specified using any of these keywords it will apply to each component of the relevant vector, e.g., scales = 3.0 will apply a characteristic scaling value of 3.0 to each continuous design variable. When the scaling keyword is omitted, all *_scale_types and *_scales specifications are ignored in the method, variables, and responses sections.
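
For example, the following sketch (the conmin_mfd selection and the characteristic values are illustrative only) enables scaling in the method block and supplies characteristic value scaling for two continuous design variables:

# Sketch only: values below are illustrative, not defaults
method,
	conmin_mfd
	  scaling

variables,
	continuous_design = 2
	  scale_types = 'value'
	  scales = 10.0 0.1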

When scaling is enabled, the following procedures determine the transformations used to scale each component of a variables or response vector. A warning is issued if scaling would result in division by a value smaller in magnitude than 1.0e10*DBL_MIN. User-provided values violating this lower bound are accepted unaltered, whereas for automatically calculated scaling, the lower bound is enforced.

Table 5.1 provides the specification detail for the method independent controls involving identifiers, pointers, tolerances, limits, output verbosity, speculative gradients, and scaling.

Table 5.1 Specification detail for the method independent controls: identifiers, pointers, tolerances, limits, output verbosity, speculative gradients, and scaling
Description Keyword Associated Data Status Default
Method set identifier id_method string Optional strategy use of last method parsed
Model pointer model_pointer string Optional method use of last model parsed (or use of default model if none parsed)
Speculative gradients and Hessians speculative none Optional no speculation
Output verbosity output silent | quiet | verbose | debug Optional normal
Maximum iterations max_iterations integer Optional 100 (exceptions: fsu_cvt/nond_local_reliability: 25, nond_global_{reliability,interval_est,evidence}/efficient_global: 25*n)
Maximum function evaluations max_function_evaluations integer Optional 1000
Constraint tolerance constraint_tolerance real Optional Library default
Convergence tolerance convergence_tolerance real Optional 1.e-4
Scaling flag scaling none Optional no scaling

Linear inequality constraints can be supplied with the linear_inequality_constraint_matrix, linear_inequality_lower_bounds, and linear_inequality_upper_bounds specifications, and linear equality constraints can be supplied with the linear_equality_constraint_matrix and linear_equality_targets specifications. In the inequality case, the constraint matrix provides coefficients for the variables and the lower and upper bounds provide constraint limits for the following two-sided formulation:

\[a_l \leq Ax \leq a_u\]

As with nonlinear inequality constraints (see Objective and constraint functions (optimization data set)), the default linear inequality constraint bounds are selected so that one-sided inequalities of the form

\[Ax \leq 0.0\]

result when there are no user bounds specifications (this provides backwards compatibility with previous DAKOTA versions). In a user bounds specification, any upper bound values greater than +bigRealBoundSize (1.e+30, as defined in Minimizer) are treated as +infinity and any lower bound values less than -bigRealBoundSize are treated as -infinity. This feature is commonly used to drop one of the bounds in order to specify a 1-sided constraint (just as the default lower bounds drop out since -DBL_MAX < -bigRealBoundSize). In the equality case, the constraint matrix again provides coefficients for the variables and the targets provide the equality constraint right hand sides:

\[Ax = a_t\]

and the defaults for the equality constraint targets enforce a value of 0. for each constraint

\[Ax = 0.0\]

Currently, DOT, CONMIN, NPSOL, NLSSOL, and OPT++ all support specialized handling of linear constraints (either directly through the algorithm itself or indirectly through the DAKOTA wrapper). Coliny optimizers will support linear constraints in future releases. Linear constraints need not be computed by the user's interface on every function evaluation; rather the coefficients, bounds, and targets of the linear constraints can be provided at start up, allowing the optimizers to track the linear constraints internally. It is important to recognize that linear constraints are those constraints that are linear in the design variables, e.g.:

\[0.0 \leq 3x_1 - 4x_2 + 2x_3 \leq 15.0\]

\[x_1 + x_2 + x_3 \geq 2.0\]

\[x_1 + x_2 - x_3 = 1.0\]

which is not to be confused with something like

\[s(X) - s_{fail} \leq 0.0\]

where the constraint is linear in a response quantity, but may be a nonlinear implicit function of the design variables. For the three linear constraints above, the specification would appear as:

linear_inequality_constraint_matrix =  3.0 -4.0  2.0
                                       1.0  1.0  1.0
linear_inequality_lower_bounds =       0.0  2.0
linear_inequality_upper_bounds =      15.0  1.e+50
linear_equality_constraint_matrix =    1.0  1.0 -1.0
linear_equality_targets =              1.0
where the 1.e+50 is a dummy upper bound value which defines a 1-sided inequality since it is greater than bigRealBoundSize. The constraint matrix specifications list the coefficients of the first constraint followed by the coefficients of the second constraint, and so on. They are divided into individual constraints based on the number of design variables, and can be broken onto multiple lines for readability as shown above.

The linear_inequality_scale_types and linear_equality_scale_types specifications provide strings specifying the scaling type for each linear inequality or equality constraint, respectively, in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in linear_*_scale_types may be selected from 'none', 'value', or 'auto' to select no, characteristic value, or automatic scaling, respectively. If a single string is specified it will apply to each constraint component. Each entry in linear_inequality_scales or linear_equality_scales may be a user-specified nonzero characteristic value to be used in scaling each constraint. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto'. If a single real value is specified it will apply to all components of the constraint. Scaling for linear constraints is applied after any continuous variable scaling. For example, for variable scaling on continuous design variables x:

\[ \tilde{x}^j = \frac{x^j - x^j_O}{x^j_M} \]

we have the following system for linear inequality constraints

\[ a_L \leq A_i x \leq a_U \]

\[ a_L \leq A_i \left( \mathrm{diag}(x_M) \tilde{x} + x_O \right) \leq a_U \]

\[ a_L - A_i x_O \leq A_i \mathrm{diag}(x_M) \tilde{x} \leq a_U - A_i x_O \]

\[ \tilde{a}_L \leq \tilde{A}_i \tilde{x} \leq \tilde{a}_U \]

and user-specified or automatically computed scaling multipliers are applied to this final transformed system, which accounts for continuous design variable scaling. When automatic scaling is in use for linear constraints, they are linearly scaled by a computed characteristic value, but not affinely scaled to the interval [0,1].
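
As a brief worked illustration with arbitrary numbers, consider a single linear inequality constraint with bounds a_L = 0 and a_U = 10 on the sum of two variables (so A_i = [1 1]), together with scaling offsets x_O = (1, 2) and multipliers x_M = (2, 4). Substituting into the transformed system above yields

\[ 0 - 3 \leq 2\tilde{x}_1 + 4\tilde{x}_2 \leq 10 - 3 \]

that is, a scaled-variable constraint with bounds -3 and 7, to which any user-specified or automatically computed constraint scaling multipliers are subsequently applied.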

Table 5.2 provides the specification detail for the method independent controls involving linear constraints.

Table 5.2 Specification detail for the method independent controls: linear inequality and equality constraints
Description Keyword Associated Data Status Default
Linear inequality coefficient matrix linear_inequality_constraint_matrix list of reals Optional no linear inequality constraints
Linear inequality lower bounds linear_inequality_lower_bounds list of reals Optional vector values = -DBL_MAX
Linear inequality upper bounds linear_inequality_upper_bounds list of reals Optional vector values = 0.
Linear inequality scaling types linear_inequality_scale_types list of strings Optional vector values = 'none'
Linear inequality scales linear_inequality_scales list of reals Optional vector values = 1. (no scaling)
Linear equality coefficient matrix linear_equality_constraint_matrix list of reals Optional no linear equality constraints
Linear equality targets linear_equality_targets list of reals Optional vector values = 0.
Linear equality scaling types linear_equality_scale_types list of strings Optional vector values = 'none'
Linear equality scales linear_equality_scales list of reals Optional vector values = 1. (no scaling)

Optimization Methods

The DAKOTA project started as a toolbox for optimization methods, and has accumulated a broad variety of gradient-based and nongradient-based optimizers from the DOT, NPSOL, NLPQL, CONMIN, OPT++, APPS, COLINY, NCSU, and JEGA packages. These capabilities are described below.

DOT Methods

The DOT library [Vanderplaats Research and Development, 1995] contains nonlinear programming optimizers, specifically the Broyden-Fletcher-Goldfarb-Shanno (DAKOTA's dot_bfgs method) and Fletcher-Reeves conjugate gradient (DAKOTA's dot_frcg method) methods for unconstrained optimization, and the modified method of feasible directions (DAKOTA's dot_mmfd method), sequential linear programming (DAKOTA's dot_slp method), and sequential quadratic programming (DAKOTA's dot_sqp method) methods for constrained optimization. DAKOTA provides access to the DOT library through the DOTOptimizer class.

DOT method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during a DOT optimization. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. This convergence criterion must be satisfied for two consecutive iterations before DOT will terminate. The constraint_tolerance specification defines how tightly constraint functions are to be satisfied at convergence. The default value for DOT constrained optimizers is 0.003. Extremely small values for constraint_tolerance may not be attainable. The output verbosity specification controls the amount of information generated by DOT: the silent and quiet settings result in header information, final results, and objective function, constraint, and parameter information on each iteration; whereas the verbose and debug settings add additional information on gradients, search direction, one-dimensional search results, and parameter scaling factors. DOT contains no parallel algorithms which can directly take advantage of concurrent evaluations. However, if numerical_gradients with method_source dakota is specified, then the finite difference function evaluations can be performed concurrently (using any of the parallel modes described in the Users Manual [Adams et al., 2010]). In addition, if speculative is specified, then gradients (dakota numerical or analytic gradients) will be computed on each line search evaluation in order to balance the load and lower the total run time in parallel optimization studies. Lastly, specialized handling of linear constraints is supported with DOT; linear constraint coefficients, bounds, and targets can be provided to DOT at start-up and tracked internally. Specification detail for these method independent controls is provided in Tables 5.1 through 5.2.

DOT method dependent controls

DOT's only method dependent control is optimization_type which may be either minimize or maximize. DOT provides the only set of methods within DAKOTA which support this control; to convert a maximization problem into the minimization formulation assumed by other methods, simply change the sign on the objective function (i.e., multiply by -1). Table 5.3 provides the specification detail for the DOT methods and their method dependent controls.

Table 5.3 Specification detail for the DOT methods
Description Keyword Associated Data Status Default
Optimization type optimization_type minimize | maximize Optional group minimize
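
For example, a maximization study using the DOT conjugate gradient method could be specified as follows (a sketch; the dot_frcg selection is arbitrary):

# Sketch only
method,
	dot_frcg
	  optimization_type maximize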

NPSOL Method

The NPSOL library [Gill et al., 1986] contains a sequential quadratic programming (SQP) implementation (the npsol_sqp method). SQP is a nonlinear programming optimizer for constrained minimization. DAKOTA provides access to the NPSOL library through the NPSOLOptimizer class.

NPSOL method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major SQP iterations and the number of function evaluations that can be performed during an NPSOL optimization. The convergence_tolerance control defines NPSOL's internal optimality tolerance which is used in evaluating if an iterate satisfies the first-order Kuhn-Tucker conditions for a minimum. The magnitude of convergence_tolerance approximately specifies the number of significant digits of accuracy desired in the final objective function (e.g., convergence_tolerance = 1.e-6 will result in approximately six digits of accuracy in the final objective function). The constraint_tolerance control defines how tightly the constraint functions are satisfied at convergence. The default value is dependent upon the machine precision of the platform in use, but is typically on the order of 1.e-8 for double precision computations. Extremely small values for constraint_tolerance may not be attainable. The output verbosity setting controls the amount of information generated at each major SQP iteration: the silent and quiet settings result in only one line of diagnostic output for each major iteration and print the final optimization solution, whereas the verbose and debug settings add additional information on the objective function, constraints, and variables at each major iteration.

NPSOL is not a parallel algorithm and cannot directly take advantage of concurrent evaluations. However, if numerical_gradients with method_source dakota is specified, then the finite difference function evaluations can be performed concurrently (using any of the parallel modes described in the Users Manual [Adams et al., 2010]). An important related observation is the fact that NPSOL uses two different line searches depending on how gradients are computed. For either analytic_gradients or numerical_gradients with method_source dakota, NPSOL is placed in user-supplied gradient mode (NPSOL's "Derivative Level" is set to 3) and it uses a gradient-based line search (the assumption is that user-supplied gradients are inexpensive). On the other hand, if numerical_gradients are selected with method_source vendor, then NPSOL is computing finite differences internally and it will use a value-based line search (the assumption is that finite differencing on each line search evaluation is too expensive). The ramifications of this are: (1) performance will vary between method_source dakota and method_source vendor for numerical_gradients, and (2) gradient speculation is unnecessary when performing optimization in parallel since the gradient-based line search in user-supplied gradient mode is already load balanced for parallel execution. Therefore, a speculative specification will be ignored by NPSOL, and optimization with numerical gradients should select method_source dakota for load balanced parallel operation and method_source vendor for efficient serial operation.

Lastly, NPSOL supports specialized handling of linear inequality and equality constraints. By specifying the coefficients and bounds of the linear inequality constraints and the coefficients and targets of the linear equality constraints, this information can be provided to NPSOL at initialization and tracked internally, removing the need for the user to provide the values of the linear constraints on every function evaluation. Refer to Method Independent Controls for additional information and to Tables 5.1 through 5.2 for method independent control specification detail.

NPSOL method dependent controls

NPSOL's method dependent controls are verify_level, function_precision, and linesearch_tolerance. The verify_level control instructs NPSOL to perform finite difference verifications on user-supplied gradient components. The function_precision control provides NPSOL with an estimate of the accuracy to which the problem functions can be computed. This is used to prevent NPSOL from trying to distinguish between function values that differ by less than the inherent error in the calculation. The linesearch_tolerance setting controls the accuracy of the line search. The smaller the value (between 0 and 1), the more accurately NPSOL will attempt to compute a precise minimum along the search direction. Table 5.4 provides the specification detail for the NPSOL SQP method and its method dependent controls.

Table 5.4 Specification detail for the NPSOL SQP method
Description Keyword Associated Data Status Default
Gradient verification level verify_level integer Optional -1 (no gradient verification)
Function precision function_precision real Optional 1.e-10
Line search tolerance linesearch_tolerance real Optional 0.9 (inaccurate line search)
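
A sketch combining these controls might read as follows, where the particular values are illustrative rather than recommendations:

# Sketch only: values illustrative
method,
	npsol_sqp
	  convergence_tolerance = 1.e-6
	  verify_level = 1
	  function_precision = 1.e-10
	  linesearch_tolerance = 0.1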

NLPQL Methods

The NLPQL library is a commercially-licensed library containing a sequential quadratic programming (SQP) optimizer, specified as DAKOTA's nlpql_sqp method, for constrained optimization. The particular implementation used is NLPQLP [Schittkowski, 2004], a variant with distributed and non-monotone line search. DAKOTA provides access to the NLPQL library through the NLPQLPOptimizer class.

NLPQL method independent controls

The method independent controls for maximum iterations and output verbosity are mapped to NLPQL controls MAXIT and IPRINT, respectively. The maximum number of function evaluations is enforced within the NLPQLPOptimizer class.

NLPQL method dependent controls

NLPQL does not currently support any method dependent controls.

CONMIN Methods

The CONMIN library [Vanderplaats, 1973] is a public domain library of nonlinear programming optimizers, specifically the Fletcher-Reeves conjugate gradient (DAKOTA's conmin_frcg method) method for unconstrained optimization, and the method of feasible directions (DAKOTA's conmin_mfd method) for constrained optimization. As CONMIN was a predecessor to the DOT commercial library, the algorithm controls are very similar. DAKOTA provides access to the CONMIN library through the CONMINOptimizer class.

CONMIN method independent controls

The interpretations of the method independent controls for CONMIN are essentially identical to those for DOT. Therefore, the discussion in DOT method independent controls is relevant for CONMIN.

CONMIN method dependent controls

CONMIN does not currently support any method dependent controls.

OPT++ Methods

The OPT++ library [Meza et al., 2007] contains primarily gradient-based nonlinear programming optimizers for unconstrained, bound-constrained, and nonlinearly constrained minimization: Polak-Ribiere conjugate gradient (DAKOTA's optpp_cg method), quasi-Newton (DAKOTA's optpp_q_newton method), finite difference Newton (DAKOTA's optpp_fd_newton method), and full Newton (DAKOTA's optpp_newton method). The conjugate gradient method is strictly unconstrained, and each of the Newton-based methods are automatically bound to the appropriate OPT++ algorithm based on the user constraint specification (unconstrained, bound-constrained, or generally-constrained). In the generally-constrained case, the Newton methods use a nonlinear interior-point approach to manage the constraints. The library also contains a direct search algorithm, PDS (parallel direct search, DAKOTA's optpp_pds method), which supports bound constraints. DAKOTA provides access to the OPT++ library through the SNLLOptimizer class, where "SNLL" denotes Sandia National Laboratories - Livermore.

OPT++ method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during an OPT++ optimization. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. The output verbosity specification controls the amount of information generated from OPT++ executions: the debug setting turns on OPT++'s internal debug mode and also generates additional debugging information from DAKOTA's SNLLOptimizer wrapper class. OPT++'s gradient-based methods are not parallel algorithms and cannot directly take advantage of concurrent function evaluations. However, if numerical_gradients with method_source dakota is specified, a parallel DAKOTA configuration can utilize concurrent evaluations for the finite difference gradient computations. OPT++'s nongradient-based PDS method can directly exploit asynchronous evaluations; however, this capability has not yet been implemented in the SNLLOptimizer class.

The speculative specification enables speculative computation of gradient and/or Hessian information, where applicable, for parallel optimization studies. By speculating that the derivative information at the current point will be used later, the complete data set (all available gradient/Hessian information) can be computed on every function evaluation. While some of these computations will be wasted, the positive effects are a consistent parallel load balance and usually shorter wall clock time. The speculative specification is applicable only when parallelism in the gradient calculations can be exploited by DAKOTA (it will be ignored for vendor numerical gradients).

Lastly, linear constraint specifications are supported by each of the Newton methods (optpp_newton, optpp_q_newton, optpp_fd_newton, and optpp_g_newton); whereas optpp_cg must be unconstrained and optpp_pds can be, at most, bound-constrained. Specification detail for the method independent controls is provided in Tables 5.1 through 5.2.

OPT++ method dependent controls

OPT++'s method dependent controls are max_step, gradient_tolerance, search_method, merit_function, central_path, steplength_to_boundary, centering_parameter, and search_scheme_size. The max_step control specifies the maximum step that can be taken when computing a change in the current design point (e.g., limiting the Newton step computed from current gradient and Hessian information). It is equivalent to a move limit or a maximum trust region size. The gradient_tolerance control defines the threshold value on the L2 norm of the objective function gradient that indicates convergence to an unconstrained minimum (no active constraints). The gradient_tolerance control is defined for all gradient-based optimizers.

max_step and gradient_tolerance are the only method dependent controls for the OPT++ conjugate gradient method. Table 5.5 covers this specification.

Table 5.5 Specification detail for the OPT++ conjugate gradient method
Description Keyword Associated Data Status Default
OPT++ conjugate gradient method optpp_cg none Required N/A
Maximum step size max_step real Optional 1000.
Gradient tolerance gradient_tolerance real Optional 1.e-4

The search_method control is defined for all Newton-based optimizers and is used to select between trust_region, gradient_based_line_search, and value_based_line_search methods. The gradient_based_line_search option uses the line search method proposed by [More and Thuente, 1994]. This option satisfies sufficient decrease and curvature conditions, whereas value_based_line_search satisfies only the sufficient decrease condition. At each line search iteration, the gradient_based_line_search method computes the function and gradient at the trial point. Consequently, given expensive function evaluations, the value_based_line_search method is preferred to the gradient_based_line_search method. Each of these Newton methods additionally supports the tr_pds selection for unconstrained problems. This option performs a robust trust region search using pattern search techniques. Use of a line search is the default for bound-constrained and generally-constrained problems, and use of a trust_region search method is the default for unconstrained problems.

The merit_function, central_path, steplength_to_boundary, and centering_parameter selections are additional specifications that are defined for the solution of generally-constrained problems with nonlinear interior-point algorithms. A merit_function is a function in constrained optimization that attempts to provide joint progress toward reducing the objective function and satisfying the constraints. Valid string inputs are "el_bakry", "argaez_tapia", or "van_shanno", where user input is not case sensitive. Details for these selections are as follows:

If the function evaluation is expensive or noisy, set the merit_function to "argaez_tapia" or "van_shanno".

The central_path specification represents a measure of proximity to the central path and specifies an update strategy for the perturbation parameter mu. Refer to [Argaez et al., 2002] for a detailed discussion on proximity measures to the central region. Valid options are, again, "el_bakry", "argaez_tapia", or "van_shanno", where user input is not case sensitive. The default value for central_path is the value of merit_function (either user-selected or default). The steplength_to_boundary specification is a parameter (between 0 and 1) that controls how close to the boundary of the feasible region the algorithm is allowed to move. A value of 1 means that the algorithm is allowed to take steps that may reach the boundary of the feasible region. If the user wishes to maintain strict feasibility of the design parameters this value should be less than 1. Default values are .8, .99995, and .95 for the "el_bakry", "argaez_tapia", and "van_shanno" merit functions, respectively. The centering_parameter specification is a parameter (between 0 and 1) that controls how closely the algorithm should follow the "central path". See [Wright] for the definition of central path. The larger the value, the more closely the algorithm follows the central path, which results in small steps. A value of 0 indicates that the algorithm will take a pure Newton step. Default values are .2, .2, and .1 for the "el_bakry", "argaez_tapia", and "van_shanno" merit functions, respectively.

Table 5.6 provides the details for the Newton-based methods.

Table 5.6 Specification detail for OPT++ Newton-based optimization methods
Description Keyword Associated Data Status Default
OPT++ Newton-based methods optpp_q_newton | optpp_fd_newton | optpp_newton none Required group N/A
Search method value_based_line_search | gradient_based_line_search | trust_region | tr_pds none Optional group trust_region (unconstrained), value_based_line_search (bound/general constraints)
Maximum step size max_step real Optional 1000.
Gradient tolerance gradient_tolerance real Optional 1.e-4
Merit function merit_function string Optional "argaez_tapia"
Central path central_path string Optional value of merit_function
Steplength to boundary steplength_to_boundary real Optional Merit function dependent: 0.8 (el_bakry), 0.99995 (argaez_tapia), 0.95 (van_shanno)
Centering parameter centering_parameter real Optional Merit function dependent: 0.2 (el_bakry), 0.2 (argaez_tapia), 0.1 (van_shanno)
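
For a generally-constrained problem, a sketch exercising several of these controls might appear as follows (all values are illustrative; the defaults in Table 5.6 apply when a control is omitted):

# Sketch only: values illustrative
method,
	optpp_q_newton
	  search_method value_based_line_search
	  merit_function "argaez_tapia"
	  steplength_to_boundary = 0.99995
	  centering_parameter = 0.2
	  max_step = 100.
	  gradient_tolerance = 1.e-5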

The search_scheme_size is defined for the PDS method to specify the number of points to be used in the direct search template. PDS does not support parallelism at this time due to current limitations in the OPT++ interface. Table 5.7 provides the detail for the parallel direct search method.

Table 5.7 Specification detail for the OPT++ PDS method
Description Keyword Associated Data Status Default
OPT++ parallel direct search method optpp_pds none Required group N/A
Search scheme size search_scheme_size integer Optional 32
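
A minimal sketch of a PDS specification with an enlarged search template (the value 64 is illustrative) is:

# Sketch only
method,
	optpp_pds
	  search_scheme_size = 64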

Asynchronous Parallel Pattern Search (APPS)

The asynchronous parallel pattern search algorithm [Gray and Kolda, 2006] is a fully asynchronous pattern search technique in that the search along each offset direction continues without waiting for searches along other directions to finish. APPSPACK can handle unconstrained problems as well as those with bound constraints, linear constraints, and general nonlinear constraints. APPSPACK is available to the public under the GNU LGPL and the source code is included with DAKOTA. APPS-specific software documentation is available from http://software.sandia.gov/appspack.

APPS method independent controls

The only method independent controls that are currently mapped to APPS are max_function_evaluations, constraint_tolerance, and the output verbosity control. The APPS internal "debug" level is mapped to the DAKOTA debug, verbose, normal, quiet, and silent settings as follows:

APPS exploits parallelism through the use of DAKOTA's concurrent function evaluations. The nature of the algorithm, however, limits the amount of concurrency that can be exploited. In particular, APPS can leverage an evaluation concurrency level of at most twice the number of variables.

APPS method dependent controls

The APPS method is invoked using an asynch_pattern_search group specification. Some of the method dependent controls are similar to the Coliny controls for coliny_pattern_search described in Pattern Search. In particular, APPS supports the step length control parameters initial_delta, threshold_delta, and contraction_factor listed in Table 5.8.

When the solution to the optimization problem is known to be zero, the user may specify a value for solution_target as a termination criterion. APPS will terminate when the function value falls below solution_target.

Currently, APPS only supports coordinate bases with a total of 2n function evaluations in the pattern, and these patterns may only contract. The synchronization specification can be used to specify the use of either blocking or nonblocking schedulers for APPS. The blocking option causes APPS to behave as a synchronous algorithm. The nonblocking option is not available when DAKOTA is used in message-passing mode.

APPS solves nonlinearly constrained problems by solving a sequence of linearly constrained, merit-function-based subproblems. There are several exact and smoothed exact penalty functions that can be specified with the merit_function control; the available options (see Table 5.8) are merit_max, merit_max_smooth, merit1, merit1_smooth, merit2, merit2_smooth, and merit2_squared.

The user can also specify a constraint_penalty and smoothing_factor. Table 5.8 summarizes the APPS specification.

Table 5.8 Specification detail for the APPS method
Description Keyword Associated Data Status Default
APPS method asynch_pattern_search none Required group N/A
Initial offset value initial_delta real Optional 1.0
Threshold for offset values threshold_delta real Optional 0.01
Pattern contraction factor contraction_factor real Optional 0.5
Solution target solution_target real Optional not used
Evaluation synchronization synchronization blocking | nonblocking Optional nonblocking
Merit function merit_function merit_max | merit_max_smooth | merit1 | merit1_smooth | merit2 | merit2_smooth | merit2_squared Optional merit2_smooth
Constraint penalty constraint_penalty real Optional 1.0
Smoothing factor smoothing_factor real Optional 1.0
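
Drawing these controls together, an illustrative sketch of an APPS specification (values chosen for illustration only) is:

# Sketch only: values illustrative
method,
	asynch_pattern_search
	  initial_delta = 0.5
	  threshold_delta = 1.e-4
	  contraction_factor = 0.5
	  synchronization blocking
	  merit_function merit2_smooth
	  constraint_penalty = 1.0
	  smoothing_factor = 1.0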

Coliny Methods

Coliny is a collection of nongradient-based optimizers that support the Common Optimization Library INterface (COLIN). Coliny optimizers currently include coliny_cobyla, coliny_direct, coliny_ea, coliny_pattern_search and coliny_solis_wets. Additional Coliny information is available from http://software.sandia.gov/Acro/Coliny/.

Coliny solvers now support bound constraints and general nonlinear constraints. Supported nonlinear constraints include both equality and two-sided inequality constraints. Coliny solvers do not yet support linear constraints. Most Coliny optimizers treat constraints with a simple penalty scheme that adds constraint_penalty times the sum of squares of the constraint violations to the objective function. Specific exceptions to this method for handling constraint violations are noted below. (The default value of constraint_penalty is 1000.0, except for methods that dynamically adapt their constraint penalty, for which the default value is 1.0.)

Coliny method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of major iterations and the number of function evaluations that can be performed during a Coliny optimization, respectively. The convergence_tolerance control defines the threshold value on relative change in the objective function that indicates convergence. The output verbosity specification controls the amount of information generated by Coliny: the silent, quiet, and normal settings correspond to minimal reporting from Coliny, whereas the verbose setting corresponds to a higher level of information, and debug outputs method initialization and a variety of internal Coliny diagnostics. The majority of Coliny's methods perform independent function evaluations that can directly take advantage of DAKOTA's parallel capabilities. Only coliny_solis_wets, coliny_cobyla, and certain configurations of coliny_pattern_search are inherently serial (see Pattern Search). The parallel methods automatically utilize parallel logic when the DAKOTA configuration supports parallelism. Lastly, neither speculative gradients nor linear constraints are currently supported with Coliny. Specification detail for method independent controls is provided in Tables 5.1 through 5.2.

Some Coliny methods exploit parallelism through the use of DAKOTA's concurrent function evaluations. The nature of the algorithms, however, limits the amount of concurrency that can be exploited. The maximum amount of evaluation concurrency that can be leveraged by the various methods is as follows:

Coliny method dependent controls

All Coliny methods support the show_misc_options optional specification which results in a dump of all the allowable method inputs. Note that the information provided by this command refers to optimizer parameters that are internal to Coliny, and which may differ from corresponding parameters used by the DAKOTA interface. The misc_options optional specification provides a means for inputting additional settings supported by the Coliny methods but which are not currently mapped through the DAKOTA input specification. Care must be taken in using this specification; it should only be employed by users who are familiar with the full range of parameter specifications available directly from Coliny and who understand any differences that exist between those specifications and the ones available through DAKOTA.

Each of the Coliny methods supports the solution_target control, which defines a convergence criterion in which the optimizer will terminate if it finds an objective function value lower than the specified target. Specification detail for method dependent controls for all Coliny methods is provided in Table 5.9.

Table 5.9 Specification detail for Coliny method dependent controls
Description Keyword Associated Data Status Default
Show miscellaneous options show_misc_options none Optional no dump of specification options
Specify miscellaneous options misc_options list of strings Optional no miscellaneous options specified
Desired solution target solution_target real Optional -DBL_MAX

Each Coliny method supplements the settings of Table 5.9 with controls which are specific to its particular class of method.

COBYLA

The Constrained Optimization BY Linear Approximations (COBYLA) algorithm is an extension to the Nelder-Mead simplex algorithm for handling general linear/nonlinear constraints and is invoked using the coliny_cobyla group specification. The COBYLA algorithm employs linear approximations to the objective and constraint functions, the approximations being formed by linear interpolation at N+1 points in the space of the variables. These interpolation points are regarded as vertices of a simplex. The step length parameter controls the size of the simplex, and it is reduced automatically from initial_delta to threshold_delta. One advantage that COBYLA has over many of its competitors is that it treats each constraint individually when calculating a change to the variables, instead of lumping the constraints together into a single penalty function.

COBYLA currently only supports termination based on the max_function_evaluations and solution_target specifications. The search performed by COBYLA is currently not parallelized.

Table 5.10 summarizes the COBYLA specification.

Table 5.10 Specification detail for the COBYLA method
Description Keyword Associated Data Status Default
COBYLA method coliny_cobyla none Required group N/A
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A
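
Since initial_delta and threshold_delta are required, a minimal COBYLA sketch (with illustrative values) is:

# Sketch only: values illustrative
method,
	coliny_cobyla
	  initial_delta = 0.1
	  threshold_delta = 1.e-4
	  max_function_evaluations = 200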

DIRECT

The DIviding RECTangles (DIRECT) optimization algorithm is a derivative free global optimization method that balances local search in promising regions of the design space with global search in unexplored regions. As shown in Figure 5.1, DIRECT adaptively subdivides the space of feasible design points so as to guarantee that iterates are generated in the neighborhood of a global minimum in finitely many iterations.

Figure 5.1 Design space partitioning with DIRECT

In practice, DIRECT has proven an effective heuristic for engineering design applications, for which it is able to quickly identify candidate solutions that can be further refined with fast local optimizers.

DIRECT uses the solution_target, constraint_penalty and show_misc_options specifications that are described in Coliny method dependent controls. Note, however, that DIRECT uses a fixed penalty value for constraint violations (i.e. it is not dynamically adapted as is done in coliny_pattern_search).

The division specification determines how DIRECT subdivides each subregion of the search space. If division is set to major_dimension, then the dimension representing the longest edge of the subregion is subdivided (this is the default). If division is set to all_dimensions, then all dimensions are simultaneously subdivided.

Each subregion considered by DIRECT has a size, which corresponds to the longest diagonal of the subregion. The global_balance_parameter controls how much global search is performed by only allowing a subregion to be subdivided if the size of the subregion divided by the size of the largest subregion is at least global_balance_parameter. Intuitively, this forces large subregions to be subdivided before the smallest subregions are refined. The local_balance_parameter provides a tolerance for estimating whether the smallest subregion can provide a sufficient decrease to be worth subdividing; the default value is a small value that is suitable for most applications.

DIRECT can be terminated with the standard max_function_evaluations and solution_target specifications. Additionally, the max_boxsize_limit specification terminates DIRECT if the size of the largest subregion falls below this threshold, and the min_boxsize_limit specification terminates DIRECT if the size of the smallest subregion falls below this threshold. In practice, this latter specification is likely to be more effective at limiting DIRECT's search.

Table 5.11 summarizes the DIRECT specification.

Table 5.11 Specification detail for the DIRECT method
Description Keyword Associated Data Status Default
DIRECT method coliny_direct none Required group N/A
Box subdivision approach division major_dimension | all_dimensions Optional group major_dimension
Global search balancing parameter global_balance_parameter real Optional 0.0
Local search balancing parameter local_balance_parameter real Optional 1.e-8
Maximum boxsize limit max_boxsize_limit real Optional 0.0
Minimum boxsize limit min_boxsize_limit real Optional 0.0001
Constraint penalty constraint_penalty real Optional 1000.0
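
An illustrative sketch of a DIRECT specification (values are examples, not recommendations) is:

# Sketch only: values illustrative
method,
	coliny_direct
	  division all_dimensions
	  global_balance_parameter = 0.01
	  min_boxsize_limit = 1.e-4
	  max_function_evaluations = 1000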

Evolutionary Algorithms

DAKOTA currently provides several variants of evolutionary algorithms, invoked through the coliny_ea group specification.

The basic steps of an evolutionary algorithm are depicted in Figure 5.2.

Figure 5.2 Depiction of evolutionary algorithm

They can be enumerated as follows:

  1. Select an initial population randomly and perform function evaluations on these individuals
  2. Perform selection for parents based on relative fitness
  3. Apply crossover and mutation to generate new_solutions_generated new individuals from the selected parents
  4. Perform function evaluations on the new individuals
  5. Perform replacement to determine the new population
  6. Return to step 2 and continue the algorithm until convergence criteria are satisfied or iteration limits are exceeded

Table 5.12 provides the specification detail for the controls for seeding the method, initializing a population, and for selecting and replacing population members.

Table 5.12 Specification detail for the Coliny EA method dependent controls: seed, initialization, selection, and replacement
Description Keyword Associated Data Status Default
EA selection coliny_ea none Required group N/A
Random seed seed integer Optional randomly generated seed
Number of population members population_size integer Optional 50
Initialization type initialization_type simple_random | unique_random | flat_file Optional group unique_random
Fitness type fitness_type linear_rank | merit_function Optional linear_rank
Replacement type replacement_type random | chc | elitist Optional group elitist = 1
Random replacement type random integer Required N/A
CHC replacement type chc integer Required N/A
Elitist replacement type elitist integer Required N/A
New solutions generated new_solutions_generated integer Optional population_size - replacement_size

The random seed control provides a mechanism for making a stochastic optimization repeatable. That is, the use of the same random seed in identical studies will generate identical results. The population_size control specifies how many individuals will comprise the EA's population.

The initialization_type defines the type of initialization for the population of the EA. There are three types: simple_random, unique_random, and flat_file. simple_random creates initial solutions with random variable values according to a uniform random number distribution. It gives no consideration to any previously generated designs. The number of designs is specified by the population_size. unique_random is the same as simple_random, except that when a new solution is generated, it is checked against the rest of the solutions. If it duplicates any of them, it is rejected. flat_file allows the initial population to be read from a flat file. If flat_file is specified, a file name must be given.

The fitness_type controls how strongly differences in "fitness" (i.e., the objective function) are weighted in the process of selecting "parents" for crossover; the available selections are linear_rank and merit_function (see Table 5.12).

The replacement_type controls how current populations and newly generated individuals are combined to create a new population. Each of the replacement_type selections (random, chc, and elitist) accepts an integer value, which is referred to below and in Table 5.12 as the replacement_size.

Table 5.13 shows the controls in the EA method associated with crossover and mutation.

Table 5.13 Specification detail for the Coliny EA method: crossover and mutation
Description Keyword Associated Data Status Default
Crossover type crossover_type two_point | blend | uniform Optional group two_point
Crossover rate crossover_rate real Optional 0.8
Mutation type mutation_type replace_uniform | offset_normal | offset_cauchy | offset_uniform Optional group offset_normal
Mutation scale mutation_scale real Optional 0.1
Mutation range mutation_range integer Optional 1
Mutation dimension ratio dimension_ratio real Optional 1.0
Mutation rate mutation_rate real Optional 1.0
Non-adaptive mutation flag non_adaptive none Optional Adaptive mutation

The crossover_type controls what approach is employed for combining parent genetic information to create offspring, and the crossover_rate specifies the probability of a crossover operation being performed to generate a new offspring. The Coliny EA method supports three forms of crossover (two_point, blend, and uniform), which generate a new individual through combinations of two parent individuals. Two-point crossover divides each parent into three regions, where offspring are created from the combination of the middle region from one parent and the end regions from the other parent. Since the Coliny EA does not utilize bit representations of variable values, the crossover points only occur on coordinate boundaries, never within the bits of a particular coordinate. Uniform crossover creates offspring through random combination of coordinates from the two parents. Blend crossover generates a new individual randomly along the multidimensional vector connecting the two parents.

The mutation_type controls what approach is employed in randomly modifying continuous design variables within the EA population. Each of the mutation methods generates coordinate-wise changes to individuals, usually by adding a random variable to a given coordinate value (an "offset" mutation), but also by replacing a given coordinate value with a random variable (a "replace" mutation). Discrete design variables are always mutated using the offset_uniform method. The mutation_rate controls the probability of mutation being performed on an individual, both for new individuals generated by crossover (if crossover occurs) and for individuals from the existing population. When mutation is performed, all dimensions of each individual are mutated. The mutation_scale specifies a scale factor which scales continuous mutation offsets; this is a fraction of the total range of each dimension, so mutation_scale is a relative value between 0 and 1. The mutation_range is used to control offset_uniform mutation used for discrete parameters. The replacement discrete value is the original value plus or minus an integer value up to mutation_range. The offset_normal, offset_cauchy, and offset_uniform mutation types are "offset" mutations in that they add a 0-mean random variable with a normal, cauchy, or uniform distribution, respectively, to the existing coordinate value. These offsets are limited in magnitude by mutation_scale. The replace_uniform mutation type is not limited by mutation_scale; rather it generates a replacement value for a coordinate using a uniformly distributed value over the total range for that coordinate.

The Coliny EA method uses self-adaptive mutation, which modifies the mutation scale dynamically. This mechanism is borrowed from EAs like evolution strategies. The non_adaptive flag can be used to deactivate the self-adaptation, which may facilitate a more global search.
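
Putting these controls together, a hypothetical coliny_ea specification might look like the following sketch (all values are illustrative only):

method,
	coliny_ea
	  seed = 12345
	  population_size = 100
	  initialization_type unique_random
	  fitness_type linear_rank
	  replacement_type elitist = 10
	  new_solutions_generated = 90
	  crossover_type two_point
	  crossover_rate = 0.8
	  mutation_type offset_normal
	  mutation_scale = 0.1
	  max_function_evaluations = 2000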

Pattern Search

Pattern search techniques are nongradient-based optimization methods which use a set of offsets from the current iterate to locate improved points in the design space. The Coliny pattern search technique is invoked using a coliny_pattern_search group specification, which includes a variety of specification components.

Traditional pattern search methods search with a fixed pattern of search directions to try to find improvements to the current iterate. The Coliny pattern search methods generalize this simple algorithmic strategy to enable control of how the search pattern is adapted, as well as how each search pattern is evaluated. The stochastic and synchronization specifications denote how the trial points are evaluated. The stochastic specification indicates that the trial points are considered in a random order. For parallel pattern search, synchronization dictates whether the evaluations are scheduled using a blocking scheduler or a nonblocking scheduler (i.e., Model::synchronize() or Model::synchronize_nowait(), respectively). In the blocking case, all points in the pattern are evaluated (in parallel), and if the best of these trial points is an improving point, then it becomes the next iterate. These runs are reproducible, assuming use of the same seed in the stochastic case. In the nonblocking case, all points in the pattern may not be evaluated, since the first improving point found becomes the next iterate. Since the algorithm steps will be subject to parallel timing variabilities, these runs will not generally be repeatable. The synchronization specification has similar connotations for sequential pattern search. If blocking is specified, then each sequential iteration terminates after all trial points have been considered, and if nonblocking is specified, then each sequential iteration terminates after the first improving trial point is evaluated.

The particular form of the search pattern is controlled by the pattern_basis specification. If pattern_basis is coordinate, then the pattern search uses a plus and minus offset in each coordinate direction, for a total of 2n function evaluations in the pattern. This case is depicted in Figure 5.3 for three coordinate dimensions.

pattern_search.jpg

Figure 5.3 Depiction of coordinate pattern search algorithm

If pattern_basis is simplex, then pattern search uses a minimal positive basis simplex for the parameter space, for a total of n+1 function evaluations in the pattern. Note that the simplex pattern basis can be used for unbounded problems only. The total_pattern_size specification can be used to augment the basic coordinate and simplex patterns with additional function evaluations, and is particularly useful for parallel load balancing. For example, if some function evaluations in the pattern are dropped due to duplication or bound constraint interaction, then the total_pattern_size specification instructs the algorithm to generate new offsets to bring the total number of evaluations up to this consistent total.

The exploratory_moves specification controls how the search pattern is adapted. (The search pattern can be adapted after an improving trial point is found, or after all trial points in a search pattern have been found to be unimproving points.) The following exploratory moves selections are supported by Coliny: basic_pattern, multi_step, and adaptive_pattern (see Table 5.15).

The initial_delta and threshold_delta specifications provide the initial offset size and the threshold size at which to terminate the algorithm. For any dimension that has both upper and lower bounds, this step length will be internally rescaled to provide search steps of length initial_delta * range. This rescaling does not occur for other dimensions, so search steps in those directions have length initial_delta.

In general, pattern search methods can expand and contract their step lengths. Coliny pattern search methods contract the step length by the value contraction_factor, and they expand the step length by the value (1/contraction_factor). The expand_after_success control specifies how many successful objective function improvements must occur with a specific step length prior to expansion of the step length, whereas the no_expansion flag instructs the algorithm to forgo pattern expansion altogether.

Finally, constraint infeasibility can be managed in a somewhat more sophisticated manner than the simple weighted penalty function. If the constant_penalty specification is used, then the simple weighted penalty scheme described above is used. Otherwise, the constraint penalty is adapted to the value constraint_penalty/L, where L is the smallest step length used so far.

Table 5.14 and Table 5.15 provide the specification detail for the Coliny pattern search method and its method dependent controls.

Table 5.14 Specification detail for the Coliny pattern search method: randomization, delta, and constraint controls
Description Keyword Associated Data Status Default
Coliny pattern search method coliny_pattern_search none Required group N/A
Stochastic pattern search stochastic none Optional group N/A
Random seed for stochastic pattern search seed integer Optional randomly generated seed
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A
Constraint penalty constraint_penalty real Optional 1.0
Control of dynamic penalty constant_penalty none Optional algorithm dynamically adapts the constraint penalty

Table 5.15 Specification detail for the Coliny pattern search method: pattern controls
Description Keyword Associated Data Status Default
Pattern basis selection pattern_basis coordinate | simplex Optional coordinate
Total number of points in pattern total_pattern_size integer Optional no augmentation of basic pattern
No expansion flag no_expansion none Optional algorithm may expand pattern size
Number of consecutive improvements before expansion expand_after_success integer Optional 1
Pattern contraction factor contraction_factor real Optional 0.5
Evaluation synchronization synchronization blocking | nonblocking Optional nonblocking
Exploratory moves selection exploratory_moves basic_pattern | multi_step | adaptive_pattern Optional basic_pattern
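
For example, a stochastic Coliny pattern search with a coordinate pattern and blocking synchronization could be specified as in the following sketch (all values are illustrative):

method,
	coliny_pattern_search
	  stochastic
	  seed = 12345
	  initial_delta = 0.5
	  threshold_delta = 1.e-4
	  pattern_basis coordinate
	  expand_after_success = 2
	  contraction_factor = 0.5
	  synchronization blocking
	  exploratory_moves basic_pattern
	  max_function_evaluations = 500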

Solis-Wets

DAKOTA's implementation of Coliny also contains the Solis-Wets algorithm. The Solis-Wets method is a simple greedy local search heuristic for continuous parameter spaces. Solis-Wets generates trial points using a multivariate normal distribution, and unsuccessful trial points are reflected about the current point to find a descent direction. This algorithm is inherently serial and will not utilize any parallelism. Table 5.16 provides the specification detail for this method and its method dependent controls.

Table 5.16 Specification detail for the Coliny Solis-Wets method
Description Keyword Associated Data Status Default
Coliny Solis-Wets method coliny_solis_wets none Required group N/A
Random seed for stochastic pattern search seed integer Optional randomly generated seed
Initial offset value initial_delta real Required N/A
Threshold for offset values threshold_delta real Required N/A
No expansion flag no_expansion none Optional algorithm may expand pattern size
Number of consecutive improvements before expansion expand_after_success integer Optional 5
Number of consecutive failures before contraction contract_after_failure integer Optional 3
Pattern contraction factor contraction_factor real Optional 0.5
Constraint penalty constraint_penalty real Optional 1.0
Control of dynamic penalty constant_penalty none Optional algorithm dynamically adapts the constraint penalty

These specifications have the same meaning as corresponding specifications for coliny_pattern_search. In particular, coliny_solis_wets supports dynamic rescaling of the step length, and dynamic rescaling of the constraint penalty. The only new specification is contract_after_failure, which specifies the number of unsuccessful cycles which must occur with a specific delta prior to contraction of the delta.
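
A corresponding coliny_solis_wets sketch, reusing the shared controls and adding contract_after_failure, might look like the following (all values are illustrative):

method,
	coliny_solis_wets
	  seed = 12345
	  initial_delta = 0.5
	  threshold_delta = 1.e-4
	  expand_after_success = 5
	  contract_after_failure = 3
	  contraction_factor = 0.5
	  max_function_evaluations = 500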

NCSU Methods

North Carolina State University (NCSU) has an implementation of the DIRECT algorithm (the DIviding RECTangles algorithm outlined in the Coliny method section above). This version is documented in [Gablonsky, 2001]. We have found that the NCSU DIRECT implementation works better and is more robust for some problems than coliny_direct. Currently, we maintain both versions of DIRECT in DAKOTA; in the future, we may deprecate one. The NCSU DIRECT method is selected with ncsu_direct. We have tried to maintain consistency between the keywords in the COLINY and NCSU implementations of DIRECT, but the algorithms have different parameters, so the keywords sometimes have slightly different meanings.

NCSU method independent controls

The method independent controls for max_iterations and max_function_evaluations limit the number of iterations and the number of function evaluations that can be performed during an NCSU DIRECT optimization. This method will always strictly respect the iteration limit, but may slightly exceed the function evaluation limit, as it always explores all sub-rectangles at the current level.

NCSU method dependent controls

There are four specification controls affecting NCSU DIRECT: solution_target, convergence_tolerance, min_boxsize_limit, and vol_boxsize_limit. The solution_target specifies a goal toward which the optimizer should track. When solution_target is specified, convergence_tolerance specifies a percent error on the optimization. This is used for test problems, where the true global minimum fglobal is known; in that case, solution_target is set to fglobal. The optimization then terminates when 100*(fmin - fglobal)/max(1, abs(fglobal)) < convergence_tolerance. The default for fglobal is -1.0e100 and the default for convergence_tolerance is as given above.

min_boxsize_limit is a setting that terminates the optimization when the measure of a hyperrectangle S with f(c(S)) = fmin is less than min_boxsize_limit. vol_boxsize_limit is a setting that terminates the optimization when the volume of a hyperrectangle S with f(c(S)) = fmin is less than vol_boxsize_limit percent of the volume of the original hyperrectangle. In other words, vol_boxsize_limit stops the optimization when the volume of the particular rectangle which has fmin is less than a certain percentage of the whole volume, whereas min_boxsize_limit uses an arbitrary measure to stop the optimization. The keywords for NCSU DIRECT are described in Table 5.17 below.

Table 5.17 Specification detail for the NCSU DIRECT method
Description Keyword Associated Data Status Default
Solution Target solution_target real Optional 0
Min boxsize limit min_boxsize_limit real in [0,1] Optional 1.0e-4
Volume boxsize limit vol_boxsize_limit real in [0,1] Optional 1.0e-6
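
As a sketch, an ncsu_direct study that relies on box-size termination rather than a solution target might be specified as follows (all values are illustrative):

method,
	ncsu_direct
	  max_iterations = 100
	  max_function_evaluations = 1000
	  min_boxsize_limit = 1.e-4
	  vol_boxsize_limit = 1.e-6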

JEGA Methods

The JEGA library [Eddy and Lewis, 2001] contains two global optimization methods. The first is a Multi-objective Genetic Algorithm (MOGA) which performs Pareto optimization. The second is a Single-objective Genetic Algorithm (SOGA) which performs optimization on a single objective function. Both methods support general constraints and a mixture of real and discrete variables. The JEGA library was written by John Eddy, currently a member of the technical staff in the System Readiness and Sustainment Technologies department at Sandia National Laboratories in Albuquerque. These algorithms are accessed as moga and soga within DAKOTA. DAKOTA provides access to the JEGA library through the JEGAOptimizer class.

JEGA method independent controls

JEGA utilizes the max_iterations and max_function_evaluations method independent controls to provide integer limits for the maximum number of generations and function evaluations, respectively. Note that currently, the DAKOTA default for max_iterations is 100 and for max_function_evaluations is 1000. These are the default settings that will be used to "stop" the JEGA algorithms, unless some specific convergence criteria are set (see Tables 5.20 and 5.21 below).

Beginning with v2.0, JEGA also utilizes the output method independent control to vary the amount of information presented to the user during execution.

JEGA method dependent controls

The JEGA library currently provides two types of genetic algorithms (GAs): a multi-objective genetic algorithm (moga) and a single-objective genetic algorithm (soga). Both of these GAs can take real-valued inputs, integer-valued inputs, or a mixture of real and integer-valued inputs. "Real-valued" and "integer-valued" refer to the use of continuous or discrete variable domains, respectively (the response data are real-valued in all cases).

The basic steps of the genetic algorithm are as follows:

  1. Initialize the population (by randomly generating population members with or without duplicates allowed, or by flat-file initialization)

  2. Evaluate the initial population members (calculate the values of the objective function(s) and constraints for each population member)

  3. Perform crossover (several crossover types are available)

  4. Perform mutation (several mutation types are available)

  5. Evaluate the new population members.

  6. Assess the fitness of each member in the population. There are a number of ways to evaluate the fitness of the members of the population. Choice of fitness assessor operators is strongly related to the type of replacement algorithm being used and can have a profound effect on the solutions selected for the next generation. For example, if using MOGA, the available assessors are the layer_rank and domination_count fitness assessors. If using either of these, it is strongly recommended that you use the replacement_type called the below_limit selector as well (although the roulette wheel selectors can also be used). The functionality of the domination_count selector of JEGA v1.0 can now be achieved using the domination_count fitness assessor and below_limit replacement selector together. If using SOGA, there are a number of possible combinations of fitness assessors and selectors.

  7. Replace the population with members selected to continue in the next generation. The pool of potential members is the current population and the current set of offspring. The replacement_type of roulette_wheel or unique_roulette_wheel may be used with either MOGA or SOGA problems; however, they are not recommended for use with MOGA. Given that the only two fitness assessors for MOGA are the layer_rank and domination_count, the recommended selector is the below_limit selector. The below_limit replacement will only keep designs that are dominated by fewer than a limiting number of other designs. The replacement_type of favor_feasible is specific to a SOGA. This replacement operator will always prefer a more feasible design to a less feasible one. Beyond that, it favors solutions based on an assigned fitness value, which must have been installed by the weighted_sum_only fitness assessor (see the discussion below).

  8. Apply niche pressure to the population. This step is specific to the MOGA and is new as of JEGA v2.0. Technically, the step is carried out during runs of the SOGA but only the null_niching operator is available for use with SOGA. In MOGA, the radial or distance operators can be used. The purpose of niching is to encourage differentiation along the Pareto frontier and thus a more even and uniform sampling. The radial nicher takes information input from the user to compute a minimum allowable distance between designs in the performance space and acts as a secondary selection operator whereby it enforces this minimum distance. The distance nicher requires that solutions must be separated from other solutions by a minimum distance in each dimension (vs. Euclidean distance for the radial niching). After niching is complete, all designs in the population will be at least the minimum distance from one another in all directions.

  9. Test for convergence. There are two aspects to convergence that must be considered. The first is stopping criteria: a stopping criterion dictates some sort of limit on the algorithm that is independent of its performance. Examples of stopping criteria available for use with JEGA are the max_iterations and max_function_evaluations inputs. All JEGA convergers respect these stopping criteria in addition to anything else that they do.

    The second aspect to convergence involves repeated assessment of the algorithm's progress in solving the problem. In JEGA v1.0, the SOGA fitness tracker convergers (best_fitness_tracker and average_fitness_tracker) performed this function by asserting that the fitness values (either best or average) of the population continue to improve. There was no such operator for the MOGA. As of JEGA v2.0, the same fitness tracker convergers exist for use with SOGA and there is now a converger available for use with the MOGA. The MOGA converger (metric_tracker) operates by tracking various changes in the non-dominated frontier from generation to generation. When the changes occurring over a user-specified number of generations fall below a user-specified threshold, the algorithm stops.

  10. Perform post-processing. This step is new as of JEGA v2.1. The purpose of this operation is to perform any data manipulations on the final solution deemed necessary. Currently, the distance_postprocessor is the only one other than the null_postprocessor. The distance_postprocessor is specifically for use with the MOGA and reduces the final solution set size such that a minimum distance in each direction exists between any two designs.

There are many controls which can be used for both MOGA and SOGA methods. These include among others the random seed, initialization types, crossover and mutation types, and some replacement types. These are described in Tables 5.18 and 5.19 below.

The seed control defines the starting seed for the random number generator. The algorithm uses random numbers heavily but a specification of a random seed will cause the algorithm to run identically from one trial to the next so long as all other input specifications remain the same. New as of JEGA v2.0 is the introduction of the log_file specification. JEGA now uses a logging library to output messages and status to the user. JEGA can be configured at build time to log to both the console window and a text file, one or the other, or neither. The log_file input is a string name of a file into which to log. If the build was configured without file logging in JEGA, this input is ignored. If file logging is enabled and no log_file is specified, the default file name of JEGAGlobal.log is used. Also new to JEGA v2.0 is the introduction of the print_each_pop specification. It serves as a flag and if supplied, the population at each generation will be printed to a file named "population<GEN#>.dat" where <GEN#> is the number of the current generation.

The initialization_type defines the type of initialization for the GA. There are three types: simple_random, unique_random, and flat_file. simple_random creates initial solutions with random variable values according to a uniform random number distribution. It gives no consideration to any previously generated designs. The number of designs is specified by the population_size. unique_random is the same as simple_random, except that when a new solution is generated, it is checked against the rest of the solutions. If it duplicates any of them, it is rejected. flat_file allows the initial population to be read from a flat file. If flat_file is specified, a file name must be given. Variables can be delimited in the flat file in any way you see fit with a few exceptions. The delimiter must be the same on any given line of input with the exception of leading and trailing whitespace. So a line could look like: 1.1, 2.2 ,3.3 for example but could not look like: 1.1, 2.2 3.3. The delimiter can vary from line to line within the file which can be useful if data from multiple sources is pasted into the same input file. The delimiter can be any string that does not contain any of the characters .+-dDeE or any of the digits 0-9. The input will be read until the end of the file. The algorithm will discard any configurations for which it was unable to retrieve at least the number of design variables. The objective and constraint entries are not required but if ALL are present, they will be recorded and the design will be tagged as evaluated so that evaluators may choose not to re-evaluate them. Setting the size for this initializer has the effect of requiring a minimum number of designs to create. If this minimum number has not been created once the files are all read, the rest are created using the unique_random initializer and then the simple_random initializer if necessary.

Note that the population_size only sets the size of the initial population. The population size may vary in the JEGA methods according to the type of operators chosen for a particular optimization run.

There are many crossover types available. multi_point_binary crossover requires an integer number, N, of crossover points. This crossover type performs a bit switching crossover at N crossover points in the binary encoded genome of two designs. Thus, crossover may occur at any point along a solution chromosome (in the middle of a gene representing a design variable, for example). multi_point_parameterized_binary crossover is similar in that it performs a bit switching crossover routine at N crossover points. However, this crossover type performs crossover on each design variable individually. So the individual chromosomes are crossed at N locations. multi_point_real crossover performs a variable switching crossover routine at N crossover points in the real-valued genome of two designs. In this scheme, crossover only occurs between design variables (chromosomes). Note that the standard solution chromosome representation in the JEGA algorithm is real encoded and can handle integer or real design variables. For any crossover types that use a binary representation, real variables are converted to long integers by multiplying the real number by 10^6 and then truncating. Note that this assumes a precision of only six decimal places. Discrete variables are represented as integers (indices within a list of possible values) within the algorithm and thus require no special treatment by the binary operators.

The final crossover type is shuffle_random. This crossover type performs crossover by choosing design variables at random from a specified number of parents enough times that the requested number of children are produced. For example, consider the case of 3 parents producing 2 children. This operator would go through and for each design variable, select one of the parents as the donor for the child. So it creates a random shuffle of the parent design variable values. The relative numbers of children and parents are controllable to allow for as much mixing as desired. The more parents involved, the less likely that the children will wind up exact duplicates of the parents.

All crossover types take a crossover_rate. The crossover rate is used to calculate the number of crossover operations that take place. The number of crossovers is equal to the rate * population_size.

There are five mutation types allowed. replace_uniform introduces random variation by first randomly choosing a design variable of a randomly selected design and reassigning it to a random valid value for that variable. No consideration of the current value is given when determining the new value. All mutation types have a mutation_rate. The number of mutations for the replace_uniform mutator is the product of the mutation_rate and the population_size.

The bit_random mutator introduces random variation by first converting a randomly chosen variable of a randomly chosen design into a binary string. It then flips a randomly chosen bit in the string from a 1 to a 0 or vice versa. In this mutation scheme, the resulting value is likely to be similar to the original value. The number of mutations performed is the product of the mutation_rate, the number of design variables, and the population size.

The offset mutators all act by adding an "offset" random amount to a variable value. The random amount has a mean of zero in all cases. The offset_normal mutator introduces random variation by adding a Gaussian random amount to a variable value. The random amount has a standard deviation dependent on the mutation_scale. The mutation_scale is a fraction in the range [0, 1] and is meant to help control the amount of variation that takes place when a variable is mutated. mutation_scale is multiplied by the range of the variable being mutated to serve as the standard deviation. offset_cauchy is similar to offset_normal, except that a Cauchy random variable is added to the variable being mutated. The mutation_scale also defines the standard deviation for this mutator. Finally, offset_uniform adds a uniform random amount to the variable value. For the offset_uniform mutator, the mutation_scale is interpreted as a fraction of the total range of the variable. The range of possible deviation amounts is +/- 1/2 * (mutation_scale * variable range). The number of mutations for all offset mutators is defined as the product of mutation_rate and population_size.

As of JEGA v2.0, all replacement types are common to both MOGA and SOGA. They include the roulette_wheel, unique_roulette_wheel, elitist, and below_limit selectors. In roulette_wheel replacement, each design is conceptually allotted a portion of a wheel proportional to its fitness relative to the fitnesses of the other designs. Then, portions of the wheel are chosen at random and the designs occupying those portions are duplicated into the next population. Those designs allotted larger portions of the wheel are more likely to be selected (potentially many times). unique_roulette_wheel replacement is the same as roulette_wheel replacement, with the exception that a design may only be selected once. The below_limit selector attempts to keep all designs for which the negated fitness is below a certain limit. The values are negated to keep with the convention that higher fitness is better. The inputs to the below_limit selector are the limit as a real value, and a shrinkage_percentage as a real value. The shrinkage_percentage defines the minimum number of selections that will take place if enough designs are available. It is interpreted as a percentage of the population size that must go on to the subsequent generation. To enforce this, below_limit makes all the selections it would make anyway and, if that is not enough, it takes the remainder that it needs from the best of what is left (effectively raising its limit as far as it must to get the minimum number of selections). It continues until it has made enough selections. The shrinkage_percentage is designed to prevent extreme decreases in the population size at any given generation, and thus prevent a big loss of genetic diversity in a very short time. Without a shrinkage limit, a small group of "super" designs may appear and quickly cull the population down to a size on the order of the limiting value. In this case, all the diversity of the population is lost and it is expensive to re-diversify and spread the population. The elitist selector simply chooses the required number of designs taking the most fit. For example, if 100 selections are requested, then the top 100 designs as ranked by fitness will be selected and the remaining will be discarded.

Table 5.18 Specification detail for JEGA method dependent controls: seed, output, initialization, mutation, and replacement
Description Keyword Associated Data Status Default
GA Method moga | soga none Required group N/A
Random seed seed integer Optional Time based seed
Log file log_file string Optional JEGAGlobal.log
Number of population members population_size integer Optional 50
Population output print_each_pop none Optional No printing
Output verbosity output silent | quiet | verbose | debug Optional normal
Initialization type initialization_type simple_random | unique_random | flat_file Optional unique_random
Mutation type mutation_type replace_uniform | bit_random | offset_cauchy | offset_uniform | offset_normal Optional group replace_uniform
Mutation scale mutation_scale real Optional 0.15
Mutation rate mutation_rate real Optional 0.08
Replacement type replacement_type below_limit | roulette_wheel | unique_roulette_wheel | elitist Required group None
Below limit selection below_limit real Optional 6
Shrinkage percentage in below limit selection shrinkage_percentage real Optional 0.9

Table 5.19 Specification detail for JEGA method dependent controls: crossover
Description Keyword Associated Data Status Default
Crossover type crossover_type multi_point_binary | multi_point_parameterized_binary | multi_point_real | shuffle_random Optional group shuffle_random
Multi point binary crossover multi_point_binary integer Required N/A
Multi point parameterized binary crossover multi_point_parameterized_binary integer Required N/A
Multi point real crossover multi_point_real integer Required N/A
Random shuffle crossover shuffle_random num_parents, num_offspring Required N/A
Number of parents in random shuffle crossover num_parents integer Optional 2
Number of offspring in random shuffle crossover num_offspring integer Optional 2
Crossover rate crossover_rate real Optional (applies to all crossover types) 0.8
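
Drawing on Tables 5.18 and 5.19, the following sketch shows how the seed, initialization, crossover, mutation, and replacement controls might be combined in a JEGA specification (a moga is used here; all values are illustrative only):

method,
	moga
	  seed = 10983
	  population_size = 50
	  print_each_pop
	  initialization_type unique_random
	  crossover_type shuffle_random
	    num_parents = 3
	    num_offspring = 2
	    crossover_rate = 0.8
	  mutation_type offset_normal
	    mutation_scale = 0.15
	    mutation_rate = 0.08
	  fitness_type domination_count
	  replacement_type below_limit = 6
	    shrinkage_percentage = 0.9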

Multi-objective Evolutionary Algorithms

The specifications for controls specific to multi-objective evolutionary algorithms are described here. These controls are appropriate when the user has specified moga as the method.

The initialization, crossover, and mutation controls were all described in the preceding section. There are no MOGA specific aspects to these controls. The fitness_type for a MOGA may be domination_count or layer_rank. Both have been specifically designed to avoid problems with aggregating and scaling objective function values and transforming them into a single objective. Instead, the domination_count fitness assessor works by ordering population members by the negative of the number of designs that dominate them. The values are negated in keeping with the convention that higher fitness is better. The layer_rank fitness assessor works by assigning all non-dominated designs a layer of 0, then, from what remains, assigning all the non-dominated designs a layer of 1, and so on until all designs have been assigned a layer. Again, the values are negated for the higher-is-better fitness convention. Use of the below_limit selector with the domination_count fitness assessor has the effect of keeping all designs that are dominated by fewer than a limiting number of other designs, subject to the shrinkage limit. Using it with the layer_rank fitness assessor has the effect of keeping all those designs whose layer is below a certain threshold, again subject to the shrinkage limit.

New as of JEGA v2.0 is the introduction of niche pressure operators. These operators are meant primarily for use with the moga. The job of a niche pressure operator is to encourage diversity along the Pareto frontier as the algorithm runs. This is typically accomplished by discouraging clustering of design points in the performance space. In JEGA, the application of niche pressure occurs as a secondary selection operation. The nicher is given a chance to perform a pre-selection operation prior to the operation of the selection (replacement) operator, and is then called to perform niching on the set of designs that were selected by the selection operator.

Currently, the only niche pressure operators available are the radial nicher and the distance nicher. The radial niche pressure applicator works by enforcing a minimum Euclidean distance between designs in the performance space at each generation. The algorithm proceeds by starting at the (or one of the) extreme designs along objective dimension 0 and marching through the population removing all designs that are too close to the current design. One exception to the rule is that the algorithm will never remove an extreme design, which is defined as a design that is maximal or minimal in all but 1 objective dimension (for a classical 2 objective problem, the extreme designs are those at the tips of the non-dominated frontier). The distance nicher enforces a minimum distance in each dimension.

The designs that are removed by the nicher are not discarded. They are buffered and re-inserted into the population during the next pre-selection operation. This way, the selector is still the only operator that discards designs and the algorithm will not waste time "re-filling" gaps created by the nicher.

The radial nicher requires as input a vector of fractions with length equal to the number of objectives. The elements of the vector are interpreted as percentages of the non-dominated range for each objective defining a minimum distance to all other designs. All values should be in the range (0, 1). The minimum allowable distance between any two designs in the performance space is the Euclidean (simple square-root-sum-of-squares calculation) distance defined by these percentages. The distance nicher has a similar input vector requirement, only the distance is the minimum distance in each dimension.

Also new as of JEGA v2.0 is the introduction of the MOGA specific metric_tracker converger. This converger is conceptually similar to the best and average fitness tracker convergers in that it tracks the progress of the population over a certain number of generations and stops when the progress falls below a certain threshold. The implementation is quite different however. The metric_tracker converger tracks 3 metrics specific to the non-dominated frontier from generation to generation. All 3 of these metrics are computed as percent changes between the generations. In order to compute these metrics, the converger stores a duplicate of the non-dominated frontier at each generation for comparison to the non-dominated frontier of the next generation.

The first metric is one that indicates how the expanse of the frontier is changing. The expanse along a given objective is defined by the range of values existing within the non-dominated set. The expansion metric is computed by tracking the extremes of the non-dominated frontier from one generation to the next. Any movement of the extreme values is noticed and the maximum percentage movement is computed as:

    Em = max over j of abs((range(j, i) - range(j, i-1)) / range(j, i-1))  j=1,nof
where Em is the max expansion metric, j is the objective function index, i is the current generation number, and nof is the total number of objectives. The range is the difference between the largest value along an objective and the smallest when considering only non-dominated designs.

The second metric monitors changes in the density of the non-dominated set. The density metric is computed as the number of non-dominated points divided by the hypervolume of the non-dominated region of space. Therefore, changes in the density can be caused by changes in the number of non-dominated points or by changes in size of the non-dominated space or both. The size of the non-dominated space is computed as:

    Vps(i) = product over j of range(j, i)   j=1,nof
where Vps(i) is the hypervolume of the non-dominated space at generation i and all other terms have the same meanings as above.

The density of a given non-dominated space is then:

    Dps(i) = Pct(i) / Vps(i)
where Pct(i) is the number of points on the non-dominated frontier at generation i.

The percentage increase in density of the frontier is then calculated as

    Cd = abs((Dps(i) - Dps(i-1)) / Dps(i-1))
where Cd is the change in density metric.

The final metric is one that monitors the "goodness" of the non-dominated frontier. This metric is computed by considering each design in the previous population and determining if it is dominated by any designs in the current population. All that are determined to be dominated are counted. The metric is the ratio of the number that are dominated to the total number that exist in the previous population.

As mentioned above, each of these metrics is a percentage. The tracker records the largest of these three at each generation. Once the recorded percentage is below the supplied percent change for the supplied number of generations consecutively, the algorithm is converged.

The specification for convergence in a moga can either be metric_tracker or can be omitted altogether. If omitted, no convergence algorithm will be used and the algorithm will rely on stopping criteria only. If metric_tracker is specified, then a percent_change and num_generations must be supplied, as with the other tracker-type convergers (the average and best fitness trackers). The percent_change is the threshold beneath which convergence is attained; it is compared to the metric value computed as described above. The num_generations is the number of generations over which the metric value should be tracked. Convergence will be attained if the recorded metric is below percent_change for num_generations consecutive generations.

The MOGA specific controls are described in Table 5.20 below. Note that MOGA and SOGA create additional output files during execution. "finaldata.dat" is a file that holds the final set of Pareto optimal solutions after any post-processing is complete. "discards.dat" holds solutions that were discarded from the population during the course of evolution. It can often be useful to plot objective function values from these files to visually see the Pareto front and ensure that finaldata.dat solutions dominate discards.dat solutions. The solutions are written to these output files in the format "Input1...InputN..Output1...OutputM". If MOGA is used in a hybrid optimization strategy (which requires one optimal solution from each individual optimization method to be passed to the subsequent optimization method as its starting point), the solution in the Pareto set closest to the "utopia" point is given as the best solution. This solution is also reported in the DAKOTA output. This "best" solution in the Pareto set has minimum distance from the utopia point. The utopia point is defined as the point of extreme (best) values for each objective function. For example, if the Pareto front is bounded by (1,100) and (90,2), then (1,2) is the utopia point. There will be a point in the Pareto set that has minimum L2-norm distance to this point, for example (10,10) may be such a point. In SOGA, the solution that minimizes the single objective function is returned as the best solution. If moga is used in a strategy which may require passing multiple solutions to the next level (such as the surrogate_based_global method or hybrid strategy), the orthogonal_distance postprocessor type may be used to specify the distances between each solution value to winnow down the solutions in the full Pareto front to a subset which will be passed to the next iteration.

Table 5.20 Specification detail for MOGA method controls
Description Keyword Associated Data Status Default
Fitness type fitness_type layer_rank | domination_count Required group None
Niche pressure type niching_type radial | distance Optional group No niche pressure
Niching distance radial | distance list of real Optional 0.01 for all objectives
Convergence type metric_tracker none Optional group Stopping criteria only
Percent change limit for metric_tracker converger percent_change real Optional 0.1
Number generations for metric_tracker converger num_generations integer Optional 10
Post_processor type postprocessor_type orthogonal_distance Optional No post-processing of solutions
Post_processor distance orthogonal_distance list of real Optional 0.01 for all objectives
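
A moga specification exercising the MOGA-specific controls of Table 5.20 might look like the following sketch; the niching distances, thresholds, and generation counts are illustrative, and the layout follows the keyword structure in Tables 5.18 and 5.20:

method,
	moga
	  seed = 10983
	  fitness_type domination_count
	  replacement_type below_limit = 6
	    shrinkage_percentage = 0.9
	  niching_type radial = 0.02 0.05
	  metric_tracker
	    percent_change = 0.05
	    num_generations = 10
	  postprocessor_type orthogonal_distance = 0.01 0.01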

Single-objective Evolutionary Algorithms

The specifications for controls specific to single-objective evolutionary algorithms are described here. These controls are appropriate when the user has specified soga as the method.

The initialization, crossover, and mutation controls were all described above. There are no SOGA specific aspects to these controls. The replacement_type for a SOGA may be roulette_wheel, unique_roulette_wheel, elitist, or favor_feasible. The favor_feasible replacement type first considers feasibility as a selection criterion. If that does not produce a "winner", then it moves on to considering fitness value. Because of this, any fitness assessor used with the favor_feasible selector must only account for objectives in the creation of fitness. Therefore, there is such a fitness assessor, and its use is enforced when the favor_feasible selector is chosen. In that case, and if the output level is set high enough, a message will be presented indicating that the weighted_sum_only fitness assessor will be used. As of JEGA v2.0 and beyond, the fitness assessment operator must be specified with SOGA, although the merit_function is currently the only one (note that the weighted_sum_only assessor exists but cannot be selected). The roulette wheel selectors no longer assume a fitness function. The merit_function fitness assessor uses an exterior penalty function formulation to penalize infeasible designs. The specification allows the input of a constraint_penalty, which is the multiplier to use on the constraint violations.

The SOGA controls allow two additional convergence types. The convergence_type called average_fitness_tracker keeps track of the average fitness in a population. If this average fitness does not change more than percent_change over some number of generations, num_generations, then the solution is reported as converged and the algorithm terminates. The best_fitness_tracker works in a similar manner, only it tracks the best fitness in the population. Convergence occurs after num_generations has passed and there has been less than percent_change in the best fitness value. The percent change can be as low as 0% in which case there must be no change at all over the number of generations. Both also respect the stopping criteria.

The SOGA specific controls are described in Table 5.21 below.

Table 5.21 Specification detail for SOGA method controls
Description Keyword Associated Data Status Default
Fitness type fitness_type merit_function Optional group merit_function
Constraint penalty in merit function constraint_penalty real Optional 1.0
Replacement type replacement_type favor_feasible | unique_roulette_wheel | roulette_wheel Required group None
Convergence type convergence_type best_fitness_tracker | average_fitness_tracker Optional None
Number of generations (for convergence test) num_generations integer Optional 10
Percent change in fitness percent_change real Optional 0.1
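
For example, a soga specification using the merit_function fitness assessor and an average fitness tracker converger might be sketched as follows (all values are illustrative only):

method,
	soga
	  seed = 11011
	  population_size = 50
	  fitness_type merit_function
	    constraint_penalty = 1.0
	  replacement_type roulette_wheel
	  convergence_type average_fitness_tracker
	    percent_change = 0.05
	    num_generations = 20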

Least Squares Methods

DAKOTA's least squares branch currently contains three methods for solving nonlinear least squares problems: NL2SOL, a trust-region method that adaptively chooses between two Hessian approximations (Gauss-Newton and Gauss-Newton plus a quasi-Newton approximation to the rest of the Hessian), NLSSOL, a sequential quadratic programming (SQP) approach that is from the same algorithm family as NPSOL, and Gauss-Newton, which supplies the Gauss-Newton Hessian approximation to the full-Newton optimizers from OPT++.

The important difference of these algorithms from general-purpose optimization methods is that the response set is defined by least squares terms, rather than an objective function. Thus, a finer granularity of data is used by least squares solvers as compared to that used by optimizers. This allows the exploitation of the special structure provided by a sum of squares objective function. Refer to Least squares terms and constraint functions (least squares data set) for additional information on the least squares response data set.

NL2SOL Method

NL2SOL is available as nl2sol and addresses unconstrained and bound-constrained problems. It uses a trust-region method (and thus can be viewed as a generalization of the Levenberg-Marquardt algorithm) and adaptively chooses between two Hessian approximations, the Gauss-Newton approximation alone and the Gauss-Newton approximation plus a quasi-Newton approximation to the rest of the Hessian. Even on small-residual problems, the latter Hessian approximation can be useful when the starting guess is far from the solution. On problems that are not over-parameterized (i.e., that do not involve more optimization variables than the data support), NL2SOL usually exhibits fast convergence.

NL2SOL has a variety of internal controls as described in AT&T Bell Labs CS TR 153 (http://cm.bell-labs.com/cm/cs/cstr/153.ps.gz). A number of existing DAKOTA controls (method independent controls and responses controls) are mapped into these NL2SOL internal controls. In particular, DAKOTA's convergence_tolerance, max_iterations, max_function_evaluations, and fd_gradient_step_size are mapped directly into NL2SOL's rfctol, mxiter, mxfcal, and dltfdj controls, respectively. In addition, DAKOTA's fd_hessian_step_size is mapped into both delta0 and dltfdc, and DAKOTA's output verbosity is mapped into NL2SOL's auxprt and outlev (for normal/verbose/debug output, NL2SOL prints initial guess, final solution, solution statistics, nondefault values, and changes to the active bound constraint set on every iteration; for quiet output, NL2SOL prints only the initial guess and final solution; and for silent output, NL2SOL output is suppressed).

Several NL2SOL convergence tolerances are adjusted in response to function_precision, which gives the relative precision to which responses are computed. These tolerances may also be specified explicitly: convergence_tolerance (NL2SOL's rfctol, as mentioned previously) is the relative-function convergence tolerance (on the accuracy desired in the sum-of-squares function); x_conv_tol (NL2SOL's xctol) is the X-convergence tolerance (scaled relative accuracy of the solution variables); absolute_conv_tol (NL2SOL's afctol) is the absolute function convergence tolerance (stop when half the sum of squares is less than absolute_conv_tol, which is mainly of interest on zero-residual test problems); singular_conv_tol (NL2SOL's sctol) is the singular convergence tolerance, which works in conjunction with singular_radius (NL2SOL's lmaxs) to test for underdetermined least-squares problems (stop when the relative reduction yet possible in the sum of squares appears less than singular_conv_tol for steps of scaled length at most singular_radius); false_conv_tol (NL2SOL's xftol) is the false-convergence tolerance (stop with a suspicion of discontinuity when a more favorable stopping test is not satisfied and a step of scaled length at most false_conv_tol is not accepted). Finally, the initial_trust_radius specification (NL2SOL's lmax0) specifies the initial trust region radius for the algorithm.

The internal NL2SOL defaults can be obtained for many of these controls by specifying the value -1. For both the singular_radius and the initial_trust_radius, this results in the internal use of steps of length 1. For other controls, the internal defaults are often functions of machine epsilon (as limited by function_precision). Refer to CS TR 153 for additional details on these formulations.

Whether and how NL2SOL computes and prints a final covariance matrix and regression diagnostics is affected by several keywords. covariance (NL2SOL's covreq) specifies the desired covariance approximation; the default of 0 computes no covariance, and nonzero values request the covariance approximations described for covreq in CS TR 153.

When regression_diagnostics (NL2SOL's rdreq) is specified and a positive-definite final Hessian approximation H is computed, NL2SOL computes and prints a regression diagnostic vector RD such that if omitting the i-th observation would cause alpha times the change in the solution that omitting the j-th observation would cause, then RD[i] = |alpha| RD[j]. The finite-difference step-size tolerance affecting H is fd_hessian_step_size (NL2SOL's delta0 and dltfdc, as mentioned previously).

Table 5.22 provides the specification detail for the NL2SOL method dependent controls.

Table 5.22 Specification detail for NL2SOL method dependent controls.
Description Keyword Associated Data Status Default
Relative precision in least squares terms function_precision real Optional 1e-10
Absolute function convergence tolerance absolute_conv_tol real Optional -1. (use NL2SOL internal default)
Convergence tolerance for change in parameter vector x_conv_tol real Optional -1. (use NL2SOL internal default)
Singular convergence tolerance singular_conv_tol real Optional -1. (use NL2SOL internal default)
Step limit for sctol singular_radius real Optional -1. (use NL2SOL internal default of 1)
False convergence tolerance false_conv_tol real Optional -1. (use NL2SOL internal default)
Initial trust region radius initial_trust_radius real Optional -1. (use NL2SOL internal default of 1)
Covariance post-processing covariance integer Optional 0 (no covariance)
Regression diagnostics post-processing regression_diagnostics none Optional no regression diagnostics
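
To tie these controls together, a hypothetical nl2sol specification might look like the following sketch (the tolerance values are illustrative; -1 requests the corresponding NL2SOL internal default):

method,
	nl2sol
	  max_iterations = 50
	  convergence_tolerance = 1.e-6
	  function_precision = 1.e-10
	  absolute_conv_tol = -1.
	  x_conv_tol = -1.
	  initial_trust_radius = -1.
	  covariance = 1		# nonzero value requests a covariance approximation (see covreq in CS TR 153)
	  regression_diagnostics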

NLSSOL Method

NLSSOL is available as nlssol_sqp and supports unconstrained, bound-constrained, and generally-constrained problems. It exploits the structure of a least squares objective function through the periodic use of Gauss-Newton Hessian approximations to accelerate the SQP algorithm. DAKOTA provides access to the NLSSOL library through the NLSSOLLeastSq class. The method independent and method dependent controls are identical to those of NPSOL as described in NPSOL method independent controls and NPSOL method dependent controls.

Gauss-Newton Method

The Gauss-Newton algorithm is available as optpp_g_newton and supports unconstrained, bound-constrained, and generally-constrained problems. The code for the Gauss-Newton approximation (objective function value, gradient, and approximate Hessian defined from residual function values and gradients) is provided outside of OPT++ within SNLLLeastSq::nlf2_evaluator_gn(). When interfaced with the unconstrained, bound-constrained, and nonlinear interior point full-Newton optimizers from the OPT++ library, it provides a Gauss-Newton least squares capability which -- on zero-residual test problems -- can exhibit quadratic convergence rates near the solution. (Real problems almost never have zero residuals, i.e., perfect fits.)

Mappings for the method independent and dependent controls are the same as for the OPT++ optimization methods and are as described in OPT++ method independent controls and OPT++ method dependent controls. In particular, since OPT++ full-Newton optimizers provide the foundation for Gauss-Newton, the specifications from Table 5.6 are also applicable for optpp_g_newton.

Surrogate-Based Minimization Methods

In surrogate-based optimization (SBO) and surrogate-based nonlinear least squares (SBNLS), minimization occurs using a set of one or more approximations, defined from a surrogate model, that are built and periodically updated using data from a "truth" model. The surrogate model can be a global data fit (e.g., regression or interpolation of data generated from a design of computer experiments), a multipoint approximation, a local Taylor Series expansion, or a model hierarchy approximation (e.g., a low-fidelity simulation model), whereas the truth model involves a high-fidelity simulation model. The goals of surrogate-based methods are to reduce the total number of truth model simulations and, in the case of global data fit surrogates, to smooth noisy data with an easily navigated analytic function.

Surrogate-Based Local Method

In the surrogate-based local (SBL) method, a trust region approach is used to manage the minimization process to maintain acceptable accuracy between the surrogate model and the truth model (by limiting the range over which the surrogate model is trusted). The process involves a sequence of minimizations performed on the surrogate model and bounded by the trust region. At the end of each approximate minimization, the candidate optimum point is validated using the truth model. If sufficient decrease has been obtained in the truth model, the trust region is re-centered around the candidate optimum point and the trust region will either shrink, expand, or remain the same size depending on the accuracy with which the surrogate model predicted the truth model decrease. If sufficient decrease has not been attained, the trust region center is not updated and the entire trust region shrinks by a user-specified factor. The cycle then repeats with the construction of a new surrogate model, a minimization, and another test for sufficient decrease in the truth model. This cycle continues until convergence is attained.

The surrogate_based_local method must specify an optimization or least squares sub-method either by pointer using approx_method_pointer (e.g., 'NLP1') or by name using approx_method_name (e.g., 'npsol_sqp'). The former identifies a full sub-method specification for the sub-problem minimizer (which allows non-default minimizer settings), whereas the latter supports a streamlined specification (that employs default minimizer settings). For both cases, the surrogate_based_local method specification is responsible for using its model_pointer (see Method Independent Controls) to select a surrogate model (see Surrogate Model Controls). Any model_pointer identified in an approximate sub-method specification is ignored.
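
For illustration, a hedged sketch of the pointer-based form follows; the identifiers 'NLP1' and 'SURR_M' are hypothetical and must match identifiers defined in separate method and model specifications elsewhere in the input file:

method,
	surrogate_based_local
	  model_pointer = 'SURR_M'		# selects the surrogate model specification
	  approx_method_pointer = 'NLP1'	# full sub-method specification for the subproblem minimizer
	  max_iterations = 50
	  soft_convergence_limit = 3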

In addition to the method independent controls for max_iterations and convergence_tolerance described in Table 5.1, SBL algorithm controls include soft_convergence_limit (a soft convergence control for the SBL iterations which limits the number of consecutive iterations with improvement less than the convergence tolerance) and truth_surrogate_bypass (a flag for bypassing all lower level surrogates when performing truth verifications on a top level surrogate). Table 5.23 summarizes these SBL inputs.

Table 5.23 Specification detail for surrogate-based local minimization method
Description Keyword Associated Data Status Default
Surrogate-based local method surrogate_based_local none Required group N/A
Approximate sub-problem minimization method pointer approx_method_pointer string Required (1 of 2 selections) N/A
Approximate sub-problem minimization method name approx_method_name string Required (1 of 2 selections) N/A
Soft convergence limit for SBL iterations soft_convergence_limit integer Optional 5
Flag for bypassing lower level surrogates in truth verifications truth_surrogate_bypass none Optional no bypass

The trust_region optional group specification can be used to specify the initial size of the trust region (using initial_size) relative to the total variable bounds, the minimum size of the trust region (using minimum_size), the contraction factor for the trust region size (using contraction_factor) used when the surrogate model is performing poorly, and the expansion factor for the trust region size (using expansion_factor) used when the surrogate model is performing well. Two additional commands are the trust region size contraction threshold (using contract_threshold) and the trust region size expansion threshold (using expand_threshold). These two commands are related to what is called the trust region ratio, which is the actual decrease in the truth model divided by the predicted decrease in the truth model in the current trust region. The command contract_threshold sets the minimum acceptable value for the trust region ratio, i.e., values below this threshold cause the trust region to shrink for the next SBL iteration. The command expand_threshold sets the trust region ratio value above which the trust region will expand for the next SBL iteration. Table 5.24 summarizes these trust region inputs.

Table 5.24 Specification detail for trust region controls in surrogate-based local methods
Description Keyword Associated Data Status Default
Trust region group specification trust_region none Optional group N/A
Trust region initial size (relative to bounds) initial_size real Optional 0.4
Trust region minimum size minimum_size real Optional 1.e-6
Shrink trust region if trust region ratio is below this value contract_threshold real Optional 0.25
Expand trust region if trust region ratio is above this value expand_threshold real Optional 0.75
Trust region contraction factor contraction_factor real Optional 0.25
Trust region expansion factor expansion_factor real Optional 2.0

For SBL problems with nonlinear constraints, a number of algorithm formulations exist as described in [Eldred and Dunlavy, 2006] and as summarized in the Advanced Examples section of the Models chapter of the Users Manual [Adams et al., 2010]. First, the "primary" functions (that is, the objective functions or least squares terms) in the approximate subproblem can be selected to be surrogates of the original primary functions (original_primary), a single objective function (single_objective) formed from the primary function surrogates, or either an augmented Lagrangian merit function (augmented_lagrangian_objective) or a Lagrangian merit function (lagrangian_objective) formed from the primary and secondary function surrogates. The former option may imply the use of a nonlinear least squares method, a multiobjective optimization method, or a single objective optimization method to solve the approximate subproblem, depending on the definition of the primary functions. The latter three options all imply the use of a single objective optimization method regardless of primary function definition. Second, the surrogate constraints in the approximate subproblem can be selected to be surrogates of the original constraints (original_constraints) or linearized approximations to the surrogate constraints (linearized_constraints), or constraints can be omitted from the subproblem (no_constraints). Following optimization of the approximate subproblem, the candidate iterate is evaluated using a merit function, which can be selected to be a simple penalty function with penalty ramped by SBL iteration number (penalty_merit), an adaptive penalty function where the penalty ramping may be accelerated in order to avoid rejecting good iterates which decrease the constraint violation (adaptive_penalty_merit), a Lagrangian merit function which employs first-order Lagrange multiplier updates (lagrangian_merit), or an augmented Lagrangian merit function which employs both a penalty parameter and zeroth-order Lagrange multiplier updates (augmented_lagrangian_merit). When an augmented Lagrangian is selected for either the subproblem objective or the merit function (or both), updating of penalties and multipliers follows the approach described in [Conn et al., 2000]. Following calculation of the merit function for the new iterate, the iterate is accepted or rejected and the trust region size is adjusted for the next SBL iteration. Iterate acceptance is governed either by a trust region ratio (tr_ratio) formed from the merit function values or by a filter method (filter); however, trust region resizing logic is currently based only on the trust region ratio. For infeasible iterates, constraint relaxation can be used for balancing constraint satisfaction and progress made toward an optimum. The command constraint_relax followed by a method name specifies the type of relaxation to be used. Currently, homotopy [Perez et al., 2004] is the only available method for constraint relaxation, and this method is dependent on the presence of the NPSOL library within the DAKOTA executable. Table 5.25 summarizes these constraint management inputs.

Table 5.25 Specification detail for constraint management in surrogate-based local methods
Description Keyword Associated Data Status Default
Approximate subproblem formulation approx_subproblem original_primary | single_objective | augmented_lagrangian_objective | lagrangian_objective and original_constraints | linearized_constraints | no_constraints Optional group original_primary and original_constraints
SBL merit function merit_function penalty_merit | adaptive_penalty_merit | lagrangian_merit | augmented_lagrangian_merit Optional group augmented_lagrangian_merit
SBL iterate acceptance logic acceptance_logic tr_ratio | filter Optional group filter
SBL constraint relaxation method for infeasible iterates constraint_relax homotopy Optional group no relaxation
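
Combining the controls of Tables 5.24 and 5.25, a more heavily customized specification might resemble the following sketch; the values are arbitrary, the model pointer is hypothetical, and the exact pairing syntax for approx_subproblem follows Table 5.25:

method,
	surrogate_based_local
	  model_pointer = 'SURR_M'
	  approx_method_name = 'npsol_sqp'	# streamlined sub-method selection with default settings
	  trust_region
	    initial_size = 0.2
	    contraction_factor = 0.5
	    expansion_factor = 1.5
	  approx_subproblem single_objective original_constraints
	  merit_function augmented_lagrangian_merit
	  acceptance_logic filter
	  constraint_relax homotopy		# requires NPSOL in the DAKOTA executable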

Surrogate-Based Global Method

The surrogate_based_global method differs from the surrogate_based_local method in a few ways. First, surrogate_based_global is not a trust region method. Rather, surrogate_based_global works in an iterative scheme where optimization is performed on a global surrogate using the same bounds during each iteration. In one iteration, the optimal solutions of the surrogate model are found, and then a selected set of these optimal surrogate solutions is passed to the next iteration. At the next iteration, these surrogate points are evaluated with the "truth" model, and then these points are added back to the set of points upon which the next surrogate is constructed. In this way, the optimization acts on a more accurate surrogate during each iteration, presumably driving to optimality quickly. This approach has no guarantee of convergence. It was originally designed for MOGA (a multi-objective genetic algorithm). Since genetic algorithms often need thousands or tens of thousands of points to produce optimal or near-optimal solutions, the use of surrogates can be helpful for reducing the number of truth model evaluations. Instead of creating one set of surrogates for the individual objectives and running the optimization algorithm on the surrogate once, the idea is to select points along the (surrogate) Pareto frontier, which can be used to supplement the existing points. In this way, one does not need to use many points initially to get a very accurate surrogate. The surrogate becomes more accurate as the iterations progress. Note that the user has the option of appending the optimal points from the surrogate model to the current set of truth points or using the optimal points from the surrogate model to replace the optimal set of points from the previous iteration. Although appending to the set is the default behavior, at this time we strongly recommend using the option replace_points because it appears to be more accurate and robust.

As for the surrogate_based_local method, the surrogate_based_global specification must identify a sub-method using either approx_method_pointer or approx_method_name and must identify a surrogate model (see Surrogate Model Controls) using its model_pointer (see Method Independent Controls). The only other algorithm control at this time is the method independent control for max_iterations described in Table 5.1. Table 5.26 summarizes the method dependent surrogate based global inputs.

Table 5.26 Specification detail for the surrogate-based global method
Description Keyword Associated Data Status Default
Surrogate-based global method surrogate_based_global none Required group N/A
Approximate sub-problem minimization method pointer approx_method_pointer string Required (1 of 2 selections) N/A
Approximate sub-problem minimization method name approx_method_name string Required (1 of 2 selections) N/A
Replace points used in surrogate construction with best points from previous iteration replace_points none Optional Points appended, not replaced
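
As a hedged sketch, a surrogate-based global study wrapping a multi-objective genetic algorithm might be specified as follows; 'SURR_M' is a hypothetical model identifier and the iteration count is arbitrary:

method,
	surrogate_based_global
	  model_pointer = 'SURR_M'
	  approx_method_name = 'moga'
	  max_iterations = 5
	  replace_points			# recommended over the default append behavior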

We have two cautionary notes before using the surrogate-based global method:

Efficient Global Method

The Efficient Global Optimization (EGO) method was first developed by Jones, Schonlau, and Welch [Jones et al., 1998]. In EGO, a stochastic response surface approximation for the objective function is developed based on some sample points from the "true" simulation. The particular response surface used is a Gaussian process (GP). The GP allows one to calculate the prediction at a new input location as well as the uncertainty associated with that prediction. The key idea in EGO is to maximize the Expected Improvement Function (EIF). The EIF is used to select the location at which a new training point should be added to the Gaussian process model by maximizing the amount of improvement in the objective function that can be expected by adding that point. A point could be expected to produce an improvement in the objective function if its predicted value is better than the current best solution, or if the uncertainty in its prediction is such that the probability of it producing a better solution is high. Because the uncertainty is higher in regions of the design space with few observations, this provides a balance between exploiting areas of the design space that predict good solutions, and exploring areas where more information is needed. EGO trades off this "exploitation vs. exploration." The general procedure for these EGO-type methods is:

Note that several major differences exist between our implementation and that of [Jones et al., 1998]. First, rather than using a branch and bound method to find the point which maximizes the EIF, we use the DIRECT global optimization method (see DIRECT and NCSU Methods). Second, we support both global optimization and global nonlinear least squares as well as general nonlinear constraints through abstraction and subproblem recasting within the SurrBasedMinimizer and EffGlobalMinimizer classes.

The efficient global method is in prototype form. Currently, we do not expose any specification controls for the underlying Gaussian process model used or for the optimization of the expected improvement function (which is currently performed by the NCSU DIRECT algorithm using its internal defaults). Future releases may allow more specification detail. The efficient global algorithm is specified by the keyword efficient_global along with an optional seed specification, as shown in Table 5.27 below.

Table 5.27 Specification detail for the efficient global method
Description Keyword Associated Data Status Default
Efficient global method efficient_global none Required group N/A
Random seed seed integer Optional Time based seed: nonrepeatable
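
A minimal, hedged example of this specification is simply:

method,
	efficient_global
	  seed = 123456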

Uncertainty Quantification Methods

DAKOTA has several methods for propagating uncertainty. Aleatory uncertainty refers to inherent variability, irreducible uncertainty, or randomness, and is addressed with the probabilistic methods described in Aleatory Uncertainty Quantification Methods. Epistemic uncertainty refers to subjective uncertainty, reducible uncertainty, model form uncertainty, or uncertainty due to lack of knowledge, and is addressed using the non-probabilistic approaches described in Epistemic Uncertainty Quantification Methods. In general, we refer to both classes of uncertainty quantification methods in DAKOTA as nondeterministic methods. The sections below describe the issues and specification controls that are common to both aleatory and epistemic uncertainty quantification. Finally, Bayesian calibration is also included among the uncertainty quantification methods; this approach is described in Bayesian Calibration. The idea in Bayesian calibration is to assign a prior distribution to uncertain model parameters and use experimental observations of model output to update or inform those parameters: the prior distributions on model parameters are transformed to posterior distributions through the Bayesian updating process.

DAKOTA's nondeterministic methods do not make use of many method independent controls. Only the x_taylor_mpp, u_taylor_mpp, x_two_point, and u_two_point methods within nond_local_reliability use method independent convergence controls (see Local reliability methods). As such, the nondeterministic branch documentation which follows is primarily limited to the method dependent controls for the sampling, reliability, stochastic expansion, and epistemic methods.

With a few exceptions (nond_global_reliability, nond_importance, nond_local_evidence, and nond_global_evidence do not support mappings involving reliability_levels, and nond_local_interval_est and nond_global_interval_est do not support any level mappings), each of these techniques supports response_levels, probability_levels, reliability_levels, and gen_reliability_levels specifications along with optional num_response_levels, num_probability_levels, num_reliability_levels and num_gen_reliability_levels keys. The keys define the distribution of the levels among the different response functions. For example, the following specification

	num_response_levels = 2 4 3
	response_levels = 1. 2. .1 .2 .3 .4 10. 20. 30.
would assign the first two response levels (1., 2.) to response function 1, the next four response levels (.1, .2, .3, .4) to response function 2, and the final three response levels (10., 20., 30.) to response function 3. If the num_response_levels key were omitted from this example, then the response levels would be evenly distributed among the response functions (three levels each in this case).

The response_levels specification provides the target response values for generating probabilities, reliabilities, or generalized reliabilities (forward mapping). The selection among these possible results for the forward mapping is performed with the compute keyword followed by either probabilities, reliabilities, or gen_reliabilities. Conversely, the probability_levels, reliability_levels, and gen_reliability_levels specifications provide target levels for which response values will be computed (inverse mapping). Specifications of response_levels, probability_levels, reliability_levels, and gen_reliability_levels may be combined within the calculations for each response function. The mapping results (probabilities, reliabilities, or generalized reliabilities for the forward mapping and response values for the inverse mapping) define the final statistics of the nondeterministic analysis that can be accessed for use at a higher level (via the primary and secondary mapping matrices for nested models; see Nested Model Controls).

Sets of response-probability pairs computed with the forward/inverse mappings define either a cumulative distribution function (CDF) or a complementary cumulative distribution function (CCDF) for each response function. In the case of evidence-based epistemic methods, this is generalized to define either cumulative belief and plausibility functions (CBF and CPF) or complementary cumulative belief and plausibility functions (CCBF and CCPF) for each response function, where a forward mapping involves computing the belief and plausibility probability level for a specified response level and an inverse mapping involves computing the belief and plausibility response level for either a specified probability level or a specified generalized reliability level (two results for each level mapping in the evidence-based epistemic case, instead of the one result for each level mapping in the aleatory case). The selection of a CDF/CBF/CPF or CCDF/CCBF/CCPF can be performed with the distribution keyword followed by either cumulative for the CDF/CBF/CPF option or complementary for the CCDF/CCBF/CCPF option. This selection also defines the sign of the reliability or generalized reliability indices. Table 5.28 provides the specification detail for the forward/inverse mappings used by each of the nondeterministic analysis methods.

Table 5.28 Specification detail for forward/inverse level mappings
Description Keyword Associated Data Status Default
Distribution type distribution cumulative | complementary Optional group cumulative (CDF)
Response levels response_levels list of reals Optional group No CDF/CCDF probabilities/reliabilities to compute
Number of response levels num_response_levels list of integers Optional response_levels evenly distributed among response functions
Target statistics for response levels compute probabilities | reliabilities | gen_reliabilities Optional probabilities
Probability levels probability_levels list of reals Optional group No CDF/CCDF response levels to compute
Number of probability levels num_probability_levels list of integers Optional probability_levels evenly distributed among response functions
Reliability levels reliability_levels list of reals Optional group No CDF/CCDF response levels to compute
Number of reliability levels num_reliability_levels list of integers Optional reliability_levels evenly distributed among response functions
Generalized reliability levels gen_reliability_levels list of reals Optional group No CDF/CCDF response levels to compute
Number of generalized reliability levels num_gen_reliability_levels list of integers Optional gen_reliability_levels evenly distributed among response functions
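
For example, the following hedged fragment, which would be embedded within a nondeterministic method specification such as nond_sampling, requests a complementary (CCDF) mapping, computes generalized reliabilities at two response levels per response function, and computes response levels at three probability levels (all values arbitrary):

	  distribution complementary
	  response_levels = 1.5 2.5
	    compute gen_reliabilities
	  probability_levels = .05 .5 .95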

Different nondeterministic methods have differing support for uncertain variable distributions. Table 5.29 summarizes the uncertain variables that are available for use by the different methods, where a "-" indicates that the distribution is not supported by the method, a "U" means that uncertain input variables of this type must be uncorrelated, and a "C" denotes that correlations are supported involving uncertain input variables of this type. Additional notes include:

Table 5.29 Summary of Distribution Types supported by Nondeterministic Methods
Distribution Type Sampling Local Reliability Global Reliability Wiener SE Askey SE Extended SE Local Interval Global Interval Local Evidence Global Evidence
Normal C C C C C C - - - -
Bounded Normal C U U U U U - - - -
Lognormal C C C C C U - - - -
Bounded Lognormal C U U U U U - - - -
Uniform C C C C U U - - - -
Loguniform C U U U U U - - - -
Triangular C U U U U U - - - -
Exponential C C C C U U - - - -
Beta C U U U U U - - - -
Gamma C C C C U U - - - -
Gumbel C C C C C U - - - -
Frechet C C C C C U - - - -
Weibull C C C C C U - - - -
Continuous Histogram Bin C U U U U U - - - -
Poisson C - - - - - - - - -
Binomial C - - - - - - - - -
Negative Binomial C - - - - - - - - -
Geometric C - - - - - - - - -
Hypergeometric C - - - - - - - - -
Discrete Histogram Point C - - - - - - - - -
Interval U - - - - - U U U U
Continuous Design (all_variables) U - U U U U - - - -
Discrete Design Range, Int Set, Real Set (all_variables) U - - - - - - - - -
Continuous State (all_variables) U - U U U U - - - -
Discrete State Range, Int Set, Real Set (all_variables) U - - - - - - - - -

Aleatory Uncertainty Quantification Methods

Aleatory uncertainty is also known as inherent variability, irreducible uncertainty, or randomness. An example of aleatory uncertainty is the distribution of height in a population, for which sufficient data are typically available to accurately model the form of the variation. For this reason, aleatory uncertainty is typically modeled using probabilistic approaches through the specification of probability distributions to represent the uncertain input variables and the propagation of this uncertainty using probability theory. The probabilistic approaches supported in DAKOTA include sampling, local and global reliability, polynomial chaos, and stochastic collocation, which may be used to propagate random variables described by Normal Distribution, Lognormal Distribution, Uniform Distribution, Loguniform Distribution, Triangular Distribution, Exponential Distribution, Beta Distribution, Gamma Distribution, Gumbel Distribution, Frechet Distribution, Weibull Distribution, Histogram Bin Distribution, Poisson Distribution, Binomial Distribution, Negative Binomial Distribution, Geometric Distribution, Hypergeometric Distribution, and/or Histogram Point Distribution.

Nondeterministic sampling method

The nondeterministic sampling method is selected using the nond_sampling specification. This method draws samples from the specified uncertain variable probability distributions and propagates them through the model to obtain statistics on the output response functions of interest. DAKOTA provides access to nondeterministic sampling methods through the combination of the NonDSampling base class and the NonDLHSSampling derived class.

CDF/CCDF probabilities are calculated for specified response levels using a simple binning approach. Response levels are calculated for specified CDF/CCDF probabilities and generalized reliabilities by indexing into a sorted samples array (the response levels computed are not interpolated and will correspond to one of the sampled values). CDF/CCDF reliabilities are calculated for specified response levels by computing the number of sample standard deviations separating the sample mean from the response level. Response levels are calculated for specified CDF/CCDF reliabilities by projecting out the prescribed number of sample standard deviations from the sample mean.

The seed integer specification specifies the seed for the random number generator which is used to make sampling studies repeatable. The fixed_seed flag is relevant if multiple sampling sets will be generated during the course of a strategy (e.g., surrogate-based optimization, optimization under uncertainty). Specifying this flag results in the reuse of the same seed value for each of these multiple sampling sets, which can be important for reducing variability in the sampling results. However, this behavior is not the default as the repetition of the same sampling pattern can result in a modeling weakness that an optimizer could potentially exploit (resulting in actual reliabilities that are lower than the estimated reliabilities). In either case (fixed_seed or not), the study is repeatable if the user specifies a seed and the study is random if the user omits a seed specification.

The number of samples to be evaluated is selected with the samples integer specification. The algorithm used to generate the samples can be specified using sample_type followed by either random, for pure random Monte Carlo sampling, or lhs, for Latin Hypercube sampling.

If the user wants to increment a particular set of samples with more samples to get better estimates of mean, variance, and percentiles, one can select incremental_random or incremental_lhs as the sample_type. Note that a preliminary sample of size N must have already been performed, and a dakota.rst restart file must be available from this original sample. For example, say a user performs an initial study using lhs as the sample_type, and generates 50 samples. If the user creates a new input file where samples is now specified to be 100, the sample_type is defined to be incremental_lhs or incremental_random, and previous_samples is specified to be 50, the user will get 50 new LHS samples which maintain both the correlation and stratification of the original LHS sample. The N new samples will be combined with the N original samples to generate a combined sample of size 2N. The syntax for running the second sample set is: dakota -i input2.in -r dakota.rst, where input2.in is the file which specifies incremental sampling. Note that the number of samples in the second set MUST currently be 2 times the number of previous samples, although incremental sampling based on any power of two may be supported in future releases.
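
A hedged sketch of the second-stage input described above (the file name input2.in is arbitrary) could contain the following method section; the seed shown is assumed to match the seed of the original 50-sample study:

method,
	nond_sampling
	  sample_type incremental_lhs
	  samples = 100			# must currently be 2x the previous sample count
	  previous_samples = 50
	  seed = 12345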

The nondeterministic sampling method also supports a design of experiments mode through the all_variables flag. Normally, nond_sampling generates samples only for the uncertain variables, and treats any design or state variables as constants. The all_variables flag alters this behavior by instructing the sampling algorithm to treat any continuous design or continuous state variables as parameters with uniform probability distributions between their upper and lower bounds. Samples are then generated over all of the continuous variables (design, uncertain, and state) in the variables specification. This is similar to the behavior of the design of experiments methods described in Design of Computer Experiments Methods, since they will also generate samples over all continuous design, uncertain, and state variables in the variables specification. However, the design of experiments methods will treat all variables as being uniformly distributed between their upper and lower bounds, whereas the nond_sampling method will sample the uncertain variables within their specified probability distributions.

Finally, the nondeterministic sampling method supports two types of sensitivity analysis. In this context of sampling, we take sensitivity analysis to be global, not local as when calculating derivatives of output variables with respect to input variables. Our definition is similar to that of [Saltelli et al., 2004]: "The study of how uncertainty in the output of a model can be apportioned to different sources of uncertainty in the model input." As a default, DAKOTA provides correlation analyses when running LHS. Correlation tables are printed with the simple, partial, and rank correlations between inputs and outputs. These can be useful to get a quick sense of how correlated the inputs are to each other, and how correlated various outputs are to inputs. In addition, we have the capability to calculate sensitivity indices through variance based decomposition using the keyword variance_based_decomp. Variance based decomposition is a way of using sets of samples to understand how the variance of the output behaves with respect to each input variable. A larger value of the sensitivity index, Si, means that the uncertainty in input variable i has a larger effect on the variance of the output. More details on the calculations and interpretation of the sensitivity indices can be found in [Saltelli et al., 2004]. Note that variance_based_decomp is extremely computationally intensive since replicated sets of sample values are evaluated. If the user specifies a number of samples, N, and a number of nondeterministic variables, M, variance-based decomposition requires the evaluation of N*(M+2) samples. To obtain sensitivity indices that are reasonably accurate, we recommend that N, the number of samples, be at least one hundred and preferably several hundred or more. Because of the computational cost, variance_based_decomp is turned off as a default. Table 5.30 provides details of the nondeterministic sampling specifications beyond those of Table 5.28.

Table 5.30 Specification detail for nondeterministic sampling method
Description Keyword Associated Data Status Default
Nondeterministic sampling method nond_sampling none Required group N/A
Random seed seed integer Optional randomly generated seed
Fixed seed flag fixed_seed none Optional seed not fixed: sampling patterns are variable among multiple runs
Number of samples samples integer Optional minimum required
Sampling type sample_type random | lhs | incremental_random | incremental_lhs Optional group lhs
All variables flag all_variables none Optional sampling only over uncertain variables
Variance based decomposition variance_based_decomp none Optional No variance_based_decomp
Previous samples previous_samples integer Optional 0 (no previous_samples)
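
As another hedged sketch, the following specification performs an LHS study over all continuous variables and requests variance-based decomposition; with M nondeterministic variables, this implies 200*(M+2) function evaluations:

method,
	nond_sampling
	  sample_type lhs
	  samples = 200
	  seed = 52983
	  all_variables			# sample design and state variables uniformly within bounds
	  variance_based_decomp	# computationally intensive: replicated sample sets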

Local reliability methods

Local reliability methods are selected using the nond_local_reliability specification and are implemented within the NonDLocalReliability class. These methods compute approximate response function distribution statistics based on specified uncertain variable probability distributions. Each of the local reliability methods can compute forward and inverse mappings involving response, probability, reliability, and generalized reliability levels.

The Mean Value method (MV, also known as MVFOSM in [Haldar and Mahadevan, 2000]) is the simplest, least-expensive method in that it estimates the response means, response standard deviations, and all CDF/CCDF forward/inverse mappings from a single evaluation of response functions and gradients at the uncertain variable means. This approximation can have acceptable accuracy when the response functions are nearly linear and their distributions are approximately Gaussian, but can have poor accuracy in other situations.

All other reliability methods perform an internal nonlinear optimization to compute a most probable point (MPP) of failure. A sign convention and the distance of the MPP from the origin in the transformed standard normal space ("u-space") define the reliability index, as explained in the section on Reliability Methods in the Uncertainty Quantification chapter of the Users Manual [Adams et al., 2010]. The reliability can then be converted to a probability using either first- or second-order integration, may then be refined using importance sampling, and finally may be converted to a generalized reliability index. The forward reliability analysis algorithm of computing reliabilities/probabilities for specified response levels is called the Reliability Index Approach (RIA), and the inverse reliability analysis algorithm of computing response levels for specified probability levels is called the Performance Measure Approach (PMA). The different RIA/PMA algorithm options are specified using the mpp_search specification which selects among different limit state approximations that can be used to reduce computational expense during the MPP searches. The x_taylor_mean MPP search option performs a single Taylor series approximation in the space of the original uncertain variables ("x-space") centered at the uncertain variable means, searches for the MPP for each response/probability level using this approximation, and performs a validation response evaluation at each predicted MPP. This option is commonly known as the Advanced Mean Value (AMV) method. The u_taylor_mean option is identical to the x_taylor_mean option, except that the approximation is performed in u-space. The x_taylor_mpp approach starts with an x-space Taylor series at the uncertain variable means, but iteratively updates the Taylor series approximation at each MPP prediction until the MPP converges. This option is commonly known as the AMV+ method. The u_taylor_mpp option is identical to the x_taylor_mpp option, except that all approximations are performed in u-space. The order of the Taylor-series approximation is determined by the corresponding responses specification and may be first or second-order. If second-order (methods named $AMV^2$ and $AMV^2+$ in [Eldred and Bichon, 2006]), the series may employ analytic, finite difference, or quasi Hessians (BFGS or SR1). The x_two_point MPP search option uses an x-space Taylor series approximation at the uncertain variable means for the initial MPP prediction, then utilizes the Two-point Adaptive Nonlinear Approximation (TANA) outlined in [Xu and Grandhi, 1998] for all subsequent MPP predictions. The u_two_point approach is identical to x_two_point, but all the approximations are performed in u-space. The x_taylor_mpp and u_taylor_mpp, x_two_point and u_two_point approaches utilize the max_iterations and convergence_tolerance method independent controls to control the convergence of the MPP iterations (the maximum number of MPP iterations per level is limited by max_iterations, and the MPP iterations are considered converged when $\parallel {\bf u}^{(k+1)} - {\bf u}^{(k)} \parallel_2$ < convergence_tolerance). And, finally, the no_approx option performs the MPP search on the original response functions without the use of any approximations. 
The optimization algorithm used to perform these MPP searches can be selected to be either sequential quadratic programming (the npsol_sqp optimizer) or nonlinear interior point (the optpp_q_newton optimizer) using the sqp or nip keywords, respectively.

In addition to the MPP search specifications, one may select among different integration approaches for computing probabilities at the MPP by using the integration keyword followed by either first_order or second_order. Second-order integration employs the formulation of [Hohenbichler and Rackwitz, 1988] (the approach of [Breitung, 1984] and the correction of [Hong, 1999] are also implemented, but are not active). Combining the no_approx option of the MPP search with first- and second-order integrations results in the traditional first- and second-order reliability methods (FORM and SORM). These integration approximations may be subsequently refined using importance sampling. The refinement specification allows the selection of basic importance sampling (import), adaptive importance sampling (adapt_import), or multimodal adaptive importance sampling (mm_adapt_import), along with the specification of the number of samples (samples) and random seed (seed). Additional details on these methods are available in [Eldred et al., 2004b] and [Eldred and Bichon, 2006] and in the Uncertainty Quantification Capabilities chapter of the Users Manual [Adams et al., 2010].
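
As a hedged illustration, a specification corresponding to the classical SORM approach with multimodal adaptive importance sampling refinement might look like the following (the level and sample values are arbitrary):

method,
	nond_local_reliability
	  mpp_search no_approx		# MPP search on the original response functions
	  integration second_order
	  refinement mm_adapt_import
	    samples = 1000 seed = 619
	  response_levels = 3.2 5.8
	    compute probabilities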

Table 5.31 provides details of the local reliability method specifications beyond those of Table 5.28.

Table 5.31 Specification detail for local reliability methods
Description Keyword Associated Data Status Default
Reliability method nond_local_reliability none Required group N/A
MPP search type mpp_search x_taylor_mean | u_taylor_mean | x_taylor_mpp | u_taylor_mpp | x_two_point | u_two_point | no_approx Optional group No MPP search (MV method)
MPP search algorithm sqp | nip none Optional NPSOL's SQP algorithm
Integration method integration first_order | second_order Optional group First-order integration
Refinement method refinement import | adapt_import | mm_adapt_import Optional group No refinement
Refinement samples samples integer Optional 0
Refinement seed seed integer Optional group randomly generated seed

Global reliability methods

Global reliability methods are selected using the nond_global_reliability specification and are implemented within the NonDGlobalReliability class. These methods do not support forward/inverse mappings involving reliability_levels, since they never form a reliability index based on distance in u-space. Rather they use a Gaussian process model to form an approximation to the limit state (based either in x-space via the x_gaussian_process specification or in u-space via the u_gaussian_process specification), followed by probability estimation based on multimodal adaptive importance sampling (see [Bichon et al., 2007]). These probability estimates may then be transformed into generalized reliability levels if desired. At this time, inverse reliability analysis (mapping probability or generalized reliability levels into response levels) is not yet operational, although it may be supported in future releases. The Gaussian process model approximation to the limit state is formed over the uncertain variables by default, but may be extended to also capture the effect of design and state variables via the all_variables flag.
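
A hedged sketch of an x-space Gaussian process study with a forward mapping at a single response level (values arbitrary) might read:

method,
	nond_global_reliability
	  x_gaussian_process
	  seed = 123456
	  response_levels = 175.
	    compute probabilities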

Table 5.32 provides details of the global reliability method specifications beyond those of Table 5.28.

Table 5.32 Specification detail for global reliability methods
Description Keyword Associated Data Status Default
Global reliability method nond_global_reliability none Required group N/A
Approximation type x_gaussian_process | u_gaussian_process none Required N/A
All variables flag all_variables none Optional Contour estimation only over uncertain variables
Random seed for initial GP construction seed integer Optional Time based seed: nonrepeatable

Polynomial chaos expansion method

The polynomial chaos expansion (PCE) is a general framework for the approximate representation of random response functions in terms of finite-dimensional series expansions in standardized random variables

\[R = \sum_{i=0}^P \alpha_i \Psi_i(\xi) \]

where $\alpha_i$ is a deterministic coefficient, $\Psi_i$ is a multidimensional orthogonal polynomial and $\xi$ is a vector of standardized random variables. An important distinguishing feature of the methodology is that the functional relationship between random inputs and outputs is captured, not merely the output statistics as in the case of many nondeterministic methodologies. DAKOTA provides access to PCE methods through the NonDPolynomialChaos class. Refer to the Uncertainty Quantification Capabilities chapter of the Users Manual [Adams et al., 2010] for additional information on the PCE algorithm.

To select the basis $\Psi_i$ of the expansion, three approaches may be employed, as previously shown in Table 5.29: Wiener, Askey, and Extended. The Wiener option uses a Hermite orthogonal polynomial basis for all random variables and employs the same nonlinear variable transformation as the local and global reliability methods (and therefore has the same variable support). The Askey option, however, employs an extended basis of Hermite, Legendre, Laguerre, Jacobi, and generalized Laguerre orthogonal polynomials. The Extended option avoids the use of any nonlinear variable transformations by augmenting the Askey approach with numerically-generated orthogonal polynomials for non-Askey probability density functions. The selection of Wiener versus Askey versus Extended is partially automated and partially under the user's control. The Extended option is the default and supports only Gaussian correlations (see Table 5.29). This default can be overridden by the user by supplying the keyword askey to request restriction to the use of Askey bases only or by supplying the keyword wiener to request restriction to the use of exclusively Hermite bases. If needed to support prescribed correlations (not under user control), the Extended and Askey options will fall back to the Wiener option on a per variable basis. If the prescribed correlations are also unsupported by Wiener expansions, then DAKOTA will exit with an error. Additional details include:

To obtain the coefficients $\alpha_i$ of the expansion, six options are provided:

  1. multidimensional integration by a tensor-product of Gaussian quadrature rules (specified with quadrature_order, where the order per variable may be anisotropic), where supported rules include Gauss-Hermite, Gauss-Legendre (Clenshaw-Curtis is not supported for tensor grids since Gauss-Legendre is preferred when nesting cannot be exploited), Gauss-Jacobi, Gauss-Laguerre, generalized Gauss-Laguerre, and numerically-generated Gauss rules. To synchronize with tensor-product integration, a tensor-product expansion is used, where the order $p_i$ of the expansion in each dimension is one less than the quadrature order $m_i$ in each dimension. The total number of terms, N, in a tensor-product expansion involving n uncertain input variables is

    \[N ~=~ 1 + P ~=~ \prod_{i=1}^{n} (p_i + 1)\]

  2. multidimensional integration by the Smolyak sparse grid method (specified with sparse_grid_level and, optionally, dimension_preference) employing weakly-nested Gauss-Hermite and Gauss-Legendre rules, and non-nested Gauss-Jacobi, Gauss-Laguerre, generalized Gauss-Laguerre, and numerically-generated Gauss rules. Both the rule type and the dimension emphasis (specified with dimension_preference, where higher preference leads to higher order resolution) may be anisotropic. For PCE with isotropic Smolyak, a total-order expansion is used, where the isotropic order p of the expansion is rigorously estimated from the set of monomials integrable by a particular sparse grid construction. The total number of terms N for an isotropic total-order expansion of order p over n variables is given by

    \[N~=~1 + P ~=~1 + \sum_{s=1}^{p} {\frac{1}{s!}} \prod_{r=0}^{s-1} (n + r) ~=~\frac{(n+p)!}{n!p!}\]

    Since fully nested Clenshaw-Curtis integration requires exponential growth rules (relating quadrature order m to level l) leading to an irregular set of resolvable monomials known as the "hyperbolic cross," the use of a total-order expansion leads to a poor synchronization between resolvable monomial set and chaos expansion. For this reason, Clenshaw-Curtis integration is disabled for PCE and weakly-/non-nested rules with linear growth (and a regular set of resolvable monomials) are employed exclusively. For PCE with anisotropic Smolyak, a custom anisotropic expansion is estimated (based again on the set of monomials that are resolvable by the anisotropic sparse grid).

  3. multidimensional integration by Stroud cubature rules [Stroud, 1971] and extensions [Xiu, 2008], as specified with cubature_integrand. A total-order expansion is used, where the isotropic order p of the expansion is half of the integrand order (rounded down). Since the maximum integrand order is currently five for normal and uniform and two for all other types, at most second- and first-order expansions, respectively, will be used. As a result, cubature is primarily useful for global sensitivity analysis, where the Sobol' indices will provide main effects and, at most, two-way interactions. In addition, the random variable set must be independent and identically distributed (iid), so the use of askey or wiener transformations may be required to create iid variable sets in the transformed space (as well as to allow usage of the higher order cubature rules for normal and uniform). Note that global sensitivity analysis often assumes uniform bounded regions, rather than precise probability distributions, so the iid restriction would not be problematic in that case.
  4. multidimensional integration by random sampling (specified with expansion_samples). In this case, the expansion order p cannot be inferred from the numerical integration specification and it is necessary to provide either expansion_order or expansion_terms to specify either p or N, respectively, for a total-order expansion, where the latter specification allows the use of a partial-order expansion (truncation of a complete order expansion, while supported, is not generally recommended).
  5. linear regression (specified with either collocation_points or collocation_ratio). A total-order expansion is used and must be specified using either expansion_order or expansion_terms as described in the previous option. Given p or N, the total number of collocation points (including any sample reuse) must be at least N, and an oversampling is generally advisable. To more easily satisfy this requirement (i.e., to avoid requiring the user to calculate N from n and p), collocation_ratio allows for specification of a constant oversampling factor applied to N (e.g., collocation_ratio = 2. for factor of 2 oversampling).
  6. coefficient import from a file (specified with expansion_import_file). A total-order expansion is assumed and must be specified using either expansion_order or expansion_terms.

If n is small (e.g., two or three), then tensor-product Gaussian quadrature is quite effective and can be the preferred choice. For moderate n (e.g., five), tensor-product quadrature quickly becomes too expensive and the sparse grid and point collocation approaches are preferred. For large n (e.g., more than ten), point collocation may begin to suffer from ill-conditioning and sparse grids are generally recommended. Random sampling for coefficient estimation is generally not recommended, although it does hold the advantage that the simulation budget is more flexible than that required by the other approaches. For incremental studies, approaches 4 and 5 support reuse of previous samples through the incremental_lhs (refer to Nondeterministic sampling method for description of incremental LHS) and reuse_samples (refer to Global approximations for description of the "all" option of sample reuse) specifications, respectively. As for Nondeterministic sampling method and Global reliability methods, the all_variables flag can be used to form expansions over all continuous variables, rather than just the default aleatory uncertain variables. For continuous design, continuous state, and epistemic interval variables included in all_variables mode, Legendre chaos bases are used to model the bounded intervals for these variables. However, these variables are not assumed to have any particular probability distribution, only that they are independent variables. Moreover, when probability integrals are evaluated, only the aleatory random variable domain is integrated, leaving behind a polynomial relationship between the statistics and the remaining design/state/epistemic variables.
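
As a hedged sketch of the sparse grid approach, the following specification builds an Askey-basis expansion from a level-3 sparse grid and then samples the resulting expansion to generate statistics; all values are arbitrary:

method,
	nond_polynomial_chaos
	  askey
	  sparse_grid_level = 3
	  samples = 10000 seed = 12347	# sampling on the expansion for statistics
	  response_levels = .1 1. 50.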

The refinement keyword can be specified as either uniform or adaptive in order to select optional polynomial order refinement ("p-refinement"). This is currently supported for the tensor-product quadrature and Smolyak sparse grid options, and makes use of the max_iterations and convergence_tolerance method independent controls (see Table 5.1). The former control limits the number of refinement iterations, and the latter control terminates refinement when the two-norm of the change in the response covariance matrix falls below the tolerance. In the case of either uniform or adaptive refinement, the quadrature_order or sparse_grid_level are ramped by one on each refinement iteration until either of the convergence controls is satisfied. For the case of adaptive refinement, the dimension preference vector (updating dimension_preference for sparse grids and resulting in anisotropic quadrature_order for tensor grids) is also updated on every iteration using analytic Sobol' indices formed from variance-based decomposition of the polynomial chaos expansions, leading to anisotropic integrations/expansions with differing refinement levels for different random dimensions.

Additional specifications include the level mappings described in Uncertainty Quantification Methods and the seed, fixed_seed, samples, and sample_type specifications described in Nondeterministic sampling method. These latter sampling specifications refer to sampling on the PCE approximation for the purposes of generating approximate statistics, which should be distinguished from simulation sampling for generating the chaos coefficients as described in the previous paragraph. Table 5.33 provides details of the polynomial chaos expansion specifications beyond those of Table 5.28.

Table 5.33 Specification detail for polynomial chaos expansion method
Description Keyword Associated Data Status Default
Polynomial chaos expansion method nond_polynomial_chaos none Required group N/A
Alternate basis of orthogonal polynomials askey | wiener none Optional Extended basis of orthogonal polynomials (Askey + numerically generated)
p-refinement uniform | adaptive none Optional No refinement
Quadrature order for PCE coefficient estimation quadrature_order list of integers Required (1 of 7 selections) N/A
Cubature integrand order for PCE coefficient estimation cubature_integrand integer (1, 2, 3, or 5 for normal or uniform; 1 or 2 for exponential, beta, or gamma; 2 for all other distribution types) Required (1 of 7 selections) N/A
Sparse grid level for PCE coefficient estimation sparse_grid_level integer Required (1 of 7 selections) N/A
Sparse grid dimension preference dimension_preference list of reals Optional isotropic sparse grid
Number of simulation samples for PCE coefficient estimation expansion_samples integer Required (1 of 7 selections) N/A
Number of collocation points for PCE coefficient estimation collocation_points integer Required (1 of 7 selections) N/A
Collocation point oversampling ratio for PCE coefficient estimation collocation_ratio real Required (1 of 7 selections) N/A
File name for import of PCE coefficients expansion_import_file string Required (1 of 7 selections) N/A
Expansion order expansion_order integer Required (1 of 2 selections) for expansion_samples, collocation_points, collocation_ratio, or expansion_import_file N/A
Expansion terms expansion_terms integer Required (1 of 2 selections) for expansion_samples, collocation_points, collocation_ratio, or expansion_import_file N/A
Incremental LHS flag for PCE coefficient estimation by expansion_samples incremental_lhs none Optional coefficient estimation does not reuse previous samples
Reuse samples flag for PCE coefficient estimation by collocation_points or collocation_ratio reuse_samples none Optional coefficient estimation does not reuse previous samples
Random seed seed integer Optional randomly generated seed
Fixed seed flag fixed_seed none Optional seed not fixed: sampling patterns are variable among multiple PCE runs
Number of samples on PCE for generating statistics samples integer Optional 0 (will result in error if sampling-based statistics are requested)
Sampling type sample_type random | lhs Optional group lhs
All variables flag all_variables none Optional Expansion only over uncertain variables

Stochastic collocation method

The stochastic collocation (SC) method is very similar to the PCE method described above, with the key difference that the orthogonal polynomial basis functions are replaced with Lagrange polynomial interpolants. The expansion takes the form

\[R = \sum_{i=1}^{N_p} r_i L_i(\xi) \]

where $N_p$ is the total number of collocation points, $r_i$ is a response value at the $i^{th}$ collocation point, $L_i$ is the $i^{th}$ multidimensional Lagrange interpolation polynomial, and $\xi$ is a vector of standardized random variables. The $i^{th}$ Lagrange interpolation polynomial assumes the value of 1 at the $i^{th}$ collocation point and 0 at all other collocation points. Thus, in PCE, one forms coefficients for known orthogonal polynomial basis functions, whereas SC forms multidimensional interpolation functions for known coefficients. DAKOTA provides access to SC methods through the NonDStochCollocation class. Refer to the Uncertainty Quantification Capabilities chapter of the Users Manual [Adams et al., 2010] for additional information on the SC algorithm.

To form the multidimensional interpolants $L_i$ of the expansion, two options are provided:

  1. interpolation on a tensor-product of Gaussian quadrature points (specified with quadrature_order, where the order per variable may be anisotropic). As for PCE, Gauss-Hermite, Gauss-Legendre, Gauss-Jacobi, Gauss-Laguerre, generalized Gauss-Laguerre, and numerically-generated Gauss rules are supported.
  2. interpolation on a Smolyak sparse grid (specified with sparse_grid_level and, optionally, dimension_preference) defined from fully-nested Clenshaw-Curtis and weakly-/non-nested Gaussian rules. Both the rule type and dimension emphasis may be anisotropic. Unlike PCE, expansion synchronization is not an issue and both fully-nested Clenshaw-Curtis rules and anisotropic sparse grids are readily employed without concern over irregular sets of resolvable monomials. For self-consistency in growth rules, however, the grids are either fully nested using exponential growth rules or weakly-/non-nested using linear growth rules -- these two approaches are not mixed.

As for Polynomial chaos expansion method, the orthogonal polynomials used in defining the Gauss points that make up the interpolation grid are governed by one of three options: Wiener, Askey, or Extended. The Wiener option uses interpolation points from Gauss-Hermite quadrature for all random variables and employs the same nonlinear variable transformation as the local and global reliability methods (and therefore has the same variable support). The Askey option, however, employs interpolation points from Gauss-Hermite, Gauss-Legendre, Gauss-Laguerre, Gauss-Jacobi, and generalized Gauss-Laguerre quadrature. The Extended option avoids the use of any nonlinear variable transformations by augmenting the Askey approach with Gauss points from numerically-generated orthogonal polynomials for non-Askey probability density functions. As for PCE, the Wiener/Askey/Extended selection defaults to Extended, can be overridden by the user using the keywords askey or wiener, and automatically falls back from Extended/Askey to Wiener on a per variable basis as needed to support prescribed correlations.

If n is small, then tensor-product Gaussian quadrature is again the preferred choice. For larger n, tensor-product quadrature quickly becomes too expensive and the sparse grid approach is preferred. Similar to the approach described previously in Polynomial chaos expansion method, the all_variables flag can be used to expand the dimensionality of the interpolation to include continuous design and state variables and epistemic uncertain variables, in addition to the default aleatory uncertain variables. Interpolation points for these dimensions are based on Gauss-Legendre rules for tensor-product quadrature or Smolyak sparse grids that are anisotropic in rule, or Clenshaw-Curtis rules for Smolyak sparse grids that are isotropic in rule. Again, when probability integrals are evaluated, only the aleatory random variable domain is integrated, leaving behind a polynomial relationship between the statistics and the remaining design/state/epistemic variables.
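
A comparable hedged sketch for stochastic collocation, using a level-2 sparse grid interpolant and sampling on the interpolant for statistics (values arbitrary), might read:

method,
	nond_stoch_collocation
	  sparse_grid_level = 2
	  samples = 10000 seed = 12347
	  response_levels = .1 1. 50.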

The refinement keyword can be specified as either uniform or adaptive in order to select optional polynomial order refinement ("p-refinement"). This is currently supported for the tensor-product quadrature and Smolyak sparse grid options, and makes use of the max_iterations and convergence_tolerance iteration controls (see Table 5.1). The details of these specifications are identical to those described in Polynomial chaos expansion method.

Additional specifications include the level mappings described in Uncertainty Quantification Methods and the seed, fixed_seed, samples, and sample_type specifications described in Nondeterministic sampling method. These latter sampling specifications refer to sampling on the interpolant for the purposes of generating approximate statistics, which should not be confused with simulation evaluations used for forming the interpolant as described in the previous paragraph. Table 5.34 provides details of the stochastic collocation specifications beyond those of Table 5.28.

Table 5.34 Specification detail for stochastic collocation method
Description Keyword Associated Data Status Default
Stochastic collocation method nond_stoch_collocation none Required group N/A
Alternate basis of orthogonal polynomials askey | wiener none Optional Extended basis of orthogonal polynomials (Askey + numerically generated)
p-refinement uniform | adaptive none Optional No refinement
Quadrature order for collocation points quadrature_order list of integer Required (1 of 2 selections) N/A
Sparse grid level for collocation points sparse_grid_level integer Required (1 of 2 selections) N/A
Sparse grid dimension preference dimension_preference list of reals Optional isotropic sparse grid
Random seed seed integer Optional randomly generated seed
Fixed seed flag fixed_seed none Optional seed not fixed: sampling patterns are variable among multiple SC runs
Number of samples on interpolant for generating statistics samples integer Optional 0 (will result in error if sampling-based statistics are requested)
Sampling type sample_type random | lhs Optional group lhs
All variables flag all_variables none Optional Expansion only over uncertain variables
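
To illustrate how these controls fit together, a minimal stochastic collocation specification might look like the following sketch; the level, sample count, and seed are illustrative values only, not recommendations:

method,
	nond_stoch_collocation
	  sparse_grid_level = 3
	  samples = 10000	seed = 12345
	  sample_type lhs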

Epistemic Uncertainty Quantification Methods

Epistemic uncertainty is also referred to as subjective uncertainty, reducible uncertainty, model form uncertainty, or uncertainty due to lack of knowledge. Examples of epistemic uncertainty are little or no experimental data for an unknown physical parameter, or the existence of complex physics or behavior that is not included in the simulation model of a system. Epistemic uncertainty can be modeled probabilistically but is often modeled using non-probabilistic approaches such as interval propagation, evidence theory, possibility theory, information gap theory, etc. In DAKOTA, epistemic uncertainty analysis is performed using interval analysis or the Dempster-Shafer theory of evidence.

Epistemic (or mixed aleatory-epistemic) uncertainty may also be propagated through the use of the Nondeterministic sampling method, although in this case, the output statistics are limited to response intervals (any epistemic component suppresses all probabilistic results). Mixed uncertainty can also be addressed through the use of nested UQ (refer to the Users Manual [Adams et al., 2010] for NestedModel discussion and examples); in this case, epistemic and aleatory analyses can be segregated and intervals on probabilistic results can be reported. A subtle distinction exists between nond_sampling for epistemic intervals and the lhs option of nond_global_interval_est: the former allows mixed aleatory-epistemic uncertainty within a single level, whereas the latter supports only epistemic variables and relies on nesting to address mixed uncertainty. In each of these cases, the Interval Uncertain Variable specification is used to describe the epistemic uncertainty using either simple intervals or basic probability assignments.

Note that for mixed UQ problems with both aleatory and epistemic variables, if the user defines the epistemic variables as intervals and the aleatory variables as probability distribution types, the method nond_sampling (in a simple, single-level study) will produce intervals only on the output. Although the aleatory variables will be sampled according to their distributions, the output will only be reported as an interval given the presence of interval variables. There is also the option to perform nested sampling, where one separates the epistemic and aleatory uncertain variables, samples over the epistemic variables in the outer loop, and then samples the aleatory variables in the inner loop, resulting in intervals on statistics. The calculation of intervals on statistics can also be performed by using nested approaches with interval estimation or evidence methods on the outer loop and aleatory UQ methods (such as stochastic expansion or reliability methods) on the inner loop. More detail about these "intervals on statistics" approaches can be found in [Eldred and Swiler, 2009].

Local Interval Estimation

In interval analysis, one assumes that nothing is known about an epistemic uncertain variable except that its value lies somewhere within an interval. In this situation, it is NOT assumed that the value has a uniform probability of occurring within the interval. Instead, the interpretation is that any value within the interval is a possible value or a potential realization of that variable. In interval analysis, the uncertainty quantification problem is one of determining the resulting bounds on the output (defining the output interval) given interval bounds on the inputs. Again, any output response that falls within the output interval is a possible output with no frequency information assigned to it.

We have the capability to perform interval analysis using either local methods (nond_local_interval_est) or global methods (nond_global_interval_est). If the problem is amenable to local optimization methods (e.g., the simulation can provide derivatives, or finite differences can be used to estimate them), then one can use local methods to calculate these bounds. nond_local_interval_est allows the user to specify either sqp, which is sequential quadratic programming, or nip, which is a nonlinear interior point method. Table 5.35 provides the specification for the local interval method.

Table 5.35 Specification detail for local interval estimation used in epistemic uncertainty
Description Keyword Associated Data Status Default
Nondeterministic interval estimation nond_local_interval_est none Required group N/A
Estimation method sqp | nip none Required group N/A
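
For example, a minimal local interval estimation specification using the nonlinear interior point solver might be written as follows; no other controls are required:

method,
	nond_local_interval_est
	  nip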

Global Interval Estimation

As mentioned above, when performing interval analysis, one assumes that nothing is known about an epistemic uncertain variable except that its value lies somewhere within an interval. The goal of uncertainty quantification in this context is to determine the resulting bounds on the output (defining the output interval) given interval bounds on the inputs.

In the global approach, one uses either a global optimization method or a sampling method to assess the bounds. nond_global_interval_est allows the user to specify either lhs, which performs Latin Hypercube Sampling and takes the minimum and maximum of the samples as the bounds (no optimization is performed), or ego. In the case of ego, the efficient global optimization (EGO) method is used to calculate bounds (see the EGO method on this page for more explanation). When using lhs or ego, one can specify a random seed (seed), the random number generator (rng), and the number of samples (samples). Table 5.36 provides the specification for the global interval methods.

Table 5.36 Specification detail for global interval estimation used in epistemic uncertainty
Description Keyword Associated Data Status Default
Nondeterministic interval estimation nond_global_interval_est none Required group N/A
Estimation method lhs | ego none Required group N/A
Random number generator rng mt19937 | rnum2 Optional mt19937
Random seed seed integer Optional randomly generated seed
Number of samples samples integer Optional 10,000 for LHS, approximately numVars^2 for EGO
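
As a sketch, a global interval estimation study using LHS-based bounds estimation might be specified as shown below; the seed and sample count are placeholder values chosen only to show the syntax:

method,
	nond_global_interval_est
	  lhs
	  samples = 20000	seed = 12345
	  rng mt19937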

Local Evidence Theory (Dempster-Shafer) Methods

The above section discussed a pure interval approach. This section discusses Dempster-Shafer evidence theory. In this approach, one does not assign a probability distribution to each uncertain input variable. Rather, one divides each uncertain input variable into one or more intervals. The input parameters are only known to occur within intervals: nothing more is assumed. Each interval is defined by its upper and lower bounds, and a Basic Probability Assignment (BPA) associated with that interval. The BPA represents a probability of that uncertain variable being located within that interval. The intervals and BPAs are used to construct uncertainty measures on the outputs called "belief" and "plausibility." Belief represents the smallest possible probability that is consistent with the evidence, while plausibility represents the largest possible probability that is consistent with the evidence. For more information about the Dempster-Shafer theory of evidence, see Oberkampf and Helton, 2003 and Helton and Oberkampf, 2004.

Similar to the interval approaches, one may use global or local methods to determine plausibility and belief measures for the outputs. Note that to calculate the plausibility and belief cumulative distribution functions, one has to look at all combinations of intervals for the uncertain variables. Within each interval cell combination, the minimum and maximum value of the objective function determine the belief and plausibility, respectively. In terms of implementation, global methods use LHS sampling or global optimization to calculate the minimum and maximum values of the objective function within each interval cell, while local methods use gradient-based optimization methods to calculate these minima and maxima.

Finally, note that the nondeterministic general settings apply to the interval and evidence methods, but one needs to be careful about the interpretation and translate probabilistic measures to epistemic ones. For example, if the user specifies a distribution of type complementary, a complementary plausibility and belief function will be generated for the evidence methods (as opposed to a complementary distribution function in the nond_sampling case). If the user specifies a set of response levels, both the belief and plausibility will be calculated for each response level. Likewise, if the user specifies a probability level, the probability level will be interpreted both as a belief and a plausibility, and response levels corresponding to the belief and plausibility levels will be calculated. Finally, if generalized reliability levels are specified, either as inputs (gen_reliability_levels) or outputs (response_levels with compute gen_reliabilities), then these are directly converted to/from probability levels and the same probability-based mappings described above are performed.

Table 5.37 provides the specification for the nond_local_evidence method. Note that two local optimization methods are available: sqp (sequential quadratic programming) or nip (nonlinear interior point method).

Table 5.37 Specification detail for local evidence theory method for epistemic uncertainty
Description Keyword Associated Data Status Default
Nondeterministic local evidence method nond_local_evidence none Required group N/A
Estimation method sqp | nip none Required group N/A

Global Evidence Theory (Dempster-Shafer) Methods

Evidence theory has been explained above in the Local Evidence Theory section. The basic idea is that one specifies an "evidence structure" on uncertain inputs and propagates that to obtain belief and plausibility functions on the response functions. The inputs are defined by sets of intervals and Basic Probability Assignments (BPAs). Evidence propagation is computationally expensive, since the minimum and maximum function value must be calculated for each "interval cell combination." These bounds are aggregated into belief and plausibility.

Table 5.38 provides the specification for the nond_global_evidence method. nond_global_evidence allows the user to specify either lhs or ego. lhs performs Latin Hypercube Sampling and takes the minimum and maximum of the samples as the bounds per "interval cell combination." In the case of ego, the efficient global optimization (EGO) method is used to calculate bounds (see the EGO method on this page for more explanation). When using lhs or ego, one can specify a random seed (seed), the random number generator (rng), and the number of samples (samples).

Note that to calculate the plausibility and belief cumulative distribution functions, one has to look at all combinations of intervals for the uncertain variables. In terms of implementation, if one is using LHS sampling as outlined above, this method creates a large sample over the response surface, then examines each cell to determine the minimum and maximum sample values within each cell. To do this, one needs to set the number of samples relatively high: the default is 10,000 and we recommend at least that number. If the model you are running is a simulation that is computationally quite expensive, we recommend that you set up a surrogate model within the DAKOTA input file so that nond_global_evidence performs its sampling and calculations on the surrogate and not on the original model. Using optimization methods instead to find the minimum and maximum values within each cell can also be computationally expensive.

Table 5.38 Specification detail for global evidence theory method for epistemic uncertainty
Description Keyword Associated Data Status Default
Nondeterministic global evidence method nond_global_evidence none Required group N/A
Estimation method lhs | ego none Required group N/A
Random number generator rng mt19937 | rnum2 Optional mt19937
Random seed seed integer Optional randomly generated seed
Number of samples samples integer Optional 10,000 for LHS, approximately numVars^2 for EGO
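
Following the sampling recommendation above, a hedged sketch of a global evidence specification using LHS with a deliberately large sample count might read as follows; the values shown are illustrative only:

method,
	nond_global_evidence
	  lhs
	  samples = 20000	seed = 12345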

Bayesian Calibration

This section covers Bayesian calibration methods. This is a new capability in DAKOTA 5.1 that we plan on extending. The idea in Bayesian calibration is to assign a prior distribution to uncertain model parameters and use experimental observations of model output to update or inform the model parameters. The prior distributions on model parameters are transformed to posterior distributions through the Bayesian updating process. Typically, this process relies on a sampling method called Markov chain Monte Carlo (MCMC). Note that Bayesian calibration can be thought of both as an optimization process and as an uncertainty quantification process. It is an optimization process in the sense of modifying model parameters based on experimental data (e.g., using experimental observations to improve the "fit" of the model, similar to the approach used in least-squares methods, which minimize the sum of squares of the residual terms, or errors, where error = model prediction - experimental observation). However, Bayesian calibration can also be thought of as an uncertainty quantification process, since prior distributions on parameters are transformed to posterior distributions using Bayes' formula and a likelihood function. We have classified Bayesian calibration as a nondeterministic method in DAKOTA because of the probabilistic treatment of inputs and the statistical approaches underpinning Bayesian methods. Currently, we only have one Bayesian calibration method, GPMSA, but we are planning to incorporate additional libraries and Bayesian capabilities in DAKOTA.

GPMSA Methods

The Gaussian Process Models for Simulation Analysis (GPMSA) code has been developed at Los Alamos National Laboratory. The original version was developed in Matlab, and was translated to a C++ library in 2010.

GPMSA provides capabilities for the analysis of computer models, specifically the analysis of systems that have both computer simulations and observed data. The underlying assumption is that experimental results are expensive or difficult to collect, though it is possible to obtain simulation studies of the experimental problem in comparatively large, but still restricted, quantities. Therefore, a statistical model of system responses, called a surrogate or emulator, is constructed.

In GPMSA, the emulator involves Gaussian Process models. The emulator is constructed from an ensemble of simulation responses collected at various settings of input parameters. The statistical model may be extended to incorporate system observations to improve system predictions and constrain, or calibrate, unknown parameters. The statistical model accounts for the discrepancy between the system observations and the simulation output. Parameters of the statistical model are determined through sampling their posterior probability with MCMC. The statistical model can then be used in several ways. Statistical prediction of simulation and observation response is relatively fast and includes uncertainty. Examination and analysis of the model parameters can be used for calibration, screening, and analysis of model properties.

At this point, the interface between DAKOTA and GPMSA supports the calibration of scalar data (e.g. a set of output values at specific values of inputs). In the input specification for GPMSA, the user assigns prior parameter distributions using the uncertain variable types available. DAKOTA then generates a Latin Hypercube Sample (LHS) for the model parameters and runs the simulation model at these points. DAKOTA sends the simulation model results and the experimental data results (which are specified by three data files as described below) to the GPMSA code which performs the MCMC and generates posterior samples on the model parameters, based on a Gaussian Process emulator, a likelihood function, and other assumptions involving the formulation of a discrepancy term. For more information about GPMSA, see the GPMSA Command manual and Examples manual in the DAKOTA/packages/gpmsa directory.

Table 5.39 provides the specification for the bayes_calibration method (note this can also be specified as nond_bayes_calibration). The keyword gpmsa specifies the type of Bayesian calibration method. There are three data files which are required: x_obs_data_file, y_obs_data_file, and y_std_data_file. The x_obs_data_file has a listing of input points at which the experimental data is taken, one per line. Note that there can be multiple X inputs, but the X inputs specified in this data file are assumed to be "configuration" parameters. They are not the same as uncertain parameters of the simulation. The y_obs_data_file has the responses corresponding to the input points specified in the x_obs_data_file, and the y_std_data_file has the experimental errors for each of the data points. Note that in some cases, the experimental error will be the same for all of the data, and in other cases, it may not be. Finally, there are three specification items relating to the initial LHS sampling which is performed on the simulation. rng specifies which random number generator is used, seed specifies the random seed, and samples specifies the number of samples at which the simulation will be run. The number of samples should be large enough to generate a reasonable Gaussian process estimate for the calibration process. At a minimum, we recommend the number of samples be at least the number of input variables squared, but we are not providing a default setting at this time. After running GPMSA, the posterior samples are written to a file called gpmsa.pvals.out in the directory in which DAKOTA was run.

Table 5.39 Specification detail for GPMSA method for Bayesian Calibration
Description Keyword Associated Data Status Default
Nondeterministic Bayesian Calibration nond_bayes_calibration | bayes_calibration none Required group N/A
Method type gpmsa none Required group N/A
Experimental data file - input x_obs_data_file none Required group N/A
Experimental data file - outputs y_obs_data_file none Required group N/A
Experimental data file - std. dev. of outputs y_std_data_file none Required group N/A
Random number generator rng mt19937 | rnum2 Optional mt19937
Random seed seed integer Optional randomly generated seed
Number of samples samples integer Required N/A

Design of Computer Experiments Methods

Design and Analysis of Computer Experiments (DACE) methods compute response data sets at a selection of points in the parameter space. Two libraries are provided for performing these studies: DDACE and FSUDace. A DAKOTA interface to Lawrence Livermore National Laboratory's PSUADE library is provided, but this package is currently only available within LLNL. The design of experiments methods do not currently make use of any of the method independent controls.

DDACE

The Distributed Design and Analysis of Computer Experiments (DDACE) library provides the following DACE techniques: grid sampling (grid), pure random sampling (random), orthogonal array sampling (oas), latin hypercube sampling (lhs), orthogonal array latin hypercube sampling (oa_lhs), Box-Behnken (box_behnken), and central composite design (central_composite). It is worth noting that there is some overlap in sampling techniques with those available from the nondeterministic branch. The current distinction is that the nondeterministic branch methods are designed to sample within a variety of probability distributions for uncertain variables, whereas the design of experiments methods treat all variables as having uniform distributions. As such, the design of experiments methods are well-suited for performing parametric studies and for generating data sets used in building global approximations (see Global approximations), but are not currently suited for assessing the effect of uncertainties. If a design of experiments over both design/state variables (treated as uniform) and uncertain variables (with probability distributions) is desired, then nond_sampling can support this with its all_variables option (see Nondeterministic sampling method). DAKOTA provides access to the DDACE library through the DDACEDesignCompExp class.

In terms of method dependent controls, the specification structure is straightforward. First, there is a set of design of experiments algorithm selections separated by logical OR's (grid or random or oas or lhs or oa_lhs or box_behnken or central_composite). Second, there are optional specifications for the random seed to use in generating the sample set (seed), for fixing the seed (fixed_seed) among multiple sample sets (see Nondeterministic sampling method for discussion), for the number of samples to perform (samples), and for the number of symbols to use (symbols). The seed control is used to make sample sets repeatable, and the symbols control is related to the number of replications in the sample set (a larger number of symbols equates to more stratification and fewer replications). The quality_metrics control is available for the DDACE library. This control turns on calculation of volumetric quality measures which measure the uniformity of the point samples. More details on the quality measures are given under the description of the FSU sampling methods. The variance_based_decomp control is also available. This control enables the calculation of sensitivity indices which indicate how important the uncertainty in each input variable is in contributing to the output variance. More details on variance based decomposition are given in Nondeterministic sampling method. Design of experiments specification detail is given in Table 5.40.

Table 5.40 Specification detail for design of experiments methods
Description Keyword Associated Data Status Default
Design of experiments method dace none Required group N/A
dace algorithm selection grid | random | oas | lhs | oa_lhs | box_behnken | central_composite none Required N/A
Random seed seed integer Optional randomly generated seed
Fixed seed flag fixed_seed none Optional seed not fixed: sampling patterns are variable among multiple DACE runs
Number of samples samples integer Optional minimum required
Number of symbols symbols integer Optional default for sampling algorithm
Quality metrics quality_metrics none Optional No quality_metrics
Variance based decomposition variance_based_decomp none Optional No variance_based_decomp
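
As an illustration of this structure, an orthogonal array study might be specified as shown below; the seed, sample count, and symbol count are placeholder values chosen only to show the syntax, not recommended settings:

method,
	dace oas
	  seed = 5
	  samples = 49	symbols = 7
	  quality_metrics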

FSUDace

The Florida State University Design and Analysis of Computer Experiments (FSUDace) library provides the following DACE techniques: quasi-Monte Carlo sampling (fsu_quasi_mc) based on the Halton sequence (halton) or the Hammersley sequence (hammersley), and Centroidal Voronoi Tessellation (fsu_cvt). All three methods generate sets of uniform random variables on the interval [0,1]. If the user specifies lower and upper bounds for a variable, the [0,1] samples are mapped to the [lower, upper] interval. The quasi-Monte Carlo and CVT methods are designed with the goal of low discrepancy. Discrepancy refers to the nonuniformity of the sample points within the hypercube. Discrepancy is defined as the difference between the actual number of points and the expected number of points in a particular set B (such as a hyper-rectangle within the unit hypercube), maximized over all such sets. Low discrepancy sequences tend to cover the unit hypercube reasonably uniformly. Quasi-Monte Carlo methods produce low discrepancy sequences, especially if one is interested in the uniformity of projections of the point sets onto lower dimensional faces of the hypercube (usually 1-D: how well do the marginal distributions approximate a uniform distribution?). CVT does very well volumetrically: it spaces the points fairly equally throughout the space, so that the points cover the region and are isotropically distributed with no directional bias in the point placement. There are various measures of volumetric uniformity which take into account the distances between pairs of points, regularity measures, etc. Note, however, that CVT does not produce low-discrepancy sequences in lower dimensions: the lower-dimension (such as 1-D) projections of CVT can have high discrepancy.

The quasi-Monte Carlo sequences of Halton and Hammersley are deterministic sequences determined by a set of prime bases. Generally, we recommend that the user leave the default setting for the bases, which are the lowest primes. Thus, if one wants to generate a sample set for 3 random variables, the default bases used are 2, 3, and 5 in the Halton sequence. To give an example of how these sequences look, the Halton sequence in base 2 starts with points 0.5, 0.25, 0.75, 0.125, 0.625, etc. The first few points in a Halton base 3 sequence are 0.33333, 0.66667, 0.11111, 0.44444, 0.77777, etc. Notice that the Halton sequence tends to alternate back and forth, generating a point closer to zero then a point closer to one. An individual sequence is based on a radix inverse function defined on a prime base. The prime base determines how quickly the [0,1] interval is filled in. Generally, the lowest primes are recommended.
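
For reference, the radix inverse underlying these sequences can be sketched as follows: writing the index $n$ in base $b$ as $n = \sum_{j \ge 0} d_j b^j$ with digits $0 \le d_j < b$, the $n^{th}$ term of the base-$b$ sequence is $\phi_b(n) = \sum_{j \ge 0} d_j \, b^{-(j+1)}$, i.e., the digits of $n$ reflected about the radix point. For $b = 2$ this reproduces the values 0.5, 0.25, 0.75, 0.125, 0.625 listed above.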

The Hammersley sequence is the same as the Halton sequence, except the values for the first random variable are equal to 1/N, where N is the number of samples. Thus, if one wants to generate a sample set of 100 samples for 3 random variables, the first random variable has values 1/100, 2/100, 3/100, etc. and the second and third variables are generated according to a Halton sequence with bases 2 and 3, respectively. For more information about these sequences, see [Halton, 1960, Halton and Smith, 1964, and Kocis and Whiten, 1997].

The specification for quasi-Monte Carlo sampling (fsu_quasi_mc) is given below in Table 5.41. The user must specify whether the sequence is Halton (halton) or Hammersley (hammersley). The user must also specify the number of samples to generate for each variable (samples). Then, there are three optional lists the user may specify. The first list determines where in the sequence the user wants to start. For example, for the Halton sequence in base 2, if the user specifies sequence_start = 2, the sequence would not include 0.5 and 0.25, but instead would start at 0.75. The default sequence_start is a vector with 0 for each variable, specifying that each sequence start with the first term. The sequence_leap control is similar but controls the "leaping" of terms in the sequence. The default is 1 for each variable, meaning that each term in the sequence is returned. If the user specifies a sequence_leap of 2 for a variable, the points returned would be every other term from the QMC sequence. The advantage to using a leap value greater than one is mainly for high-dimensional sets of random deviates. In this case, setting a leap value to the next prime number larger than the largest prime base can help maintain uniformity when generating sample sets for high dimensions. For more information about the efficacy of leaped Halton sequences, see [Robinson and Atcitty, 1999]. The final specification for the QMC sequences is the prime base. It is recommended that the user not specify this and use the default values. For the Halton sequence, the default bases are primes in increasing order, starting with 2, 3, 5, etc. For the Hammersley sequence, the user specifies (s-1) primes if one is generating an s-dimensional set of random variables.

The fixed_sequence control is similar to fixed_seed for other sampling methods. If fixed_sequence is specified, the user will get the same sequence (meaning the same set of samples) for subsequent calls of the QMC sampling method (for example, this might be used in a surrogate based optimization method or a parameter study where one wants to fix the uncertain variables). The latinize command takes the QMC sequence and "latinizes" it, meaning that each original sample is moved so that it falls into one strata or bin in each dimension as in Latin Hypercube sampling. The default setting is NOT to latinize a QMC sample. However, one may be interested in doing this in situations where one wants better discrepancy of the 1-dimensional projections (the marginal distributions). The variance_based_decomp control is also available. This control enables the calculation of sensitivity indices which indicate how important the uncertainty in each input variable is in contributing to the output variance. More details on variance based decomposition are given in Nondeterministic sampling method.

Finally, quality_metrics calculates four quality metrics relating to the volumetric spacing of the samples. The four quality metrics measure different aspects of the uniformity of point samples in hypercubes. Desirable properties of such point samples are that the points be equally spaced, that they cover the region, and that they be isotropically distributed with no directional bias in the spacing. The four quality metrics we report are h, chi, tau, and d. h is the point distribution norm, which is a measure of uniformity of the point distribution. Chi is a regularity measure, and provides a measure of local uniformity of a set of points. Tau is the second moment trace measure, and d is the second moment determinant measure. All of these values are scaled so that smaller is better (the smaller the metric, the better the uniformity of the point distribution). A complete explanation of these measures can be found in [Gunzburger and Burkardt, 2004].

Table 5.41 Specification detail for FSU Quasi-Monte Carlo sequences
Description Keyword Associated Data Status Default
FSU Quasi-Monte Carlo fsu_quasi_mc none Required group N/A
Sequence type halton | hammersley none Required group N/A
Number of samples samples integer Optional (0) for standalone sampling, (minimum required) for surrogates
Sequence starting indices sequence_start integer list (one integer per variable) Optional Vector of zeroes
Sequence leaping indices sequence_leap integer list (one integer per variable) Optional Vector of ones
Prime bases for sequences prime_base integer list (one integer per variable) Optional Vector of the first s primes for s-dimensions in Halton, First (s-1) primes for Hammersley
Fixed sequence flag fixed_sequence none Optional sequence not fixed: sampling patterns are variable among multiple QMC runs
Latinization of samples latinize none Optional No latinization
Variance based decomposition variance_based_decomp none Optional No variance_based_decomp
Quality metrics quality_metrics none Optional No quality_metrics
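
To show the basic syntax, a latinized Halton study might be specified as in the sketch below; the sample count is an illustrative placeholder:

method,
	fsu_quasi_mc halton
	  samples = 100
	  latinize
	  quality_metrics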

The FSU CVT method (fsu_cvt) produces a set of sample points that are (approximately) a Centroidal Voronoi Tessellation. The primary feature of such a set of points is that they have good volumetric spacing; the points tend to arrange themselves in a pattern of cells that are roughly the same shape. To produce this set of points, an almost arbitrary set of initial points is chosen, and then an internal set of iterations is carried out. These iterations repeatedly replace the current set of sample points by an estimate of the centroids of the corresponding Voronoi subregions. [Du, Faber, and Gunzburger, 1999].

The user may generally ignore the details of this internal iteration. If control is desired, however, there are a few variables with which the user can influence the iteration. The user may specify max_iterations, the number of iterations carried out; num_trials, the number of secondary sample points generated to adjust the location of the primary sample points; and trial_type, which controls how these secondary sample points are generated. In general, the variable with the most influence on the quality of the final sample set is num_trials, which determines how well the Voronoi subregions are sampled. Generally, num_trials should be "large", certainly much bigger than the number of sample points being requested; a reasonable value might be 10,000, but values of 100,000 or 1 million are not unusual.

CVT has a seed specification similar to that in DDACE: there are optional specifications for the random seed to use in generating the sample set (seed), for fixing the seed (fixed_seed) among multiple sample sets (see Nondeterministic sampling method for discussion), and for the number of samples to perform (samples). The seed control is used to make sample sets repeatable. Finally, the user has the option to specify the method by which the trials are created to adjust the centroids. The trial_type can be one of three types: random, where points are generated randomly; halton, where points are generated according to the Halton sequence; and grid, where points are placed on a regular grid over the hyperspace.

Finally, latinization is available for CVT as with QMC. The latinize control takes the CVT sequence and "latinizes" it, meaning that each original sample is moved so that it falls into one strata or bin in each dimension as in Latin Hypercube sampling. The default setting is NOT to latinize a CVT sample. However, one may be interested in doing this in situations where one wants better discrepancy of the 1-dimensional projections (the marginal distributions). The variance_based_decomp control is also available. This control enables the calculation of sensitivity indices which indicate how important the uncertainty in each input variable is in contributing to the output variance. More details on variance based decomposition are given in Nondeterministic sampling method. The quality_metrics control is available for CVT as with QMC. This command turns on calculation of volumetric quality measures which measure the "goodness" of the uniformity of the point samples. More details on the quality measures are given under the description of the QMC methods.

The specification detail for the FSU CVT method is given in Table 5.42.

Table 5.42 Specification detail for FSU Centroidal Voronoi Tessellation sampling
Description Keyword Associated Data Status Default
FSU CVT sampling fsu_cvt none Required group N/A
Random seed seed integer Optional randomly generated seed
Fixed seed flag fixed_seed none Optional seed not fixed: sampling patterns are variable among multiple CVT runs
Number of samples samples integer Required (0) for standalone sampling, (minimum required) for surrogates
Number of trials num_trials integer Optional 10000
Trial type trial_type random | grid | halton Optional random
Latinization of samples latinize none Optional No latinization
Variance based decomposition variance_based_decomp none Optional No variance_based_decomp
Quality metrics quality_metrics none Optional No quality_metrics
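
A sketch of a CVT specification, with placeholder values for the seed, sample count, and trial count, might read:

method,
	fsu_cvt
	  seed = 12345
	  samples = 30
	  num_trials = 20000
	  trial_type random
	  quality_metrics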

PSUADE

The Problem Solving Environment for Uncertainty Analysis and Design Exploration (PSUADE) is a Lawrence Livermore National Laboratory tool for metamodeling, sensitivity analysis, uncertainty quantification, and optimization. Its features include non-intrusive and parallel function evaluations, sampling and analysis methods, an integrated design and analysis framework, global optimization, numerical integration, response surfaces (MARS and higher order regressions), graphical output with Pgplot or Matlab, and fault tolerance [C.H. Tong, 2005]. While PSUADE is only available internally at LLNL, DAKOTA includes a prototype interface to its MOAT sampling method.

The Morris One-At-A-Time (MOAT) method, originally proposed by Morris [M.D. Morris, 1991], is a screening method designed to explore a computational model in order to distinguish between input variables that have negligible, linear and additive, or nonlinear or interaction effects on the output. The computer experiments performed consist of individually randomized designs which vary one input factor at a time to create a sample of its elementary effects. The PSUADE implementation of MOAT is selected with the method keyword psuade_moat. The number of samples (samples) must be a positive integer multiple of (number of continuous design variables + 1) and will be automatically adjusted if misspecified. The number of partitions (partitions) applies to each variable being studied and must be odd (the number of MOAT levels per variable is partitions + 1). This will also be adjusted at runtime as necessary. For information on practical use of this method, see [Saltelli, et al., 2004]. The specification detail for the PSUADE MOAT method is given in Table 5.43.

Table 5.43 Specification detail for PSUADE methods
Description Keyword Associated Data Status Default
PSUADE MOAT method psuade_moat none Required group N/A
Random seed seed integer Optional randomly generated seed
Number of samples samples integer Optional 10*(num_cdv + 1)
Number of partitions partitions integer Optional 3
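
For example, assuming a hypothetical problem with six continuous design variables, a MOAT specification that satisfies the multiple-of-(n+1) sample rule (here 84 = 12 x 7) might be sketched as follows; the values are illustrative only:

method,
	psuade_moat
	  samples = 84
	  partitions = 3
	  seed = 5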

Parameter Study Methods

DAKOTA's parameter study methods compute response data sets at a selection of points in the parameter space. These points may be specified as a vector, a list, a set of centered vectors, or a multi-dimensional grid. Capability overviews and examples of the different types of parameter studies are provided in the Users Manual [Adams et al., 2010]. DAKOTA implements all of the parameter study methods within the ParamStudy class.

With the exception of output verbosity (a setting of silent will suppress some parameter study diagnostic output), DAKOTA's parameter study methods do not make use of the method independent controls. Therefore, the parameter study documentation which follows is limited to the method dependent controls for the vector, list, centered, and multidimensional parameter study methods.

Vector parameter study

DAKOTA's vector parameter study computes response data sets at selected intervals along a vector in parameter space. It is often used for single-coordinate parameter studies (to study the effect of a single variable on a response set), but it can be used more generally for multiple coordinate vector studies (to investigate the response variations along some n-dimensional vector such as an optimizer search direction). This study is selected using the vector_parameter_study specification followed by either a final_point or a step_vector specification.

The vector for the study can be defined in several ways (refer to dakota.input.summary). First, a final_point specification, when combined with the initial values from the variables specification (in Variables Commands, see initial_point and initial_state for design and state variables as well as inferred initial values for uncertain variables), uniquely defines an n-dimensional vector's direction and magnitude through its start and end points. The values included in the final_point specification are the actual variable values for discrete sets, not the underlying set index value. The intervals along this vector are then specified with a num_steps specification, for which the distance between the initial values and the final_point is broken into num_steps intervals of equal length. For continuous and discrete range variables, distance is measured in the actual values of the variables, but for discrete set variables (either integer or real sets for design, uncertain, or state types), distance is instead measured in index offsets. Since discrete sets may have nonuniform offsets in their enumerated set values but have uniform offsets in their index values, defining steps in terms of set indices allows for meaningful parameter study specifications for these variable types. This study performs function evaluations at both ends, making the total number of evaluations equal to num_steps+1. The study has stringent requirements on performing appropriate steps with any discrete range and discrete set variables. A num_steps specification must result in discrete range and set index steps that are integral: no remainder is currently permitted in the integer step calculation and no rounding to integer steps will occur. The final_point specification detail is given in Table 5.44.

Table 5.44 final_point specification detail for the vector parameter study
Description Keyword Associated Data Status Default
Vector parameter study vector_parameter_study none Required group N/A
Termination point of vector final_point list of reals (actual values; no set indices) Required group N/A
Number of steps along vector num_steps integer Required N/A
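
For instance, assuming three continuous variables with initial values set in the variables specification, the following sketch steps from those initial values to final_point in four equal increments, for num_steps+1 = 5 evaluations in total (the values are placeholders):

method,
	vector_parameter_study
	  final_point = 2.0 2.0 2.0
	  num_steps = 4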

The other technique for defining a vector in the study is the step_vector specification. This parameter study begins at the initial values and adds the increments specified in step_vector to obtain new simulation points. For discrete set types (design, uncertain, or state; real or integer), the steps are set index offsets, not steps between the actual set values. This increment process is performed num_steps times, and since the initial values are included, the total number of simulations is again equal to num_steps+1. The step_vector specification detail is given in Table 5.45.

Table 5.45 step_vector specification detail for the vector parameter study
Description Keyword Associated Data Status Default
Vector parameter study vector_parameter_study none Required group N/A
Step vector step_vector list of reals (index offset components are cast to integers) Required group N/A
Number of steps along vector num_steps integer Required N/A

List parameter study

DAKOTA's list parameter study allows for evaluations at user selected points of interest which need not follow any particular structure. This study is selected using the list_parameter_study method specification followed by a list_of_points specification.

The number of real values in the list_of_points specification must be a multiple of the total number of variables (including continuous and discrete types) contained in the variables specification. This parameter study simply performs simulations for the first parameter set (the first n entries in the list), followed by the next parameter set (the next n entries), and so on, until the list of points has been exhausted. Since the initial values from the variables specification will not be used, they need not be specified. For discrete set types, the actual values should be specified, not the set indices, although the values will be validated for membership within the set value specifications. The list parameter study specification detail is given in Table 5.46.

Table 5.46 Specification detail for the list parameter study
Description Keyword Associated Data Status Default
List parameter study list_parameter_study none Required group N/A
List of points to evaluate list_of_points list of reals (actual values; no set indices) Required N/A
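
As a sketch, assuming a problem with two continuous variables, the following specification evaluates three user-selected points; the six values are read as three parameter sets of two entries each and are arbitrary placeholders:

method,
	list_parameter_study
	  list_of_points = 1.0 0.5   2.0 0.5   2.0 1.5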

Centered parameter study

DAKOTA's centered parameter study computes response data sets along multiple coordinate-based vectors, one per parameter, centered about the initial values from the variables specification. This is useful for investigation of function contours with respect to each parameter individually in the vicinity of a specific point (e.g., post-optimality analysis for verification of a minimum), thereby avoiding the cost associated with a multidimensional grid. It is selected using the centered_parameter_study method specification followed by step_vector and steps_per_variable specifications. The step_vector specification provides the size of the increments for each variable (employed sequentially, not all at once as for Vector parameter study) in either actual values (continuous and discrete range) or index offsets (discrete set). The steps_per_variable specification provides the number of increments per variable (again, employed sequentially) in each of the plus and minus directions. The centered parameter study specification detail is given in Table 5.47.

Table 5.47 Specification detail for the centered parameter study
Description Keyword Associated Data Status Default
Centered parameter study centered_parameter_study none Required group N/A
Step vector step_vector list of reals (index offset components are cast to integers) Required group N/A
Number of steps per variable steps_per_variable list of integers Required N/A
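
For example, assuming two continuous variables, the sketch below steps each variable by plus and minus one and two increments of 0.1 about its initial value, one coordinate at a time; the step sizes and counts are illustrative only:

method,
	centered_parameter_study
	  step_vector = 0.1 0.1
	  steps_per_variable = 2 2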

Multidimensional parameter study

DAKOTA's multidimensional parameter study computes response data sets for an n-dimensional grid of points. Each continuous and discrete range variable is partitioned into equally spaced intervals between its upper and lower bounds, each discrete set variable is partitioned into equally spaced index intervals, and each combination of the values defined by the boundaries of these partitions is evaluated.

This study is selected using the multidim_parameter_study method specification followed by a partitions specification, where the partitions list specifies the number of partitions for each variable. The number of entries in the partitions list must be equal to the total number of variables contained in the variables specification. As for the vector and centered studies, remainders within the integer division of the step calculations are not permitted for discrete range or set types and therefore no integer rounding occurs, so the partitions specification must be carefully selected in the presence of these types. Since the initial values from the variables specification will not be used, they need not be specified. The multidimensional parameter study specification detail is given in Table 5.48.

Table 5.48 Specification detail for the multidimensional parameter study
Description Keyword Associated Data Status Default
Multidimensional parameter study multidim_parameter_study none Required group N/A
Partitions per variable partitions list of integers Required N/A
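
As an illustration, assuming two continuous variables, the following sketch partitions the first variable into 2 intervals (3 values) and the second into 4 intervals (5 values), yielding a 3 x 5 = 15 point grid:

method,
	multidim_parameter_study
	  partitions = 2 4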


