Responses Commands

Responses Description

The responses specification in a DAKOTA input file specifies the data set that can be recovered from the interface after the completion of a "function evaluation." Here, the term function evaluation is used somewhat loosely to denote a data request from an iterator that is mapped through an interface in a single pass. Strictly speaking, this data request may actually involve multiple response functions and their derivatives, but the term function evaluation is widely used for this purpose. The data set is made up of a set of functions, their first derivative vectors (gradients), and their second derivative matrices (Hessians). This abstraction provides a generic data container (the Response class) whose contents are interpreted differently depending upon the type of iteration being performed. In the case of optimization, the set of functions consists of one or more objective functions, nonlinear inequality constraints, and nonlinear equality constraints. Linear constraints are not part of a response set since their coefficients can be communicated to an optimizer at start up, after which the constraint values can be computed internally for all function evaluations (see Method Independent Controls). In the case of least squares iterators, the functions consist of individual residual terms (as opposed to a sum of the squares objective function), or of model responses to be differenced with data from an observed data file, as well as nonlinear inequality and equality constraints. In the case of nondeterministic iterators, the function set is made up of generic response functions for which the effect of parameter uncertainty is to be quantified. Lastly, parameter study and design of experiments iterators may be used with any of the response data set types. Within the C++ implementation, the same data structures are reused for each of these cases; only the interpretation of the data varies from iterator branch to iterator branch.

Gradient availability may be described by no_gradients, numerical_gradients, analytic_gradients, or mixed_gradients. The no_gradients selection means that gradient information is not needed in the study. The numerical_gradients selection means that gradient information is needed and will be computed with finite differences using either the native or one of the vendor finite differencing routines. The analytic_gradients selection means that gradient information is available directly from the simulation (finite differencing is not required). And the mixed_gradients selection means that some gradient information is available directly from the simulation whereas the rest will have to be estimated with finite differences.

Hessian availability may be described by no_hessians, numerical_hessians, quasi_hessians, analytic_hessians, or mixed_hessians. As for the gradient specification, the no_hessians selection indicates that Hessian information is not needed/available in the study, and the analytic_hessians selection indicates that Hessian information is available directly from the simulation. The numerical_hessians selection indicates that Hessian information is needed and will be estimated with finite differences using either first-order differences of gradients (for analytic gradients) or second-order differences of function values (for non-analytic gradients). The quasi_hessians specification means that Hessian information is needed and will be accumulated over time using secant updates based on the existing gradient evaluations. Finally, the mixed_hessians selection allows for a mixture of analytic, numerical, and quasi Hessian response data.

The responses specification provides a description of the total data set that is available for use by the iterator during the course of its iteration. This should be distinguished from the data subset described in an active set vector (see DAKOTA File Data Formats in the Users Manual [Adams et al., 2010]) which describes the particular subset of the response data needed for an individual function evaluation. In other words, the responses specification is a broad description of the data to be used during a study whereas the active set vector describes the particular subset of the available data that is currently needed.

Several examples follow. The first example shows an optimization data set containing an objective function and two nonlinear inequality constraints. These three functions have analytic gradient availability and no Hessian availability.

responses,
	num_objective_functions = 1
	num_nonlinear_inequality_constraints = 2
	analytic_gradients
	no_hessians

The next example shows a typical specification for a least squares data set. The six residual functions will have numerical gradients computed using the dakota finite differencing routine with central differences of 0.1% (plus/minus delta value = .001*value).

responses,
	num_least_squares_terms = 6
	numerical_gradients
	  method_source dakota
	  interval_type central
	  fd_gradient_step_size = .001
	no_hessians

The last example shows a specification that could be used with a nondeterministic sampling iterator. The three response functions have no gradient or Hessian availability; therefore, only function values will be used by the iterator.

responses,
	num_response_functions = 3
	no_gradients
	no_hessians

Parameter study and design of experiments iterators are not restricted in terms of the response data sets which may be catalogued; they may be used with any of the function specification examples shown above.

Responses Specification

The responses specification has the following structure:

responses,
	<set identifier>
	<response descriptors>
	<function specification>
	<gradient specification>
	<Hessian specification>

Referring to dakota.input.summary, it is evident from the enclosing brackets that the set identifier and response descriptors are optional. However, the function, gradient, and Hessian specifications are all required, and each contains several possible specifications separated by logical OR's. The function specification must be one of three types: 1) objective and constraint functions, 2) least squares terms and constraint functions, or 3) generic response functions. The gradient specification must be one of four types: 1) no gradients, 2) numerical gradients, 3) analytic gradients, or 4) mixed gradients. And the Hessian specification must be one of five types: 1) no Hessians, 2) numerical Hessians, 3) quasi Hessians, 4) analytic Hessians, or 5) mixed Hessians. The following sections describe each of these specification components in additional detail.

Responses Set Identifier

The optional set identifier specification uses the keyword id_responses to input a string for use in identifying a particular responses specification. A model can then identify the use of this response set by specifying the same string in its responses_pointer specification (see Model Independent Controls). For example, a model whose specification contains responses_pointer = 'R1' will use a responses set with id_responses = 'R1'.
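For example, a minimal sketch of this pairing (the remaining responses keywords are abbreviated from the earlier examples):

responses,
	id_responses = 'R1'
	num_objective_functions = 1
	analytic_gradients
	no_hessians

The corresponding model specification would then contain responses_pointer = 'R1'.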

If the id_responses specification is omitted, a particular responses specification will be used by a model only if that model omits specifying a responses_pointer and if the responses set was the last set parsed (or is the only set parsed). In common practice, if only one responses set exists, then id_responses can be safely omitted from the responses specification and responses_pointer can be omitted from the model specification(s), since there is no potential for ambiguity in this case. Table 9.1 summarizes the set identifier input.

Table 9.1 Specification detail for set identifier
Description | Keyword | Associated Data | Status | Default
Responses set identifier | id_responses | string | Optional | use of last responses parsed

Response Labels

The optional response labels specification uses the keyword descriptors (see Table 9.2) to input a list of strings which will be replicated through the DAKOTA output to help identify the numerical values for particular response functions. The default descriptor strings use a root string plus a numeric identifier. This root string is "obj_fn" for objective functions, "least_sq_term" for least squares terms, "response_fn" for generic response functions, "nln_ineq_con" for nonlinear inequality constraints, and "nln_eq_con" for nonlinear equality constraints. Table 9.2 summarizes the response descriptors input.

Table 9.2 Specification detail for response labels
Description | Keyword | Associated Data | Status | Default
Response labels | descriptors | list of strings | Optional | root strings plus numeric identifiers
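For example, a hypothetical labeling of the optimization data set shown earlier (the descriptor strings are illustrative only):

responses,
	num_objective_functions = 1
	num_nonlinear_inequality_constraints = 2
	descriptors = 'weight' 'stress_margin_1' 'stress_margin_2'
	analytic_gradients
	no_hessians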

Function Specification

The function specification must be one of three types: 1) a group containing objective and constraint functions, 2) a group containing least squares terms and constraint functions, or 3) a generic response functions specification. These function sets correspond to optimization, least squares, and uncertainty quantification iterators, respectively. Parameter study and design of experiments iterators may be used with any of the three function specifications.

Objective and constraint functions (optimization data set)

An optimization data set is specified using num_objective_functions and optionally objective_function_scale_types, objective_function_scales, multi_objective_weights, num_nonlinear_inequality_constraints, nonlinear_inequality_lower_bounds, nonlinear_inequality_upper_bounds, nonlinear_inequality_scale_types, nonlinear_inequality_scales, num_nonlinear_equality_constraints, nonlinear_equality_targets, nonlinear_equality_scale_types, and nonlinear_equality_scales. The num_objective_functions, num_nonlinear_inequality_constraints, and num_nonlinear_equality_constraints inputs specify the number of objective functions, nonlinear inequality constraints, and nonlinear equality constraints, respectively. The number of objective functions must be 1 or greater, and the number of inequality and equality constraints must be 0 or greater. The objective_function_scale_types specification includes strings specifying the scaling type for each objective function value in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in objective_function_scale_types may be selected from 'none', 'value', or 'log', to select no, characteristic value, or logarithmic scaling, respectively. Automatic scaling is not available for objective functions. If a single string is specified it will apply to each objective function. Each entry in objective_function_scales may be a user-specified nonzero characteristic value to be used in scaling each objective function. These values are ignored for scaling type 'none', required for 'value', and optional for 'log'. If a single real value is specified it will apply to each function. If the number of objective functions is greater than 1, then a multi_objective_weights specification provides a simple weighted-sum approach to combining multiple objectives:

\[f = \sum_{i=1}^{n} w_{i}f_{i}\]

If this is not specified, then each objective function is given equal weighting:

\[f = \sum_{i=1}^{n} \frac{f_i}{n}\]

If scaling is specified, it is applied before multi-objective weighted sums are formed.

The nonlinear_inequality_lower_bounds and nonlinear_inequality_upper_bounds specifications provide the lower and upper bounds for 2-sided nonlinear inequalities of the form

\[g_l \leq g(x) \leq g_u\]

The defaults for the inequality constraint bounds are selected so that one-sided inequalities of the form

\[g(x) \leq 0.0\]

result when there are no user constraint bounds specifications (this provides backwards compatibility with previous DAKOTA versions). In a user bounds specification, any upper bound values greater than +bigRealBoundSize (1.e+30, as defined in Minimizer) are treated as +infinity and any lower bound values less than -bigRealBoundSize are treated as -infinity. This feature is commonly used to drop one of the bounds in order to specify a 1-sided constraint (just as the default lower bounds drop out since -DBL_MAX < -bigRealBoundSize). The same approach is used for nonexistent linear inequality bounds as described in Method Independent Controls and for nonexistent design variable bounds as described in Design Variables.

The nonlinear_equality_targets specification provides the targets for nonlinear equalities of the form

\[g(x) = g_t\]

and the defaults for the equality targets enforce a value of 0. for each constraint

\[g(x) = 0.0\]

The nonlinear_inequality_scale_types and nonlinear_equality_scale_types specifications include strings specifying the scaling type for each nonlinear inequality or equality constraint, respectively, in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in these scale type vectors may be selected from 'none', 'value', 'auto', or 'log', to select no, characteristic value, automatic, or logarithmic scaling, respectively. If a single string is specified it will apply to all components of the relevant nonlinear constraint vector. Each entry in nonlinear_inequality_scales and nonlinear_equality_scales may be a user-specified nonzero characteristic value to be used in scaling each constraint component. These values are ignored for scaling type 'none', required for 'value', and optional for 'auto' and 'log'. If a single real value is specified it will apply to each constraint.
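As an illustrative sketch (all bound, target, and scale values are hypothetical), the following specification defines one characteristic-value-scaled objective function, two 2-sided nonlinear inequality constraints, and one nonlinear equality constraint:

responses,
	num_objective_functions = 1
	  objective_function_scale_types = 'value'
	  objective_function_scales = 100.
	num_nonlinear_inequality_constraints = 2
	  nonlinear_inequality_lower_bounds = -1. -1.
	  nonlinear_inequality_upper_bounds =  1.  1.
	num_nonlinear_equality_constraints = 1
	  nonlinear_equality_targets = 10.
	analytic_gradients
	no_hessians

Raising an upper bound above 1.e+30 would drop that bound, leaving a 1-sided constraint of the form $g_l \leq g(x)$.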

Any linear constraints present in an application need only be input to an optimizer at start up and do not need to be part of the data returned on every function evaluation (see the linear constraints description in Method Independent Controls). Table 9.3 summarizes the optimization data set specification.

Table 9.3 Specification detail for optimization data sets
Description | Keyword | Associated Data | Status | Default
Number of objective functions | num_objective_functions | integer | Required group | N/A
Objective function scaling types | objective_function_scale_types | list of strings | Optional | vector values = 'none'
Objective function scales | objective_function_scales | list of reals | Optional | vector values = 1. (no scaling)
Multiobjective weightings | multi_objective_weights | list of reals | Optional | equal weightings
Number of nonlinear inequality constraints | num_nonlinear_inequality_constraints | integer | Optional | 0
Nonlinear inequality constraint lower bounds | nonlinear_inequality_lower_bounds | list of reals | Optional | vector values = -DBL_MAX
Nonlinear inequality constraint upper bounds | nonlinear_inequality_upper_bounds | list of reals | Optional | vector values = 0.
Nonlinear inequality constraint scaling types | nonlinear_inequality_scale_types | list of strings | Optional | vector values = 'none'
Nonlinear inequality constraint scales | nonlinear_inequality_scales | list of reals | Optional | vector values = 1. (no scaling)
Number of nonlinear equality constraints | num_nonlinear_equality_constraints | integer | Optional | 0
Nonlinear equality constraint targets | nonlinear_equality_targets | list of reals | Optional | vector values = 0.
Nonlinear equality constraint scaling types | nonlinear_equality_scale_types | list of strings | Optional | vector values = 'none'
Nonlinear equality constraint scales | nonlinear_equality_scales | list of reals | Optional | vector values = 1. (no scaling)

Least squares terms and constraint functions (least squares data set)

A least squares data set is specified using num_least_squares_terms and optionally least_squares_data_file, least_squares_term_scale_types, least_squares_term_scales, least_squares_weights, num_nonlinear_inequality_constraints, nonlinear_inequality_lower_bounds, nonlinear_inequality_upper_bounds, nonlinear_inequality_scale_types, nonlinear_inequality_scales, num_nonlinear_equality_constraints, nonlinear_equality_targets, nonlinear_equality_scale_types, and nonlinear_equality_scales. Each of the least squares terms is a residual function to be driven toward zero, and the nonlinear inequality and equality constraint specifications have identical meanings to those described in Objective and constraint functions (optimization data set). These types of problems are commonly encountered in parameter estimation, system identification, and model calibration. Least squares problems are most efficiently solved using special-purpose least squares solvers such as Gauss-Newton or Levenberg-Marquardt; however, they may also be solved using general-purpose optimization algorithms.

It is important to realize that, while DAKOTA can solve these problems with either least squares or optimization algorithms, the response data sets to be returned from the simulator are different. Least squares involves a set of residual functions whereas optimization involves a single objective function (sum of the squares of the residuals), i.e.,

\[f = \sum_{i=1}^{n} (R_i)^2\]

where f is the objective function and the set of $R_i$ are the residual functions. Therefore, function values and derivative data in the least squares case involve the values and derivatives of the residual functions, whereas the optimization case involves values and derivatives of the sum of squares objective function. Switching between the two approaches will likely require different simulation interfaces capable of returning the different granularity of response data required. The specification least_squares_data_file may be used to specify a text file containing num_least_squares_terms observed data values (one per line) to be used in computing the residuals

\[R_i = y^M_i - y^O_i \]

where M denotes model and O, observation. In this case the simulator should return the actual model response, as DAKOTA will compute the residual internally using the supplied data.

The least_squares_term_scale_types specification includes strings specifying the scaling type for each least squares term in methods that support scaling, when scaling is enabled (see Method Independent Controls for details). Each entry in least_squares_term_scale_types may be selected from 'none', 'value', or 'log', to select no, characteristic value, or logarithmic scaling, respectively. Automatic scaling is not available for least squares terms. If a single string is specified it will apply to each least squares term. Each entry in least_squares_term_scales may be a user-specified nonzero characteristic value to be used in scaling each term. These values are ignored for scaling type 'none', required for 'value', and optional for 'log'. If a single real value is specified it will apply to each term. The least_squares_weights specification provides a means to multiplicatively weight the vector of least squares residuals with a vector of weights. If scaling is specified, it is applied before term weighting.
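For example, a sketch combining an observed data file with residual weighting (the file name and weight values are hypothetical):

responses,
	num_least_squares_terms = 6
	  least_squares_data_file = 'exp_data.dat'
	  least_squares_weights = 1. 1. 1. 4. 4. 4.
	numerical_gradients
	no_hessians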

Table 9.4 summarizes the least squares data set specification.

Table 9.4 Specification detail for nonlinear least squares data sets
Description | Keyword | Associated Data | Status | Default
Number of least squares terms | num_least_squares_terms | integer | Required | N/A
Least squares data source file | least_squares_data_file | string | Optional | none
Least squares term scaling types | least_squares_term_scale_types | list of strings | Optional | vector values = 'none'
Least squares term scales | least_squares_term_scales | list of reals | Optional | vector values = 1. (no scaling)
Least squares term weightings | least_squares_weights | list of reals | Optional | equal weightings
Number of nonlinear inequality constraints | num_nonlinear_inequality_constraints | integer | Optional | 0
Nonlinear inequality constraint lower bounds | nonlinear_inequality_lower_bounds | list of reals | Optional | vector values = -DBL_MAX
Nonlinear inequality constraint upper bounds | nonlinear_inequality_upper_bounds | list of reals | Optional | vector values = 0.
Nonlinear inequality constraint scaling types | nonlinear_inequality_scale_types | list of strings | Optional | vector values = 'none'
Nonlinear inequality constraint scales | nonlinear_inequality_scales | list of reals | Optional | vector values = 1. (no scaling)
Number of nonlinear equality constraints | num_nonlinear_equality_constraints | integer | Optional | 0
Nonlinear equality constraint targets | nonlinear_equality_targets | list of reals | Optional | vector values = 0.
Nonlinear equality constraint scaling types | nonlinear_equality_scale_types | list of strings | Optional | vector values = 'none'
Nonlinear equality constraint scales | nonlinear_equality_scales | list of reals | Optional | vector values = 1. (no scaling)

Response functions (generic data set)

A generic response data set is specified using num_response_functions. Each of these functions is simply a response quantity of interest with no special interpretation taken by the method in use. This type of data set is used by uncertainty quantification methods, in which the effect of parameter uncertainty on response functions is quantified, and can also be used in parameter study and design of experiments methods (although these methods are not restricted to this data set), in which the effect of parameter variations on response functions is evaluated. Whereas objective, constraint, and residual functions have special meanings for optimization and least squares algorithms, the generic response function data set need not have a specific interpretation and the user is free to define whatever functional form is convenient. Table 9.5 summarizes the generic response function data set specification.

Table 9.5 Specification detail for generic response function data sets
Description | Keyword | Associated Data | Status | Default
Number of response functions | num_response_functions | integer | Required | N/A

Gradient Specification

The gradient specification must be one of four types: 1) no gradients, 2) numerical gradients, 3) analytic gradients, or 4) mixed gradients.

No gradients

The no_gradients specification means that gradient information is not needed in the study. Therefore, it will neither be retrieved from the simulation nor computed with finite differences. The no_gradients keyword is a complete specification for this case.

Numerical gradients

The numerical_gradients specification means that gradient information is needed and will be computed with finite differences using either the native or one of the vendor finite differencing routines.

The method_source setting specifies the source of the finite differencing routine that will be used to compute the numerical gradients: dakota denotes DAKOTA's internal finite differencing algorithm and vendor denotes the finite differencing algorithm supplied by the iterator package in use (DOT, CONMIN, NPSOL, NL2SOL, NLSSOL, and OPT++ each have their own internal finite differencing routines). The dakota routine is the default since it can execute in parallel and exploit the concurrency in finite difference evaluations (see Exploiting Parallelism in the Users Manual [Adams et al., 2010]). However, the vendor setting can be desirable in some cases since certain libraries will modify their algorithm when the finite differencing is performed internally. Since selecting the dakota routine hides the use of finite differencing from the optimizers (the optimizers are configured to accept user-supplied gradients, which some algorithms assume to be of analytic accuracy), the vendor setting may trigger the use of an algorithm better suited to the higher expense and/or lower accuracy of finite-difference gradients. For example, NPSOL uses gradients in its line search when in user-supplied gradient mode (since it assumes they are inexpensive), but uses a value-based line search procedure when internally finite differencing. The use of a value-based line search will often reduce total expense in serial operations. However, in parallel operations, the use of gradients in the NPSOL line search (user-supplied gradient mode) provides excellent load balancing without needing to resort to speculative optimization approaches. In summary, the dakota routine is preferred for parallel optimization, and the vendor routine may be preferred for serial optimization in special cases.

The interval_type setting is used to select between forward and central differences in the numerical gradient calculations. The dakota, DOT vendor, and OPT++ vendor routines have both forward and central differences available, the CONMIN and NL2SOL vendor routines support forward differences only, and the NPSOL and NLSSOL vendor routines start with forward differences and automatically switch to central differences as the iteration progresses (the user has no control over this). The following forward difference expression

\[ \nabla f ({\bf x}) \cong \frac{f ({\bf x} + h {\bf e}_i) - f ({\bf x})}{h} \]

and the following central difference expression

\[ \nabla f ({\bf x}) \cong \frac{f ({\bf x} + h {\bf e}_i) - f ({\bf x} - h {\bf e}_i)}{2h} \]

are used to estimate the $i^{th}$ component of the gradient vector.

Lastly, fd_gradient_step_size specifies the relative finite difference step size to be used in the computations. Either a single value may be entered for use with all parameters, or a list of step sizes may be entered, one for each parameter. The latter option of a list of step sizes is only valid for use with the DAKOTA finite differencing routine. For DAKOTA, DOT, CONMIN, and OPT++, the differencing intervals are computed by multiplying the fd_gradient_step_size with the current parameter value. In this case, a minimum absolute differencing interval is needed when the current parameter value is close to zero. This prevents finite difference intervals for the parameter which are too small to distinguish differences in the response quantities being computed. DAKOTA, DOT, CONMIN, and OPT++ all use .01*fd_gradient_step_size as their minimum absolute differencing interval. With a fd_gradient_step_size = .001, for example, DAKOTA, DOT, CONMIN, and OPT++ will use intervals of .001*current value with a minimum interval of 1.e-5. NPSOL and NLSSOL use a different formula for their finite difference intervals: fd_gradient_step_size*(1+|current parameter value|). This definition has the advantage of eliminating the need for a minimum absolute differencing interval since the interval no longer goes to zero as the current parameter value goes to zero.
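For example, a sketch supplying a different relative step size for each of three parameters (the values are hypothetical; a list of step sizes is valid only with the dakota method source):

responses,
	num_objective_functions = 1
	numerical_gradients
	  method_source dakota
	  interval_type forward
	  fd_gradient_step_size = .001 .01 .0001
	no_hessians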

When DAKOTA computes gradients or Hessians by finite differences and the variables in question have bounds, by default DAKOTA 5.0 chooses finite-differencing steps that keep the variables within their specified bounds. Older versions of DAKOTA generally ignored bounds when computing finite differences. To restore the older behavior, one can add keyword ignore_bounds to the response specification when method_source dakota (or just dakota) is also specified. In forward difference or backward difference computations, honoring bounds is straightforward. To honor bounds when approximating $\partial f / \partial x_i$, i.e., component $i$ of the gradient of $f$, by central differences, DAKOTA chooses two steps $h_1$ and $h_2$ with $h_1 \ne h_2$, such that $x + h_1 e_i$ and $x + h_2 e_i$ both satisfy the bounds, and then computes

\[ \frac{\partial f}{\partial x_i} \cong \frac{h_2^2(f_1 - f_0) - h_1^2(f_2 - f_0)}{h_1 h_2 (h_2 - h_1)} , \]

with $f_0 = f(x)$, $f_1 = f(x + h_1 e_i)$, and $f_2 = f(x + h_2 e_i)$.
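When neither bound is active, choosing $h_1 = h$ and $h_2 = -h$ reduces this expression to the standard central difference given previously:

\[ \frac{\partial f}{\partial x_i} \cong \frac{h^2 (f_1 - f_0) - h^2 (f_2 - f_0)}{2h^3} = \frac{f({\bf x} + h {\bf e}_i) - f({\bf x} - h {\bf e}_i)}{2h} \]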

Table 9.6 summarizes the numerical gradient specification.

Table 9.6 Specification detail for numerical gradients
Description | Keyword | Associated Data | Status | Default
Numerical gradients | numerical_gradients | none | Required group | N/A
Method source | method_source | dakota|vendor | Optional group | dakota
Interval type | interval_type | forward|central | Optional group | forward
Finite difference step size | fd_gradient_step_size | list of reals | Optional | 0.001
Ignore variable bounds | ignore_bounds | none | Optional | bounds respected

Analytic gradients

The analytic_gradients specification means that gradient information is available directly from the simulation (finite differencing is not required). The simulation must return the gradient data in the DAKOTA format (enclosed in single brackets; see DAKOTA File Data Formats in the Users Manual [Adams et al., 2010]) for the case of file transfer of data. The analytic_gradients keyword is a complete specification for this case.

Mixed gradients

The mixed_gradients specification means that some gradient information is available directly from the simulation (analytic) whereas the rest will have to be finite differenced (numerical). This specification allows the user to make use of as much analytic gradient information as is available and then finite difference for the rest. For example, the objective function may be a simple analytic function of the design variables (e.g., weight) whereas the constraints are nonlinear implicit functions of complex analyses (e.g., maximum stress). The id_analytic_gradients list specifies by number the functions which have analytic gradients, and the id_numerical_gradients list specifies by number the functions which must use numerical gradients. Each function identifier, from 1 through the total number of functions, must appear once and only once within the union of the id_analytic_gradients and id_numerical_gradients lists. The method_source, interval_type, and fd_gradient_step_size specifications are as described previously in Numerical gradients and pertain to those functions listed by the id_numerical_gradients list. Table 9.7 summarizes the mixed gradient specification.

Table 9.7 Specification detail for mixed gradients
Description | Keyword | Associated Data | Status | Default
Mixed gradients | mixed_gradients | none | Required group | N/A
Analytic derivatives function list | id_analytic_gradients | list of integers | Required | N/A
Numerical derivatives function list | id_numerical_gradients | list of integers | Required | N/A
Method source | method_source | dakota|vendor | Optional group | dakota
Interval type | interval_type | forward|central | Optional group | forward
Finite difference step size | fd_gradient_step_size | list of reals | Optional | 0.001
Ignore variable bounds | ignore_bounds | none | Optional | bounds respected
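For example, a sketch of the case described above, with an analytic gradient for the objective function (function 1) and numerical gradients for the two constraints (functions 2 and 3):

responses,
	num_objective_functions = 1
	num_nonlinear_inequality_constraints = 2
	mixed_gradients
	  id_analytic_gradients = 1
	  id_numerical_gradients = 2 3
	  method_source dakota
	  interval_type central
	  fd_gradient_step_size = .001
	no_hessians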

Hessian Specification

Hessian availability must be specified with either no_hessians, numerical_hessians, quasi_hessians, analytic_hessians, or mixed_hessians.

No Hessians

The no_hessians specification means that the method does not require DAKOTA to manage the computation of any Hessian information. Therefore, it will neither be retrieved from the simulation nor computed by DAKOTA. The no_hessians keyword is a complete specification for this case. Note that, in some cases, Hessian information may still be being approximated internal to an algorithm (e.g., within a quasi-Newton optimizer such as optpp_q_newton); however, DAKOTA has no direct involvement in this process and the responses specification need not include it.

Numerical Hessians

The numerical_hessians specification means that Hessian information is needed and will be computed with finite differences using either first-order gradient differencing (for the cases of analytic_gradients or for the functions identified by id_analytic_gradients in the case of mixed_gradients) or first- or second-order function value differencing (all other gradient specifications). In the former case, the following expression

\[ \nabla^2 f ({\bf x})_i \cong \frac{\nabla f ({\bf x} + h {\bf e}_i) - \nabla f ({\bf x})}{h} \]

estimates the $i^{th}$ Hessian column, and in the latter case, the following expressions

\[ \nabla^2 f ({\bf x})_{i,j} \cong \frac{f({\bf x} + h_i {\bf e}_i + h_j {\bf e}_j) - f({\bf x} + h_i {\bf e}_i) - f({\bf x} + h_j {\bf e}_j) + f({\bf x})}{h_i h_j} \]

and

\[ \nabla^2 f ({\bf x})_{i,j} \cong \frac{f({\bf x} + h {\bf e}_i + h {\bf e}_j) - f({\bf x} + h {\bf e}_i - h {\bf e}_j) - f({\bf x} - h {\bf e}_i + h {\bf e}_j) + f({\bf x} - h {\bf e}_i - h {\bf e}_j)}{4h^2} \]

provide first- and second-order estimates of the $ij^{th}$ Hessian term. Prior to DAKOTA 5.0, DAKOTA always used second-order estimates. Starting in DAKOTA 5.0, the default is to use first-order estimates (which honor bounds on the variables and require only about a quarter as many function evaluations as do the second-order estimates), but specifying central after numerical_hessians causes DAKOTA to use the old second-order estimates, which do not honor bounds. In optimization algorithms that use Hessians, there is little reason to use second-order differences in computing Hessian approximations.

The fd_hessian_step_size specifies the relative finite difference step size to be used in these differences. Either a single value may be entered for use with all parameters, or a list of step sizes may be entered, one for each parameter. The differencing intervals are computed by multiplying the fd_hessian_step_size with the current parameter value. A minimum absolute differencing interval of .01*fd_hessian_step_size is used when the current parameter value is close to zero. Table 9.8 summarizes the numerical Hessian specification.

Table 9.8 Specification detail for numerical Hessians
Description | Keyword | Associated Data | Status | Default
Numerical Hessians | numerical_hessians | none | Required group | N/A
Finite difference step size | fd_hessian_step_size | list of reals | Optional | 0.001 (1st-order), 0.002 (2nd-order)
Difference order | forward|central | none | Optional | forward
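For example, a sketch combining analytic gradients with first-order (gradient-based) numerical Hessians (the step size value is illustrative):

responses,
	num_objective_functions = 1
	analytic_gradients
	numerical_hessians
	  fd_hessian_step_size = .001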

Quasi Hessians

The quasi_hessians specification means that Hessian information is needed and will be approximated using secant updates (sometimes called "quasi-Newton updates", though any algorithm that approximates Newton's method is a quasi-Newton method). Compared to finite difference numerical Hessians, secant approximations do not expend additional function evaluations in estimating all of the second-order information for every point of interest. Rather, they accumulate approximate curvature information over time using the existing gradient evaluations. The supported secant approximations include the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update (specified with the keyword bfgs)

\[ B_{k+1} = B_{k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{y_k y_k^T}{y_k^T s_k} \]

and the Symmetric Rank 1 (SR1) update (specified with the keyword sr1)

\[ B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{(y_k - B_k s_k)^T s_k} \]

where $B_k$ is the $k^{th}$ approximation to the Hessian, $s_k = x_{k+1} - x_k$ is the step, and $y_k = \nabla f_{k+1} - \nabla f_k$ is the corresponding change in the gradients. In both cases, an initial scaling of $\frac{y_k^T y_k}{y_k^T s_k} I$ is used for $B_0$ prior to the first update. In addition, both cases employ basic numerical safeguarding to protect against numerically small denominators within the updates. This safeguarding skips the update if $|y_k^T s_k| < 10^{-6} s_k^T B_k s_k$ in the BFGS case or if $|(y_k - B_k s_k)^T s_k| < 10^{-6} ||s_k||_2 ||y_k - B_k s_k||_2$ in the SR1 case. In the BFGS case, additional safeguarding can be added using the damped option, which utilizes an alternative damped BFGS update when the curvature condition $y_k^T s_k > 0$ is nearly violated. Table 9.9 summarizes the quasi Hessian specification.

Table 9.9 Specification detail for quasi Hessians
Description | Keyword | Associated Data | Status | Default
Quasi Hessians | quasi_hessians | bfgs|sr1 | Required group | N/A
Numerical safeguarding of BFGS update | damped | none | Optional | undamped BFGS
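For example, a sketch requesting damped BFGS updates (keyword layout per Table 9.9):

responses,
	num_objective_functions = 1
	analytic_gradients
	quasi_hessians bfgs
	  damped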

Analytic Hessians

The analytic_hessians specification means that Hessian information is available directly from the simulation. The simulation must return the Hessian data in the DAKOTA format (enclosed in double brackets; see DAKOTA File Data Formats in the Users Manual [Adams et al., 2010]) for the case of file transfer of data. The analytic_hessians keyword is a complete specification for this case.

Mixed Hessians

The mixed_hessians specification means that some Hessian information is available directly from the simulation (analytic) whereas the rest will have to be estimated by finite differences (numerical) or approximated by secant updating. As for mixed gradients, this specification allows the user to make use of as much analytic information as is available and then estimate/approximate the rest. The id_analytic_hessians list specifies by number the functions which have analytic Hessians, and the id_numerical_hessians and id_quasi_hessians lists specify by number the functions which must use numerical Hessians and secant Hessian updates, respectively. Each function identifier, from 1 through the total number of functions, must appear once and only once within the union of the id_analytic_hessians, id_numerical_hessians, and id_quasi_hessians lists. The fd_hessian_step_size and bfgs, damped bfgs, or sr1 secant update selections are as described previously in Numerical Hessians and Quasi Hessians and pertain to those functions listed by the id_numerical_hessians and id_quasi_hessians lists. Table 9.10 summarizes the mixed Hessian specification.

Table 9.10 Specification detail for mixed Hessians
Description | Keyword | Associated Data | Status | Default
Mixed Hessians | mixed_hessians | none | Required group | N/A
Analytic Hessians function list | id_analytic_hessians | list of integers | Required | N/A
Numerical Hessians function list | id_numerical_hessians | list of integers | Required | N/A
Finite difference step size | fd_hessian_step_size | list of reals | Optional | 0.001 (1st-order), 0.002 (2nd-order)
Quasi Hessians function list | id_quasi_hessians | list of integers | Required | N/A
Quasi-Hessian update | bfgs|sr1 | none | Required | N/A
Numerical safeguarding of BFGS update | damped | none | Optional | undamped BFGS
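For example, a sketch for a three-function data set using an analytic Hessian for function 1, a numerical Hessian for function 2, and SR1 secant updates for function 3 (the step size value is illustrative):

responses,
	num_objective_functions = 1
	num_nonlinear_equality_constraints = 2
	analytic_gradients
	mixed_hessians
	  id_analytic_hessians = 1
	  id_numerical_hessians = 2
	    fd_hessian_step_size = .001
	  id_quasi_hessians = 3
	    sr1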


