Publications

Publications / SAND Report

Methods for model selection in applied science and engineering

Field, Richard V.

Mathematical models are developed and used to study the properties of complex systems and/or modify these systems to satisfy some performance requirements in just about every area of applied science and engineering. A particular reason for developing a model, e.g., performance assessment or design, is referred to as the model use. Our objective is the development of a methodology for selecting a model that is sufficiently accurate for an intended use. Information on the system being modeled is, in general, incomplete, so that there may be two or more models consistent with the available information. The collection of these models is called the class of candidate models. Methods are developed for selecting the optimal member from a class of candidate models for the system. The optimal model depends on the available information, the selected class of candidate models, and the model use. Classical methods for model selection, including the method of maximum likelihood and Bayesian methods, as well as a method employing a decision-theoretic approach, are formulated to select the optimal model for numerous applications. There is no requirement that the candidate models be random. Classical methods for model selection ignore model use and require data to be available. Examples are used to show that these methods can be unreliable when data is limited. The decision-theoretic approach to model selection does not have these limitations, and model use is included through an appropriate utility function. This is especially important when modeling high risk systems, where the consequences of using an inappropriate model for the system can be disastrous. The decision-theoretic method for model selection is developed and applied for a series of complex and diverse applications. These include the selection of the: (1) optimal order of the polynomial chaos approximation for non-Gaussian random variables and stationary stochastic processes, (2) optimal pressure load model to be applied to a spacecraft during atmospheric re-entry, and (3) optimal design of a distributed sensor network for the purpose of vehicle tracking and identification.