We report a Bayesian framework for concurrent selection of physics-based models and (modeling) error models. Within the context of Bayesian estimation for stochastic ordinary differential equations, we investigate the use of colored noise to capture the mismatch between the predictions of calibrated models and observational data that cannot be explained by measurement error alone. Proposed models are characterized by the average data-fit, a measure of how well a model fits the measurements, and the model complexity measured using the Kullback–Leibler divergence. The use of a more complex error model increases the average data-fit but also increases the complexity of the combined model, possibly over-fitting the data. Bayesian model selection is used to find the optimal physical model as well as the optimal error model. The optimal model is defined using the evidence, in which the average data-fit is balanced by the complexity of the model. The effect of the colored noise process is illustrated using a nonlinear aeroelastic oscillator representing a rigid NACA0012 airfoil undergoing limit cycle oscillations due to complex fluid–structure interactions. Several quasi-steady and unsteady aerodynamic models are proposed with colored noise or white noise for the model error. The use of colored noise improves the predictive capabilities of simpler models.
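As a minimal illustration of the evidence decomposition described above (expected log-likelihood under the posterior minus the Kullback–Leibler divergence from prior to posterior), the following Python sketch uses a conjugate Gaussian toy problem in which the exact log-evidence is available for comparison; the model and data are hypothetical and are not taken from the aeroelastic study.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
sigma, tau, n = 0.4, 2.0, 25            # noise std, prior std, number of observations
y = 0.8 + sigma * rng.standard_normal(n)

# Conjugate Gaussian posterior on the single parameter mu
post_var = 1.0 / (1.0 / tau**2 + n / sigma**2)
post_mean = post_var * y.sum() / sigma**2

# Average data-fit: expected log-likelihood under the posterior
avg_fit = (-0.5 * n * np.log(2 * np.pi * sigma**2)
           - (np.sum((y - post_mean)**2) + n * post_var) / (2 * sigma**2))

# Model complexity: KL divergence from the prior N(0, tau^2) to the posterior
kl = 0.5 * (post_var / tau**2 + post_mean**2 / tau**2 - 1.0 + np.log(tau**2 / post_var))

# Log-evidence = average data-fit minus complexity; check against the exact marginal
exact = multivariate_normal(mean=np.zeros(n),
                            cov=sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))).logpdf(y)
print("data-fit - complexity:", avg_fit - kl, "   exact log-evidence:", exact)
```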
The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.1.2 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.
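As a generic, hedged sketch of the kind of non-intrusive propagation workflow the toolkit supports (deliberately written with plain numpy rather than the UQTk API), the following code projects a hypothetical scalar model output onto a one-dimensional Hermite polynomial chaos basis via Gauss–Hermite quadrature.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval

def model(x):
    return np.exp(0.3 * x) + 0.1 * x**2        # hypothetical scalar model output

mu, sigma = 1.0, 0.5                           # assumed mean/std of the uncertain input
order, nquad = 4, 12

xi, w = hermegauss(nquad)                      # quadrature rule for the weight exp(-xi^2/2)
w = w / np.sqrt(2.0 * np.pi)                   # normalize to the standard normal density
fvals = model(mu + sigma * xi)

# Non-intrusive spectral projection: c_k = E[f He_k] / E[He_k^2], with E[He_k^2] = k!
coeffs = np.array([np.sum(w * fvals * hermeval(xi, np.eye(order + 1)[k])) / factorial(k)
                   for k in range(order + 1)])

pce_mean = coeffs[0]
pce_var = sum(coeffs[k]**2 * factorial(k) for k in range(1, order + 1))
print("PCE mean, variance:", pce_mean, pce_var)

# Monte Carlo check
samples = model(mu + sigma * np.random.default_rng(0).standard_normal(200_000))
print("MC  mean, variance:", samples.mean(), samples.var())
```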
This paper addresses the issue of overfitting while calibrating unknown parameters of over-parameterized physics-based models with noisy and incomplete observations. A semi-analytical Bayesian framework of nonlinear sparse Bayesian learning (NSBL) is proposed to identify sparsity among model parameters during Bayesian inversion. NSBL offers significant advantages over the machine-learning algorithm of sparse Bayesian learning (SBL) for physics-based models, such as 1) the likelihood function or the posterior parameter distribution is not required to be Gaussian, and 2) prior parameter knowledge is incorporated into sparse learning (i.e. not all parameters are treated as questionable). NSBL employs the concept of automatic relevance determination (ARD) to facilitate sparsity among questionable parameters through parameterized prior distributions. The analytical tractability of NSBL is enabled by employing Gaussian ARD priors and by building a Gaussian mixture-model approximation of the posterior parameter distribution that excludes the contribution of the ARD priors. Subsequently, type-II maximum likelihood is executed using Newton's method, whereby the evidence and its gradient and Hessian information are computed in a semi-analytical fashion. We show numerically and analytically that SBL is a special case of NSBL for linear regression models. Subsequently, a linear regression example involving multimodality in both the parameter posterior pdf and the model evidence is considered to demonstrate the performance of NSBL in cases where SBL is inapplicable. Next, NSBL is applied to identify sparsity among the damping coefficients of a mass-spring-damper model of a shear building frame. These numerical studies demonstrate the robustness and efficiency of NSBL in alleviating overfitting during Bayesian inversion of nonlinear physics-based models.
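For the linear-regression special case mentioned above, the classical SBL/ARD updates can be sketched in a few lines; this is the standard type-II maximum-likelihood fixed-point iteration (à la the relevance vector machine) on synthetic data, not the semi-analytical NSBL algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 8
Phi = rng.standard_normal((n, p))
w_true = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0])   # sparse truth
beta = 25.0                                    # known noise precision (1/sigma^2)
t = Phi @ w_true + rng.normal(scale=beta**-0.5, size=n)

alpha = np.ones(p)                             # ARD hyperparameters (prior precisions)
for _ in range(100):
    Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
    m = beta * Sigma @ Phi.T @ t
    gamma = 1.0 - alpha * np.diag(Sigma)       # effective number of well-determined parameters
    alpha = gamma / (m**2 + 1e-12)             # type-II maximum-likelihood fixed-point update
    alpha = np.minimum(alpha, 1e6)             # very large alpha => parameter pruned

print("posterior mean weights:", np.round(m, 3))
print("ARD precisions        :", np.round(alpha, 1))
```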
This study presents a numerical model of a WEC array. The model will be used in subsequent work to study the ability of data assimilation to support power prediction from WEC arrays and WEC array design. In this study, we focus on design, modeling, and control of the WEC array. A case study is performed for a small remote Alaskan town. Using an efficient method for modeling the linear interactions within a homogeneous array, we produce a model and predictionless feedback controllers for the devices within the array. The model is applied to study the effects of spectral wave forecast errors on power output. The results of this analysis show that the power performance of the WEC array will be most strongly affected by errors in prediction of the spectral period, but that reductions in performance can realistically be limited to less than 10% based on typical data assimilation based spectral forecasting accuracy levels.
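As a small, hedged aside on how spectral-period errors map to power, the deep-water wave energy flux per unit crest length, J = ρ g² H_m0² T_e / (64π), is linear in the energy period, so a fractional error in T_e translates directly into a comparable fractional error in available power; the values below are illustrative only and do not describe the Alaskan case study.

```python
import numpy as np

def wave_power_flux(hm0, te, rho=1025.0, g=9.81):
    """Deep-water wave energy flux per unit crest length [W/m]."""
    return rho * g**2 * hm0**2 * te / (64.0 * np.pi)

hm0, te = 2.5, 9.0                       # illustrative significant wave height [m], energy period [s]
nominal = wave_power_flux(hm0, te)
for err in (-0.2, -0.1, 0.0, 0.1, 0.2):  # fractional error in the forecast energy period
    forecast = wave_power_flux(hm0, te * (1.0 + err))
    print(f"Te error {err:+.0%}: power changes by {forecast / nominal - 1.0:+.0%}")
```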
To model and quantify the variability in plasticity and failure of additively manufactured metals due to imperfections in their microstructure, we have developed an uncertainty quantification methodology based on pseudo-marginal likelihood and embedded variability techniques. We account for both the porosity resolvable in computed tomography scans of the initial material and the sub-threshold distribution of voids through a physically motivated model. Calibration of the model indicates that the sub-threshold population of defects dominates the yield and failure response. The technique also allows us to quantify the distribution of material parameters connected to microstructural variability created by the manufacturing process and, thereby, to make assessments of material quality and process control.
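The pseudo-marginal idea, replacing an intractable likelihood with an unbiased Monte Carlo estimate inside a Metropolis–Hastings accept/reject step, can be sketched on a toy Gaussian latent-variable problem; the model below is a placeholder and not the calibration of the plasticity model.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
sigma_y, sigma_z = 0.3, 0.5              # assumed observation noise / latent-variability scales
theta_true, n_obs, M = 1.2, 40, 64       # true parameter, data size, samples per likelihood estimate
y = theta_true + rng.normal(0, sigma_z, n_obs) + rng.normal(0, sigma_y, n_obs)

def log_lik_hat(theta):
    """Log of an unbiased Monte Carlo estimate of the marginal likelihood."""
    z = rng.normal(0.0, sigma_z, size=(M, n_obs))       # latent per-sample variability
    dens = norm.pdf(y, loc=theta + z, scale=sigma_y)    # p(y_i | theta, z_im)
    return np.sum(np.log(dens.mean(axis=0)))

def log_prior(theta):
    return norm.logpdf(theta, 0.0, 10.0)

theta = 1.0                                              # rough initial guess
logp = log_lik_hat(theta) + log_prior(theta)
chain = []
for _ in range(5000):
    prop = theta + 0.1 * rng.standard_normal()
    logp_prop = log_lik_hat(prop) + log_prior(prop)
    if np.log(rng.uniform()) < logp_prop - logp:         # pseudo-marginal accept/reject
        theta, logp = prop, logp_prop                    # keep the *estimated* target value
    chain.append(theta)
print("posterior mean estimate:", np.mean(chain[1000:]))
```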
Robinson, Brandon; da Costa, Leandro; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad K.; Sarkar, Abhijit
This paper focuses on the derivation of an analytical model of the aeroelastic dynamics of an elastically mounted flexible wing. The equations of motion obtained serve to help understand the behaviour of the aeroelastic wind tunnel setup in question, which consists of a rectangular wing with a uniform NACA 0012 airfoil profile, whose base is free to rotate rigidly about a longitudinal axis. Of particular interest are the structural geometric nonlinearities primarily introduced by the coupling between the rigid body pitch degree-of-freedom and the continuous system. A system of partial differential equations (PDEs) coupled with an ordinary differential equation (ODE) describing axial-bending-bending-torsion-pitch motion is derived using Hamilton's principle. A finite dimensional approximation of the system of coupled differential equations is obtained using the Galerkin method, leading to a system of coupled nonlinear ODEs. Subsequently, these nonlinear ODEs are solved numerically using Houbolt's method. The results obtained are verified by comparison with those obtained by direct integration of the equations of motion using a finite difference scheme. Adopting a linear unsteady aerodynamic model, it is observed that the system undergoes coalescence flutter due to coupling between the rigid body pitch rotation dominated mode and the first flapwise bending dominated mode. The behaviour of the limit cycle oscillations is primarily influenced by the structural geometric nonlinear terms in the coupled system of PDEs and ODE.
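A minimal sketch of Houbolt's implicit multistep scheme is given below, applied to a placeholder linear two-degree-of-freedom system rather than the wing model itself; the mass, damping and stiffness matrices are assumptions, and the two startup steps use an approximate explicit scheme.

```python
import numpy as np

# Hypothetical linear 2-DOF structural system: M x'' + C x' + K x = 0
M = np.diag([1.0, 1.0])
C = np.array([[0.10, -0.02], [-0.02, 0.08]])
K = np.array([[40.0, -15.0], [-15.0, 25.0]])

dt, nsteps = 0.01, 2000
x = np.zeros((nsteps, 2))
x[0] = [0.05, 0.0]                                   # initial displacement, zero velocity

# Startup: two approximate explicit steps (Houbolt needs three previous solution points)
a0 = np.linalg.solve(M, -K @ x[0])
x[1] = x[0] + 0.5 * dt**2 * a0
a1 = np.linalg.solve(M, -C @ ((x[1] - x[0]) / dt) - K @ x[1])
x[2] = 2 * x[1] - x[0] + dt**2 * a1

# Houbolt's scheme: x''_{n+1} = (2x_{n+1} - 5x_n + 4x_{n-1} - x_{n-2}) / dt^2,
#                   x'_{n+1}  = (11x_{n+1} - 18x_n + 9x_{n-1} - 2x_{n-2}) / (6 dt)
A = 2.0 / dt**2 * M + 11.0 / (6.0 * dt) * C + K
for n in range(2, nsteps - 1):
    rhs = (M @ (5 * x[n] - 4 * x[n - 1] + x[n - 2]) / dt**2
           + C @ (18 * x[n] - 9 * x[n - 1] + 2 * x[n - 2]) / (6 * dt))
    x[n + 1] = np.linalg.solve(A, rhs)

print("displacement at final time:", x[-1])
```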
Integration of renewable power sources into grids remains an active research and development area, particularly for less developed renewable energy technologies such as wave energy converters (WECs). WECs are projected to have strong early market penetration for remote communities, which serve as natural microgrids. Hence, accurate wave predictions to manage the interactions of a WEC array with microgrids are especially important. Recently developed, low-cost wave measurement buoys allow for operational assimilation of wave data at remote locations where real-time data have previously been unavailable. This work includes the development and assessment of a wave modeling framework with real-time data assimilation capabilities for WEC power prediction. The availability of real-time wave spectral components from low-cost wave measurement buoys allows for operational data assimilation with the ensemble Kalman filter technique, whereby measured wave conditions within the numerical wave forecast model domain are assimilated onto the combined set of internal and boundary grid points while taking into account model and observation error covariances. The updated model state and boundary conditions allow for more accurate wave characteristic predictions at the locations of interest. Initial deployment data indicated that measured wave data from one buoy that were assimilated into the wave modeling framework resulted in improved forecast skill for a case where a traditional numerical forecast model (e.g., Simulating WAves Nearshore; SWAN) did not represent the measured conditions well. On average, the wave power forecast error was reduced from 73% to 43% using the data assimilation modeling with real-time wave observations.
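A hedged sketch of the stochastic (perturbed-observation) ensemble Kalman filter analysis step is given below; the state vector, observation operator and error covariances are placeholders standing in for the SWAN forecast grid and buoy observations.

```python
import numpy as np

def enkf_analysis(X, y, H, R, rng):
    """Stochastic (perturbed-observation) EnKF analysis step.
    X: (n_state, n_ens) forecast ensemble, y: (n_obs,) observations,
    H: (n_obs, n_state) observation operator, R: (n_obs, n_obs) obs error covariance."""
    n_ens = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)             # ensemble anomalies
    P = A @ A.T / (n_ens - 1)                         # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, n_ens).T
    return X + K @ (Y - H @ X)                        # analysis ensemble

rng = np.random.default_rng(0)
n_state, n_ens = 40, 60                               # e.g. wave heights on a grid, ensemble size
s = np.linspace(0.0, 1.0, n_state)
truth = 2.0 + 0.5 * np.sin(2 * np.pi * s)             # "true" significant wave height field [m]

# Forecast ensemble: a biased mean plus smooth, spatially correlated perturbations
modes = np.array([np.sin((k + 1) * np.pi * s) for k in range(3)]).T
X = (truth + 0.4 * modes @ [1.0, -0.5, 0.3])[:, None] \
    + modes @ (0.3 * rng.standard_normal((3, n_ens)))

# Three "buoy" observations of the field with known error covariance
H = np.zeros((3, n_state)); H[0, 5] = H[1, 20] = H[2, 33] = 1.0
R = 0.05**2 * np.eye(3)
y = H @ truth + rng.multivariate_normal(np.zeros(3), R)

Xa = enkf_analysis(X, y, H, R, rng)
print("RMSE of ensemble mean, forecast:", np.sqrt(np.mean((X.mean(1) - truth)**2)))
print("RMSE of ensemble mean, analysis:", np.sqrt(np.mean((Xa.mean(1) - truth)**2)))
```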
This project has developed models of variability of performance to enable robust design and certification. Material variability originating from microstructure has significant effects on component behavior and creates uncertainty in material response. The outcomes of this project are uncertainty quantification (UQ) enabled analysis of material variability effects on performance and methods to evaluate the consequences of microstructural variability on material response in general. Material variability originating from heterogeneous microstructural features, such as grain and pore morphologies, has significant effects on component behavior and creates uncertainty around performance. Current engineering material models typically do not incorporate microstructural variability explicitly; rather, functional forms are chosen based on intuition and parameters are selected to reflect mean behavior. Conversely, mesoscale models that capture the microstructural physics, and inherent variability, are impractical to utilize at the engineering scale. Therefore, current efforts ignore physical characteristics of systems that may be the predominant factors for quantifying system reliability. To address this gap, we have developed explicit connections between models of microstructural variability and component/system performance. Our focus on variability of mechanical response due to grain and pore distributions enabled us to fully probe these influences on performance and develop a methodology to propagate input variability to output performance. This project is at the forefront of data science and material modeling. We adapted and innovated from progressive techniques in machine learning and uncertainty quantification to develop a new, physically based methodology to address the core issues of the Engineering Materials Reliability (EMR) research challenge in modeling the constitutive response of materials with significant inherent variability and length-scales.
The advent of fabrication techniques such as additive manufacturing has focused attention on the considerable variability of material response due to defects and other microstructural aspects. This variability motivates the development of an enhanced design methodology that incorporates inherent material variability to provide robust predictions of performance. In this work, we develop plasticity models capable of representing the distribution of mechanical responses observed in experiments using traditional plasticity models of the mean response and recently developed uncertainty quantification (UQ) techniques. To account for material response variability through variations in physical parameters, we adapt a recent Bayesian embedded modeling error calibration technique. We use Bayesian model selection to determine the most plausible of a variety of plasticity models and the optimal embedding of parameter variability. To expedite model selection, we develop an adaptive importance-sampling-based numerical integration scheme to compute the Bayesian model evidence. We demonstrate that the new framework provides predictive realizations that are superior to those of more traditional approaches, and we show how these UQ techniques can be used in model selection and in assessing the quality of calibrated physical parameters.
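A hedged illustration of estimating the Bayesian model evidence by importance sampling is given below, with a Gaussian proposal fitted to the (here analytically known) posterior as a simplified stand-in for the adaptive scheme described; the conjugate Gaussian toy problem makes the exact evidence available for comparison.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(3)
tau, sigma, n = 2.0, 0.5, 20                        # prior std, noise std, number of observations
y = 1.0 + sigma * rng.standard_normal(n)            # synthetic data, true parameter 1.0

def log_lik(theta):
    return norm.logpdf(y[:, None], loc=theta, scale=sigma).sum(axis=0)

def log_prior(theta):
    return norm.logpdf(theta, 0.0, tau)

# Importance proposal: Gaussian fitted to the posterior (known analytically here;
# in practice it would be fitted adaptively, e.g. to MCMC samples)
post_var = 1.0 / (1.0 / tau**2 + n / sigma**2)
post_mean = post_var * y.sum() / sigma**2
q_mean, q_std = post_mean, 1.5 * np.sqrt(post_var)   # slightly inflated proposal

theta = rng.normal(q_mean, q_std, size=100_000)
log_w = log_lik(theta) + log_prior(theta) - norm.logpdf(theta, q_mean, q_std)
log_evidence = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()

# Exact evidence for this conjugate Gaussian toy problem
exact = multivariate_normal(mean=np.zeros(n),
                            cov=sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))).logpdf(y)
print("IS log-evidence:", log_evidence, "   exact:", exact)
```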
Integration of renewable power sources into electrical grids remains an active research and development area, particularly for less developed renewable energy technologies, such as wave energy converters (WECs). High spatio-temporal resolution and accurate wave forecasts at a potential WEC (or WEC array) lease area are needed to improve WEC power prediction and to facilitate grid integration, particularly for microgrid locations. The availability of high quality measurement data from recently developed low-cost buoys allows for operational assimilation of wave data into forecast models at remote locations where real-time data have previously been unavailable. This work includes the development and assessment of a wave modeling framework with real-time data assimilation capabilities for WEC power prediction. Spoondrift wave measurement buoys were deployed off the coast of Yakutat, Alaska, a microgrid site with high wave energy resource potential. A wave modeling framework with data assimilation was developed and assessed, which was most effective when the incoming forecasted boundary conditions did not represent the observations well. For that case, assimilation of the wave height data using the ensemble Kalman filter resulted in a reduction of wave height forecast normalized root mean square error from 27% to an average of 16% over a 12-hour period. This results in a reduction of the wave power forecast error from 73% to 43%. In summary, the use of the low-cost wave buoy data assimilated into the wave modeling framework improved the forecast skill and will provide a useful development tool for the integration of WECs into electrical grids.
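A small, hedged sketch of the normalized root mean square error skill metric quoted above is shown below, applied to hypothetical forecast and observed wave-height series; the normalization by the mean of the observations is an assumption about the convention used.

```python
import numpy as np

def nrmse(forecast, observed):
    """Root mean square error normalized by the mean of the observations."""
    return np.sqrt(np.mean((forecast - observed)**2)) / np.mean(observed)

rng = np.random.default_rng(4)
t = np.arange(48)                                     # hypothetical hourly record
observed = 2.0 + 0.5 * np.sin(2 * np.pi * t / 24)     # observed significant wave height [m]
free_run = observed * 1.3 + 0.2                       # biased forecast without assimilation
assimilated = observed + 0.15 * rng.standard_normal(t.size)

print(f"NRMSE without assimilation: {nrmse(free_run, observed):.0%}")
print(f"NRMSE with assimilation   : {nrmse(assimilated, observed):.0%}")
```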
This investigation tackles the probabilistic parameter estimation problem involving the Arrhenius parameters for the rate coefficient of the chain branching reaction H + O2 → OH + O. This is achieved in a Bayesian inference framework that uses indirect data from the literature in the form of summary statistics by approximating the maximum entropy solution with the aid of approximate Bayesian computation. The summary statistics include nominal values and uncertainty factors of the rate coefficient, obtained from shock-tube experiments performed at various initial temperatures. The Bayesian framework allows for the incorporation of uncertainty in the rate coefficient of a secondary reaction, namely OH + H2 → H2O + H, resulting in a consistent joint probability density on Arrhenius parameters for the two rate coefficients. It also allows for uncertainty quantification in numerical ignition predictions while conforming with the published summary statistics. The method relies on probabilistic reconstruction of the unreported data, OH concentration profiles from shock-tube experiments, along with the unknown Arrhenius parameters. The data inference is performed using a Markov chain Monte Carlo sampling procedure that relies on an efficient adaptive quadrature in estimating relevant integrals needed for data likelihood evaluations. For further efficiency gains, local Padé–Legendre approximants are used as surrogates for the time histories of OH concentration, alleviating the need for 0-D auto-ignition simulations. The reconstructed realisations of the missing data are used to provide a consensus joint posterior probability density on the unknown Arrhenius parameters via probabilistic pooling. Uncertainty quantification analysis is performed for stoichiometric hydrogen–air auto-ignition computations to explore the impact of uncertain parameter correlations on a range of quantities of interest.
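As a hedged sketch of how the reported summary statistics relate to the rate coefficient, the snippet below evaluates the modified Arrhenius form k(T) = A T^n exp(−E_a/(R T)) and samples rate coefficients within the band implied by an uncertainty factor; the parameter values are placeholders, not the inferred ones.

```python
import numpy as np

R = 8.314                                    # J/(mol K)

def arrhenius(T, A, n, Ea):
    """Modified Arrhenius rate coefficient k(T) = A * T^n * exp(-Ea / (R T))."""
    return A * T**n * np.exp(-Ea / (R * T))

# Placeholder nominal parameters (not the inferred values for H + O2 -> OH + O)
A0, n0, Ea0 = 1.0e14, 0.0, 70e3
T = 1500.0                                   # K
UF = 1.3                                     # reported uncertainty factor: k in [k0/UF, k0*UF]

rng = np.random.default_rng(5)
k0 = arrhenius(T, A0, n0, Ea0)
# Treat log10(k) as uniform within the band implied by the uncertainty factor
k_samples = 10.0**(np.log10(k0) + rng.uniform(-1, 1, 10_000) * np.log10(UF))
print(f"nominal k = {k0:.3e}, sampled 2.5-97.5% band = "
      f"[{np.percentile(k_samples, 2.5):.3e}, {np.percentile(k_samples, 97.5):.3e}]")
```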
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems, or linearized systems for non-linear problems, with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementational challenges exist in extending them to solve extreme-scale stochastic systems on emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both the spatial and stochastic domains using an in-house implementation. Multilevel sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain-level local systems. Parallel sparse matrix–vector operations are used to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having a spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
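A hedged sketch of a subdomain-level solve is given below using scipy's sparse preconditioned conjugate gradient solver, with a one-dimensional diffusion stencil standing in for the stochastic Galerkin blocks handled by the in-house implementation.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# 1-D diffusion stencil standing in for one subdomain-level (or PCE-block) sparse system
n = 2000
A = sp.diags([-np.ones(n - 1), 2.0 * np.ones(n), -np.ones(n - 1)],
             offsets=[-1, 0, 1], format="csc")
b = np.ones(n)

# Incomplete-LU factorization wrapped as a preconditioner
ilu = spla.spilu(A, drop_tol=1e-4)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)

iterations = []
x, info = spla.cg(A, b, M=M, callback=lambda xk: iterations.append(1))
print("converged:", info == 0,
      "  PCG iterations:", len(iterations),
      "  residual norm:", np.linalg.norm(A @ x - b))
```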
We investigate the feasibility of constructing a data-driven distance metric for use in null-hypothesis testing in the context of arms-control treaty verification. The distance metric is used in testing the hypothesis that the available data are representative of a certain object or otherwise, as opposed to the binary-classification tasks studied previously. The metric, being of strictly quadratic form, is essentially computed using projections of the data onto a set of optimal vectors. These projections can be accumulated in list mode. The relatively low number of projections hampers the possible reconstruction of the object and subsequently the access to sensitive information. The projection vectors that channelize the data are optimal in capturing the Mahalanobis squared distance of the data associated with a given object under varying nuisance parameters. The vectors are also chosen such that the resulting metric is insensitive to the difference between the trusted object and another object that is deemed to contain sensitive information. Data used in this study were generated using the GEANT4 toolkit to model gamma transport using a Monte Carlo method. For numerical illustration, the methodology is applied to synthetic data obtained using custom models for plutonium inspection objects. The resulting metric, based on a relatively low number of channels, shows moderate agreement with the Mahalanobis distance metric for the trusted object while enabling a capability to obscure sensitive information.
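The core idea, a squared distance computed from a small number of projections of the data onto fixed vectors, can be sketched as below; here the projection vectors are simply the leading eigenvectors of the trusted-object covariance, which is one simple choice and not the optimization described in the paper, and the spectra are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
dim, n_train, n_channels = 64, 500, 8            # detector bins, training spectra, retained projections

# Hypothetical gamma-spectrum-like training data for the trusted object
mean_true = np.exp(-np.linspace(0, 3, dim))
cov_true = 0.01 * np.exp(-np.abs(np.subtract.outer(np.arange(dim), np.arange(dim))) / 5.0)
train = rng.multivariate_normal(mean_true, cov_true, n_train)

mu = train.mean(axis=0)
Sigma = np.cov(train, rowvar=False)
lam, V = np.linalg.eigh(Sigma)                   # Sigma = V diag(lam) V^T

def mahalanobis_sq(x):
    d = x - mu
    return d @ np.linalg.solve(Sigma, d)

def projected_metric(x, k=n_channels):
    """Quadratic metric from k projections: sum_j (v_j^T (x - mu))^2 / lambda_j."""
    idx = np.argsort(lam)[::-1][:k]              # one simple choice of projection vectors
    proj = V[:, idx].T @ (x - mu)                # k channels, accumulable in list mode
    return np.sum(proj**2 / lam[idx])

x_test = rng.multivariate_normal(mean_true, cov_true)
print("full Mahalanobis^2     :", mahalanobis_sq(x_test))
print(f"{n_channels}-channel quadratic metric:", projected_metric(x_test))
```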
Sandhu, Rimple S.; Rocha da Costa, Leandro J.; Robinson, Brandon R.; Matachniouk, Anton M.; Chajjed, Sandip C.; Bisaillon, Philippe B.; Desai, Ajit D.; Khalil, Mohammad K.; Pettit, Chris P.; Poirel, Dominique P.; Sarkar, Abhijit S.
A general strategy for analysis and reduction of uncertain chemical kinetic models is presented, and its utility is illustrated in the context of ignition of hydrocarbon fuel–air mixtures. The strategy is based on a deterministic analysis and reduction method which employs computational singular perturbation analysis to generate simplified kinetic mechanisms, starting from a detailed reference mechanism. We model uncertain quantities in the reference mechanism, namely the Arrhenius rate parameters, as random variables with prescribed uncertainty factors. We propagate this uncertainty to obtain the probability of inclusion of each reaction in the simplified mechanism. We propose probabilistic error measures to compare predictions from the uncertain reference and simplified models, based on the comparison of the uncertain dynamics of the state variables, where the mixture entropy is chosen as progress variable. We employ the construction for the simplification of an uncertain mechanism in an n-butane–air mixture homogeneous ignition case, where a 176-species, 1111-reaction detailed kinetic model for the oxidation of n-butane is used, with uncertainty factors assigned to each Arrhenius rate pre-exponential coefficient. This illustration is employed to highlight the utility of the construction and the performance of the family of simplified models produced, depending on the chosen thresholds on the importance and marginal probabilities of the reactions.
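A hedged toy illustration of turning uncertainty factors on pre-exponential coefficients into per-reaction inclusion probabilities is given below; the "importance" measure used here is merely each reaction's share of a summed rate at a fixed state, a deliberate simplification standing in for the CSP importance indices, and all numbers are placeholders.

```python
import numpy as np

rng = np.random.default_rng(7)
n_reactions, n_samples = 12, 5000
threshold = 0.02                                     # importance threshold for inclusion

k_nominal = 10.0**rng.uniform(2, 8, n_reactions)     # placeholder nominal rate constants
UF = rng.uniform(1.2, 3.0, n_reactions)              # assigned uncertainty factors

# Sample log-uniform rate constants within the uncertainty bands
k = 10.0**(np.log10(k_nominal) + rng.uniform(-1, 1, (n_samples, n_reactions)) * np.log10(UF))

# Toy "importance" of each reaction: its fractional contribution to the summed rate
# (a stand-in for CSP importance indices, which require the full kinetic model)
importance = k / k.sum(axis=1, keepdims=True)
p_include = (importance > threshold).mean(axis=0)    # probability of inclusion per reaction

for j, p in enumerate(p_include):
    print(f"reaction {j:2d}: inclusion probability {p:.2f}")
```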
The thermal decomposition of H2O2 is an important process in hydrocarbon combustion playing a particularly crucial role in providing a source of radicals at high pressure where it controls the 3rd explosion limit in the H2-O2 system, and also as a branching reaction in intermediate-temperature hydrocarbon oxidation. As such, understanding the uncertainty in the rate expression for this reaction is crucial for predictive combustion computations. Raw experimental measurement data, and its associated noise and uncertainty, is typically unreported in most investigations of elementary reaction rates, making the direct derivation of the joint uncertainty structure of the parameters in rate expressions difficult. To overcome this, we employ a statistical inference procedure, relying on maximum entropy and approximate Bayesian computation methods, and using a two-level nested Markov chain Monte Carlo algorithm, to arrive at a posterior density on rate parameters for a selected case of laser absorption measurements in a shock tube study, subject to the constraints imposed by the reported experimental statistics. The procedure constructs a set of H2O2 concentration decay profiles consistent with these reported statistics. These consistent data sets are then used to determine the joint posterior density on the rate parameters through straightforward Bayesian inference. Broadly, the method also provides a framework for the replication and comparison of missing data from different experiments, based on reported statistics, for the generation of consensus rate expressions.
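A hedged sketch of the approximate Bayesian computation idea, here reduced to plain rejection sampling against a reported decay-constant statistic rather than the nested MCMC procedure described, is shown below with a simplified first-order decay model and placeholder numbers.

```python
import numpy as np

rng = np.random.default_rng(8)

t = np.linspace(0, 2e-3, 50)                       # s, hypothetical absorption-time record
k_reported, k_sigma = 450.0, 40.0                  # 1/s, placeholder reported decay constant and std dev

def simulate_summary(k):
    """Simulate a noisy first-order decay and return the fitted decay constant."""
    c = np.exp(-k * t) * (1.0 + 0.02 * rng.standard_normal(t.size))
    slope = np.polyfit(t, np.log(np.clip(c, 1e-12, None)), 1)[0]
    return -slope

# ABC rejection: sample from a broad log-uniform prior, keep draws whose simulated
# summary statistic lies within a tolerance of the reported statistic
prior_draws = 10.0**rng.uniform(2.0, 3.2, 20_000)
tol = k_sigma
accepted = np.array([k for k in prior_draws
                     if abs(simulate_summary(k) - k_reported) < tol])
print("accepted draws:", accepted.size,
      "  posterior mean/std:", accepted.mean(), accepted.std())
```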
Bayesian inference and maximum entropy methods were employed for the estimation of the joint probability density for the Arrhenius rate parameters of the rate coefficient of the H2/O2-mechanism chain branching reaction H + O2 → OH + O. A consensus joint posterior on the parameters was obtained by pooling the posterior parameter densities given each consistent data set. Efficient surrogates for the OH concentration were constructed using a combination of Padé and polynomial approximants. Gauss-Hermite quadrature with Gaussian proposal probability density functions was used for moment computation, resulting in orders-of-magnitude speedup in data likelihood evaluation. The consistent data sets resulted in nearly Gaussian conditional parameter probability density functions. The resulting pooled parameter probability density function was propagated through stoichiometric H2-air auto-ignition computations to illustrate the necessity of accounting for correlation among the Arrhenius rate parameters of a single reaction and across the rate parameters of different reactions.
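A minimal sketch of computing moments by Gauss-Hermite quadrature over a Gaussian proposal density is given below; the quantity-of-interest function is a placeholder, not the OH-concentration surrogate from the study.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

mu, sigma = 0.5, 0.2            # Gaussian proposal mean/std for a rate parameter (placeholder)

def qoi(theta):
    """Placeholder surrogate for a quantity of interest (e.g. a concentration peak)."""
    return np.exp(-2.0 * theta) + 0.3 * theta**2

# Gauss-Hermite quadrature for E[g(theta)], theta ~ N(mu, sigma^2),
# via the change of variables theta = mu + sqrt(2) * sigma * t
t, w = hermgauss(12)
theta = mu + np.sqrt(2.0) * sigma * t
m1 = np.sum(w * qoi(theta)) / np.sqrt(np.pi)
m2 = np.sum(w * qoi(theta)**2) / np.sqrt(np.pi)
print("quadrature mean, variance:", m1, m2 - m1**2)

# Monte Carlo check
s = qoi(mu + sigma * np.random.default_rng(9).standard_normal(500_000))
print("Monte Carlo mean, variance:", s.mean(), s.var())
```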
Open-source indicators have been proposed as a way of tracking and forecasting disease outbreaks. Some, such as meteorological data, are readily available as reanalysis products. Others, such as those derived from our online behavior (web searches, media articles, etc.), are gathered easily and are more timely than public health reporting. In this study we investigate how these datastreams may be combined to provide useful epidemiological information. The investigation is performed by building data assimilation systems to track influenza in California and dengue in India. The first does not suffer from incomplete data and was chosen to explore disease modeling needs. The second explores the case where observational data are sparse and disease modeling complexities are beside the point. The two test cases are at opposite ends of the disease tracking spectrum. We find that data assimilation systems that produce disease activity maps can be constructed. Further, being able to combine multiple open-source datastreams is a necessity, as any one individually is not very informative. The data assimilation systems have very little in common except that they contain disease models, calibration algorithms and some ability to impute missing data. Thus, while the data assimilation systems share the goal of accurate forecasting, they are practically designed to compensate for the shortcomings of the datastreams. We therefore expect them to be disease- and location-specific.
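As a hedged sketch of the kind of compartmental disease model such a data assimilation system would contain, the snippet below integrates a classic SIR model with scipy; the rates and initial conditions are illustrative only and are not those calibrated in the study.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    """Classic SIR compartmental model (fractions of the population)."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

beta, gamma = 0.4, 0.2                     # illustrative transmission / recovery rates [1/day]
y0 = [0.999, 0.001, 0.0]
sol = solve_ivp(sir_rhs, (0, 120), y0, args=(beta, gamma), t_eval=np.arange(0, 121, 7))

# Weekly infectious fraction: the kind of signal the open-source datastreams
# would be calibrated against in a data assimilation system
print("weekly infectious fraction:", np.round(sol.y[1], 4))
```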
A Bayesian statistical framework is presented for the Zimmerman and Weissenburger flutter margin method, which considers the uncertainties in aeroelastic modal parameters. The proposed methodology overcomes the limitations of the previously developed least-squares-based estimation technique, which relies on a Gaussian approximation of the flutter margin probability density function (pdf). Using the measured free-decay responses at subcritical (preflutter) airspeeds, the joint non-Gaussian posterior pdf of the modal parameters is sampled using the Metropolis–Hastings (MH) Markov chain Monte Carlo (MCMC) algorithm. The posterior MCMC samples of the modal parameters are then used to obtain the flutter margin pdfs and, finally, the flutter speed pdf. The usefulness of the Bayesian flutter margin method is demonstrated using synthetic data generated from a two-degree-of-freedom pitch-plunge aeroelastic model. The robustness of the statistical framework is demonstrated using different sets of measurement data. In conclusion, it is shown that the probabilistic (Bayesian) approach reduces the number of test points required to provide a flutter speed estimate of a given accuracy and precision.
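A hedged sketch of the random-walk Metropolis-Hastings step for sampling modal parameters from a free-decay record is given below, using a single synthetic mode rather than the pitch-plunge model; the resulting samples would then be pushed through the flutter-margin expression to obtain the flutter margin and flutter speed pdfs.

```python
import numpy as np

rng = np.random.default_rng(10)

# Synthetic free-decay record for a single aeroelastic mode (placeholder, not the 2-DOF model)
omega_true, zeta_true, sigma = 2 * np.pi * 3.0, 0.03, 0.1
t = np.linspace(0.0, 2.0, 100)

def decay(omega, zeta):
    return np.exp(-zeta * omega * t) * np.cos(omega * np.sqrt(1 - zeta**2) * t)

y = decay(omega_true, zeta_true) + sigma * rng.standard_normal(t.size)

def log_post(omega, zeta):
    if omega <= 0.0 or not 0.0 < zeta < 1.0:
        return -np.inf                                  # flat prior over a physical range
    return -0.5 * np.sum((y - decay(omega, zeta))**2) / sigma**2

# Random-walk Metropolis-Hastings, started near a point estimate of the modal parameters
theta = np.array([omega_true, zeta_true])
lp = log_post(*theta)
chain = []
for _ in range(20_000):
    prop = theta + rng.normal(0.0, [0.02, 0.001])       # random-walk proposal
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:            # accept/reject
        theta, lp = prop, lp_prop
    chain.append(theta.copy())
chain = np.array(chain[5_000:])
print("posterior mean frequency [Hz]:", chain[:, 0].mean() / (2 * np.pi))
print("posterior mean damping ratio :", chain[:, 1].mean())
```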