Uncertainty Quantification: UQTk example problems
Abstract not provided.
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
In this report, we proposed, examined and implemented approaches for performing efficient uncertainty quantification (UQ) in climate land models. Specifically, we applied a Bayesian compressive sensing framework to polynomial chaos spectral expansions, enhanced it with an iterative basis reduction algorithm, and investigated the results on test models as well as on the Community Land Model (CLM). Furthermore, we discussed the construction of efficient quadrature rules for forward propagation of uncertainties from a high-dimensional, constrained input space to output quantities of interest. The work lays the groundwork for efficient forward UQ for high-dimensional, strongly non-linear and computationally costly climate models. Moreover, to investigate parameter inference approaches, we applied two variants of the Markov chain Monte Carlo (MCMC) method to a soil moisture dynamics submodel of the CLM. The evaluation of these algorithms gave us a good foundation for further building out the Bayesian calibration framework towards the goal of robust component-wise calibration.
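As a rough illustration of MCMC-based parameter inference, the sketch below runs a plain random-walk Metropolis sampler on a toy one-parameter decay model. The abstract does not say which two MCMC variants were used, so the sampler, the toy model, the flat prior on [0, 2], the noise level, and all names here are assumptions made only for illustration. They are not the CLM soil moisture setup.

```python
import numpy as np

def log_posterior(theta, x, y, sigma=0.1):
    # Assumed flat prior on [0, 2] and Gaussian noise around the toy model
    # y = exp(-theta * x); none of this comes from the report.
    if not 0.0 <= theta <= 2.0:
        return -np.inf
    resid = y - np.exp(-theta * x)
    return -0.5 * np.sum(resid**2) / sigma**2

def metropolis(logp, theta0, n_steps, step, rng):
    # Random-walk Metropolis: Gaussian proposal, accept with prob. min(1, ratio).
    chain = np.empty(n_steps)
    theta, lp = theta0, logp(theta0)
    for i in range(n_steps):
        prop = theta + step * rng.standard_normal()
        lp_prop = logp(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        chain[i] = theta
    return chain

rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 30)
theta_true = 0.7  # value used to generate the synthetic data
y = np.exp(-theta_true * x) + 0.1 * rng.standard_normal(x.size)

chain = metropolis(lambda t: log_posterior(t, x, y), 1.0, 5000, 0.1, rng)
posterior = chain[1000:]  # discard burn-in
print(posterior.mean(), posterior.std())
```

The retained samples approximate the posterior of the parameter, so their spread gives an uncertainty estimate rather than a single best-fit value.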
Mathematical Biosciences
We present a statistical method, predicated on the use of surrogate models, for the 'real-time' characterization of partially observed epidemics. Observations consist of counts of symptomatic patients, diagnosed with the disease, that may be available in the early epoch of an ongoing outbreak. Characterization, in this context, refers to estimation of epidemiological parameters that can be used to provide short-term forecasts of the ongoing epidemic, as well as to provide gross information on the dynamics of the etiologic agent in the affected population, e.g., the time-dependent infection rate. The characterization problem is formulated as a Bayesian inverse problem, and epidemiological parameters are estimated as distributions using a Markov chain Monte Carlo (MCMC) method, thus quantifying the uncertainty in the estimates. In some cases, the inverse problem can be computationally expensive, primarily due to the epidemic simulator used inside the inversion algorithm. We present a method, based on replacing the epidemiological model with computationally inexpensive surrogates, that can reduce the computational time to minutes, without a significant loss of accuracy. The surrogates are created by projecting the output of an epidemiological model on a set of polynomial chaos bases; thereafter, computations involving the surrogate model reduce to evaluations of a polynomial. We find that the epidemic characterizations obtained with the surrogate models are very close to those obtained with the original model. We also find that the number of projections required to construct a surrogate model is O(10)-O(10^2) less than the number of samples required by the MCMC to construct a stationary posterior distribution; thus, depending upon the epidemiological models in question, it may be possible to omit the offline creation and caching of surrogate models, prior to their use in an inverse problem.
The technique is demonstrated on synthetic data as well as observations from the 1918 influenza pandemic collected at Camp Custer, Michigan.
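The projection step can be sketched in one dimension as follows. This is a minimal illustration that assumes a single input uniformly distributed on [-1, 1], a Legendre basis, and Gauss-Legendre quadrature. The function `model` is a hypothetical stand-in for the expensive simulator, not the epidemiological model of the paper.

```python
import numpy as np
from numpy.polynomial import legendre as L

def model(xi):
    # Hypothetical stand-in for an expensive simulator; xi is uniform on [-1, 1].
    return np.exp(0.5 * xi)

def pc_coefficients(f, order, n_quad=16):
    # Gauss-Legendre nodes and weights on [-1, 1]; one model run per node.
    x, w = L.leggauss(n_quad)
    fx = f(x)
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        unit = np.zeros(k + 1)
        unit[k] = 1.0
        Pk = L.legval(x, unit)  # k-th Legendre polynomial at the nodes
        # c_k = <f, P_k> / <P_k, P_k>, with <P_k, P_k> = 2 / (2k + 1)
        coeffs[k] = np.sum(w * fx * Pk) * (2 * k + 1) / 2.0
    return coeffs

def surrogate(coeffs, xi):
    # Evaluating the surrogate is just evaluating a polynomial.
    return L.legval(xi, coeffs)

coeffs = pc_coefficients(model, order=6)
xi_test = np.linspace(-1.0, 1.0, 201)
max_err = np.max(np.abs(surrogate(coeffs, xi_test) - model(xi_test)))
print(coeffs[0], max_err)
```

Once the coefficients are known, each evaluation inside an MCMC loop costs a polynomial evaluation instead of a simulator run. The zeroth coefficient is the mean of the model output under the assumed uniform input.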
The TChem toolkit is a software library that enables numerical simulations using complex chemistry and facilitates the analysis of detailed kinetic models. The toolkit provides capabilities for computing thermodynamic properties based on NASA polynomials and species production/consumption rates. It incorporates methods that can selectively modify reaction parameters for sensitivity analysis. The library contains several functions that provide analytically computed Jacobian matrices necessary for the efficient time advancement and analysis of detailed kinetic models.
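To show what a NASA-polynomial property evaluation involves, the sketch below evaluates the widely used 7-coefficient form for heat capacity, enthalpy, and entropy. It does not use or reproduce the TChem API. The choice of the 7-coefficient form, the function name `nasa7`, and the coefficient values are assumptions for illustration, and the coefficients do not describe any real species.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)

# Hypothetical coefficients a1..a7, chosen only so the arithmetic is easy to check.
A_HYPOTHETICAL = (3.5, 1.0e-4, 0.0, 0.0, 0.0, -1000.0, 4.0)

def nasa7(a, T):
    # cp/R = a1 + a2 T + a3 T^2 + a4 T^3 + a5 T^4
    cp_R = a[0] + a[1] * T + a[2] * T**2 + a[3] * T**3 + a[4] * T**4
    # h/(RT) = a1 + a2 T/2 + a3 T^2/3 + a4 T^3/4 + a5 T^4/5 + a6/T
    h_RT = (a[0] + a[1] * T / 2 + a[2] * T**2 / 3 + a[3] * T**3 / 4
            + a[4] * T**4 / 5 + a[5] / T)
    # s/R = a1 ln T + a2 T + a3 T^2/2 + a4 T^3/3 + a5 T^4/4 + a7
    s_R = (a[0] * math.log(T) + a[1] * T + a[2] * T**2 / 2
           + a[3] * T**3 / 3 + a[4] * T**4 / 4 + a[6])
    return cp_R * R, h_RT * R * T, s_R * R

cp, h, s = nasa7(A_HYPOTHETICAL, 1000.0)
print(cp, h, s)  # J/(mol K), J/mol, J/(mol K)
```

The enthalpy polynomial is the temperature integral of the heat capacity polynomial, so a finite-difference derivative of `h` with respect to `T` should reproduce `cp`.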
Uncertainty quantification in complex climate models is challenged by the sparsity of available climate model predictions due to the high computational cost of model runs. Another feature that prevents classical uncertainty analysis from being readily applicable is bifurcative behavior in climate model response with respect to certain input parameters. A typical example is the Atlantic Meridional Overturning Circulation. The predicted maximum overturning stream function exhibits discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. We outline a methodology for uncertainty quantification given discontinuous model response and a limited number of model runs. Our approach is two-fold. First we detect the discontinuity with Bayesian inference, thus obtaining a probabilistic representation of the discontinuity curve shape and location for arbitrarily distributed input parameter values. Then, we construct spectral representations of uncertainty, using Polynomial Chaos (PC) expansions on either side of the discontinuity curve, leading to an averaged-PC representation of the forward model that allows efficient uncertainty quantification. The approach is enabled by a Rosenblatt transformation that maps each side of the discontinuity to regular domains where desirable orthogonality properties for the spectral bases hold. We obtain PC modes by either orthogonal projection or Bayesian inference, and argue for a hybrid approach that targets a balance between the accuracy provided by the orthogonal projection and the flexibility provided by the Bayesian inference - where the latter allows obtaining reasonable expansions without extra forward model runs. The model output, and its associated uncertainty at specific design points, are then computed by taking an ensemble average over PC expansions corresponding to possible realizations of the discontinuity curve. 
The methodology is tested on synthetic examples of discontinuous model data with adjustable sharpness and structure.
SIAM Journal on Scientific Computing
The techniques appear promising for constructing and integrating an automated detect-and-characterize capability for epidemics that works off biosurveillance data and provides information on the particular, ongoing outbreak. Potential uses are in crisis management, planning, and resource allocation. The parameter estimation capability is well suited to providing the input parameters of an agent-based model: index cases, time of infection, and infection rate. Non-communicable diseases are easier to characterize than communicable ones: small anthrax attacks can be characterized well with 7-10 days of post-detection data, plague takes longer, and large attacks are very easy to characterize.
Uncertainty quantification in climate models is challenged by the sparsity of the available climate data due to the high computational cost of the model runs. Another feature that prevents classical uncertainty analyses from being easily applicable is the bifurcative behavior in the climate data with respect to certain parameters. A typical example is the Meridional Overturning Circulation in the Atlantic Ocean. The maximum overturning stream function exhibits a discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. We develop a methodology that performs uncertainty quantification in the presence of limited data that have a discontinuous character. Our approach is two-fold. First, we detect the discontinuity location with Bayesian inference, thus obtaining a probabilistic representation of the discontinuity curve location in the presence of arbitrarily distributed input parameter values. Second, we develop a spectral approach that relies on Polynomial Chaos (PC) expansions on each side of the discontinuity curve, leading to an averaged-PC representation of the forward model that allows efficient uncertainty quantification and propagation. The methodology is tested on synthetic examples of discontinuous data with adjustable sharpness and structure.
Uncertainty quantification in climate models is challenged by the prohibitive cost of a large number of model evaluations for sampling. Another feature that often prevents classical uncertainty analysis from being readily applicable is the bifurcative behavior in the climate data with respect to certain parameters. A typical example is the Meridional Overturning Circulation in the Atlantic Ocean. The maximum overturning stream function exhibits a discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. In order to propagate uncertainties from model parameters to model output we use polynomial chaos (PC) expansions to represent the maximum overturning stream function in terms of the uncertain climate sensitivity and CO2 forcing parameters. Since the spectral methodology assumes a certain degree of smoothness, the presence of discontinuities suggests that separate PC expansions on each side of the discontinuity will lead to more accurate descriptions of the climate model output compared to global PC expansions. We propose a methodology that first finds a probabilistic description of the discontinuity given a number of data points. Assuming the discontinuity curve is a polynomial, the algorithm is based on Bayesian inference of its coefficients. Markov chain Monte Carlo sampling is used to obtain joint distributions for the polynomial coefficients, effectively parameterizing the distribution over all possible discontinuity curves. Next, we apply the Rosenblatt transformation to the irregular parameter domains on each side of the discontinuity. This transformation maps a space of uncertain parameters with specific probability distributions to a space of i.i.d standard random variables where orthogonal projections can be used to obtain PC coefficients. In particular, we use uniform random variables that are compatible with PC expansions based on Legendre polynomials. 
The Rosenblatt transformation and the corresponding PC expansions for the model output on either side of the discontinuity are applied successively for several realizations of the discontinuity curve. The climate model output and its associated uncertainty at specific design points are then computed by taking a quadrature-based integration average over PC expansions corresponding to possible realizations of the discontinuity curve.
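The mapping of an irregular subdomain to a regular one can be sketched for one fixed realization of the discontinuity curve. The example assumes two inputs uniformly distributed on the unit square and a hypothetical straight-line curve y = C0 + C1 x, and it maps the region below the curve to independent uniform variables. The curve coefficients and function names are illustrative and are not taken from the abstract.

```python
import numpy as np

C0, C1 = 0.3, 0.4  # hypothetical discontinuity curve y = C0 + C1 * x

def curve(x):
    return C0 + C1 * x

def rosenblatt_below(x, y):
    # Region below the curve: 0 <= x <= 1, 0 <= y <= curve(x), uniform density.
    # eta1 is the marginal CDF of x, whose density is proportional to curve(x).
    eta1 = (C0 * x + 0.5 * C1 * x**2) / (C0 + 0.5 * C1)
    # eta2 is the conditional CDF of y given x, uniform on [0, curve(x)].
    eta2 = y / curve(x)
    return eta1, eta2

rng = np.random.default_rng(1)
pts = rng.uniform(size=(20000, 2))
below = pts[pts[:, 1] < curve(pts[:, 0])]  # keep samples under the curve

eta1, eta2 = rosenblatt_below(below[:, 0], below[:, 1])
# Rescale to [-1, 1], the natural domain of Legendre polynomials.
xi1, xi2 = 2.0 * eta1 - 1.0, 2.0 * eta2 - 1.0
print(eta1.mean(), eta2.mean(), np.corrcoef(eta1, eta2)[0, 1])
```

After the transformation the samples fill a square with approximately independent uniform coordinates, which is the setting where Legendre-based orthogonal projection applies. The region above the curve can be handled the same way with its own conditional distributions.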
Conventional methods for uncertainty quantification are generally challenged in the 'tails' of probability distributions. This is specifically an issue for many climate observables, since extensive sampling to obtain reasonable accuracy in tail regions is especially costly in climate models. Moreover, the accuracy of spectral representations of uncertainty is weighted in favor of more probable ranges of the underlying basis variable, which, in conventional bases, does not particularly target tail regions. Therefore, what is ideally desired is a methodology that requires only a limited number of full computational model evaluations while remaining accurate enough in the tail region. To develop such a methodology, we explore the use of surrogate models based on non-intrusive Polynomial Chaos expansions and Galerkin projection. We consider non-conventional and custom basis functions, orthogonal with respect to probability distributions that exhibit fat-tailed regions. We illustrate how the use of non-conventional basis functions, and surrogate model analysis, improves the accuracy of the spectral expansions in the tail regions. Finally, we also demonstrate these methodologies using precipitation data from CCSM simulations.
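One possible way to build a custom basis is sketched below: polynomials made orthonormal with respect to the empirical measure of a sample, obtained from a QR factorization of the Vandermonde matrix. The abstract does not state how its custom bases are constructed, so the QR approach, the Student-t sample standing in for a fat-tailed variable, and the function names are all assumptions.

```python
import numpy as np

def orthonormal_basis(samples, degree):
    # Monomials 1, x, ..., x^degree evaluated at the samples.
    V = np.vander(samples, degree + 1, increasing=True)
    # V / sqrt(N) = Q R, so the columns of V R^{-1} are orthonormal
    # under the empirical measure (1/N) * sum over samples.
    _, Rm = np.linalg.qr(V / np.sqrt(samples.size))
    return np.linalg.inv(Rm)  # column k holds monomial coefficients of phi_k

def evaluate(C, x):
    # Evaluate all basis polynomials at the points x.
    V = np.vander(np.atleast_1d(x), C.shape[0], increasing=True)
    return V @ C

rng = np.random.default_rng(2)
samples = rng.standard_t(df=10, size=5000)  # assumed fat-tailed stand-in

C = orthonormal_basis(samples, degree=3)
Phi = evaluate(C, samples)
gram = Phi.T @ Phi / samples.size  # should be close to the identity
print(np.round(gram, 6))
```

Because orthogonality is enforced against the sampled distribution itself, the basis adapts to where the probability mass lies, including the tails, rather than to a standard Gaussian or uniform weight.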
Discontinuity detection is an important component in many fields, including image recognition, digital signal processing, and climate change research. Current methods have several shortcomings: they are restricted to one- or two-dimensional settings, they require uniformly spaced and/or dense input data, and they give deterministic answers without quantifying the uncertainty. Spectral methods for uncertainty quantification with global, smooth bases are challenged by discontinuities in model simulation results. Domain decomposition reduces the impact of nonlinearities and discontinuities. However, while they gain smoothness in each subdomain, current domain refinement methods require prohibitively many simulations. Therefore, detecting discontinuities up front and refining accordingly provides a large improvement over current methodologies.
Results show that a time-series-based classification may be possible. For the test cases considered, the correct model can be selected and the number of index cases can be captured within ±σ with 5-10 days of data. The low signal-to-noise ratio makes classification difficult for small epidemics. The problem statement is: (1) create Bayesian techniques to classify and characterize epidemics from a time series of ICD-9 codes (called a 'morbidity stream' here); and (2) assuming the morbidity stream has already set off an alarm (through a Kalman filter anomaly detector), start with a set of putative diseases, identify which disease or set of diseases fits the data best, and infer associated information about it, i.e., the number of index cases, the start time of the epidemic, the spread rate, etc.
Uncertainty quantification in climate models is challenged by the sparsity of the available climate data due to the high computational cost of the model runs. Another feature that prevents classical uncertainty analyses from being easily applicable is the bifurcative behavior in the climate data with respect to certain parameters. A typical example is the Meridional Overturning Circulation in the Atlantic Ocean. The maximum overturning stream function exhibits discontinuity across a curve in the space of two uncertain parameters, namely climate sensitivity and CO2 forcing. We develop a methodology that performs uncertainty quantification in this context in the presence of limited data.