Publications

Entropy and its Relationship with Statistics

Lehoucq, Richard B.; Mayer, Carolyn D.; Tucker, James D.

The purpose of our report is to discuss the notion of entropy and its relationship with statistics. Our goal is to provide a way to think about entropy, its central role within information theory, and its relationship with statistics. We review various relationships between information theory and statistics—nearly all are well known but unfortunately are often not recognized. Entropy quantifies the "average amount of surprise" in a random variable and lies at the heart of information theory, which studies the transmission, processing, extraction, and utilization of information. For us, data is information. What is the distinction between information theory and statistics? Information theorists work with probability distributions, whereas statisticians work with samples. In so many words, information theory applied to samples is the practice of statistics. Acknowledgements. We thank Danny Dunlavy, Carlos Llosa, Oscar Lopez, Arvind Prasadan, Gary Saavedra, and Jeremy Wendt for helpful discussions along the way. Our report was supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
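
The "average amount of surprise" has a precise form. For a discrete random variable $X$ with probability mass function $p$, the Shannon entropy is the standard quantity

```latex
H(X) = -\sum_{x} p(x)\,\log p(x) = \mathbb{E}\bigl[-\log p(X)\bigr],
```

where $-\log p(x)$ is the surprise of observing outcome $x$, so $H(X)$ is the expected surprise. (This is the textbook definition; the report's notation may differ.)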

A Model of Narrative Reinforcement on a Dual-Layer Social Network

Emery, Benjamin F.; Ting, Christina T.; Gearhart, Jared L.; Tucker, James D.

Widespread integration of social media into daily life has fundamentally changed the way society communicates, and, as a result, how individuals develop attitudes, personal philosophies, and worldviews. The excess spread of disinformation and misinformation due to this increased connectedness and streamlined communication has been extensively studied, simulated, and modeled. Less studied is the interaction of many pieces of misinformation, and the resulting formation of attitudes. We develop a framework for the simulation of attitude formation based on exposure to multiple cognitions. We allow a set of cognitions with some implicit relational topology to spread on a social network, which is defined with separate layers to specify online and offline relationships. An individual's opinion on each cognition is determined by a process inspired by the Ising model for ferromagnetism. We conduct experiments using this framework to test the effect of topology, connectedness, and social media adoption on the ultimate prevalence of and exposure to certain attitudes.
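
As a rough illustration of the kind of update rule the abstract describes, the sketch below applies a Glauber-style Ising update to agents' binary opinions on a two-layer network. The layer topologies, coupling parameter, and update rule are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

n_agents, n_cognitions = 100, 5
beta = 1.5  # inverse "social temperature" (assumed parameter)

# Two adjacency layers: offline and online relationships (random for illustration).
offline = rng.random((n_agents, n_agents)) < 0.05
online = rng.random((n_agents, n_agents)) < 0.10
np.fill_diagonal(offline, False)
np.fill_diagonal(online, False)

# Each agent holds a +1/-1 opinion on each cognition.
opinions = rng.choice([-1, 1], size=(n_agents, n_cognitions))

def update(agent, cognition):
    """Glauber-style update: align with neighbors' opinions with a
    probability set by the local field, as in the Ising model."""
    neighbors = offline[agent] | online[agent]
    field = opinions[neighbors, cognition].sum()
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
    opinions[agent, cognition] = 1 if rng.random() < p_up else -1

for _ in range(10_000):
    update(rng.integers(n_agents), rng.integers(n_cognitions))
```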

A Framework for Inverse Prediction Using Functional Response Data

Journal of Computing and Information Science in Engineering

Ries, Daniel R.; Zhang, Adah S.; Tucker, James D.; Shuler, Kurtis; Ausdemore, Madeline A.

Inverse prediction models have commonly been developed to handle scalar data from physical experiments. However, it is not uncommon for data to be collected in functional form. When data are collected in functional form, they must be aggregated to fit the form of traditional methods, which often results in a loss of information. For expensive experiments, this loss of information can be costly. In this study, we introduce the functional inverse prediction (FIP) framework, a general approach which uses the full information in functional response data to provide inverse predictions with probabilistic prediction uncertainties obtained with the bootstrap. The FIP framework is a general methodology that can be modified by practitioners to accommodate many different applications and types of data. We demonstrate the framework, highlighting points of flexibility, with a simulation example and applications to weather data and to nuclear forensics. Results show how functional models can improve the accuracy and precision of predictions.
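
A minimal sketch of the inverse-prediction-with-bootstrap idea, using a toy pointwise-linear forward model. The actual FIP framework is more general; the data, model, and bounds here are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Illustrative training data: scalar input x, functional response y(t).
t = np.linspace(0, 1, 50)
x_train = rng.uniform(0, 2, 30)
y_train = np.sin(2 * np.pi * t) * x_train[:, None] + rng.normal(0, 0.1, (30, t.size))

def fit_forward(x, Y):
    """Pointwise linear forward model y(t) = a(t) + b(t) x."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef  # shape (2, len(t))

def invert(coef, y_new):
    """Inverse prediction: the x minimizing the L2 distance to y_new."""
    obj = lambda x: np.sum((coef[0] + coef[1] * x - y_new) ** 2)
    return minimize_scalar(obj, bounds=(0, 2), method="bounded").x

y_new = np.sin(2 * np.pi * t) * 1.3  # new curve whose x is unknown (truth: 1.3)

# Bootstrap the training set to get a distribution of inverse predictions.
boot = [invert(fit_forward(x_train[idx], y_train[idx]), y_new)
        for idx in (rng.integers(0, 30, 30) for _ in range(200))]
lo, hi = np.percentile(boot, [2.5, 97.5])  # probabilistic prediction interval
```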

Multimodal Bayesian registration of noisy functions using Hamiltonian Monte Carlo

Computational Statistics and Data Analysis

Tucker, James D.; Shand, Lyndsay S.; Chowdhary, Kenny

Functional data registration is a necessary processing step for many applications. The observed data can be inherently noisy, often due to measurement error or natural process uncertainty, which most functional alignment methods cannot handle. A pair of functions can also have multiple optimal alignment solutions, which is not addressed in the current literature. In this paper, a flexible Bayesian approach to functional alignment is presented that appropriately accounts for noise in the data without any pre-smoothing required. Additionally, by running parallel MCMC chains, the method can account for multiple optimal alignments via the multimodal posterior distribution of the warping functions. To sample the warping functions most efficiently, the approach relies on a modification of standard Hamiltonian Monte Carlo that is well-defined on the infinite-dimensional Hilbert space. This flexible Bayesian alignment method is applied to both simulated and real data sets to show its efficiency in handling noisy functions and successfully accounting for multiple optimal alignments in the posterior, thereby characterizing the uncertainty surrounding the warping functions.
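
For context, pairwise elastic alignment is typically posed through the square-root slope representation; a common form of the registration objective (the Bayesian model here builds a posterior over the warping around this kind of data term, so this is background, not the paper's exact likelihood) is

```latex
q_i(t) = \operatorname{sign}\bigl(\dot{f}_i(t)\bigr)\sqrt{\lvert \dot{f}_i(t) \rvert},
\qquad
\gamma^{*} = \operatorname*{arg\,min}_{\gamma \in \Gamma}
  \bigl\| q_1 - (q_2 \circ \gamma)\sqrt{\dot{\gamma}} \bigr\|_2,
```

where $\Gamma$ is the set of boundary-preserving diffeomorphic warpings of $[0,1]$. Multiple minimizers of this objective correspond to the multiple modes of the posterior discussed above.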

A Projected Network Model of Online Disinformation Cascades

Emery, Benjamin F.; Ting, Christina T.; Johnson, Nicholas J.; Tucker, James D.

Within the past half-decade, it has become overwhelmingly clear that suppressing the spread of deliberately false and misleading information is of the utmost importance for protecting democratic institutions. Disinformation has been found to come from both foreign and domestic actors, and the effects from either can be disastrous. From the simple encouragement of unwarranted distrust to conspiracy theories promoting violence, the results of disinformation have put the functionality of American democracy under direct threat. Present scientific challenges posed by this problem include detecting disinformation, quantifying its potential impact, and preventing its amplification. We present a model on which we can experiment with possible strategies toward the third challenge: the prevention of amplification. This is a social contagion network model, decomposed into layers to represent physical, "offline" interactions as well as virtual interactions on a social media platform. Along with the topological modifications to the standard contagion model, we use state-transition rules designed specifically for disinformation and distinguish between contagious and non-contagious infected nodes. We use this framework to explore the effect of grassroots social movements on the size of disinformation cascades by simulating these cascades in scenarios where a proportion of the agents remove themselves from the social platform. We also test the efficacy of strategies that could be implemented at the administrative level by the online platform to minimize such spread. These top-down strategies include banning agents who disseminate false information, or providing corrective information to individuals exposed to false information to decrease their probability of believing it. We find an abrupt transition to smaller cascades when a critical number of random agents are removed from the platform, as well as steady decreases in the size of cascades with increasingly convincing corrective information. Finally, we compare simulated cascades on this framework with real cascades of disinformation recorded on WhatsApp surrounding the 2019 Indian election. We find a set of hyperparameter values that produces a distribution of cascades matching the scaling exponent of the distribution of actual cascades recorded in the dataset. We outline future directions for improving the performance of the framework and validation methods, as well as ways to extend the model to capture additional features of social contagion.
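
The sketch below is a bare-bones version of the kind of two-layer cascade experiment described: an independent-cascade spread in which a subset of agents has left the online platform. Topologies, transmission probabilities, and rules are illustrative assumptions, not the paper's state-transition model.

```python
import random
import networkx as nx

random.seed(0)

# Two layers over the same agents: offline contacts and an online platform.
n = 500
offline = nx.watts_strogatz_graph(n, 4, 0.1)
online = nx.barabasi_albert_graph(n, 3)

p_online, p_offline = 0.05, 0.02            # assumed transmission probabilities
removed = set(random.sample(range(n), 50))  # agents who left the platform

def cascade(seed):
    """Independent-cascade spread; online edges are inactive for removed agents."""
    believed, frontier = {seed}, [seed]
    while frontier:
        nxt = []
        for u in frontier:
            for layer, p in ((offline, p_offline), (online, p_online)):
                if layer is online and u in removed:
                    continue  # removed agents neither send nor receive online
                for v in layer.neighbors(u):
                    if v not in believed and (layer is offline or v not in removed) \
                            and random.random() < p:
                        believed.add(v)
                        nxt.append(v)
        frontier = nxt
    return len(believed)

sizes = [cascade(random.randrange(n)) for _ in range(100)]
```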

Regression models using shapes of functions as predictors

Computational Statistics and Data Analysis

Ahn, Kyungmin; Tucker, James D.; Wu, Wei; Srivastava, Anuj

Functional variables are often used as predictors in regression problems. A commonly used parametric approach, called scalar-on-function regression, uses the L2 inner product to map functional predictors into scalar responses. This method can perform poorly when predictor functions contain undesired phase variability, causing phases to have disproportionately large influence on the response variable. One past solution has been to perform phase–amplitude separation (as a pre-processing step) and then use only the amplitudes in the regression model. Here we propose a more integrated approach, termed elastic functional regression model (EFRM), where phase-separation is performed inside the regression model, rather than as a pre-processing step. This approach generalizes the notion of phase in functional data, and is based on the norm-preserving time warping of predictors. Due to its invariance properties, this representation provides robustness to predictor phase variability and results in improved predictions of the response variable over traditional models. We demonstrate this framework using a number of datasets involving gait signals, NMR data, and stock market prices.
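
The scalar-on-function model referenced here is the classical one,

```latex
y_i = \alpha + \int x_i(t)\,\beta(t)\,dt + \epsilon_i,
```

so the response is driven by the L2 inner product of the predictor function with a coefficient function $\beta$. Random warping of $x_i$ perturbs that inner product, which is the phase sensitivity the EFRM is designed to remove.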

Tolerance Bound Calculation for Compact Model Calibration Using Functional Data Analysis

4th Electron Devices Technology and Manufacturing Conference, EDTM 2020 - Proceedings

Reza, Shahed R.; Martin, Nevin S.; Buchheit, Thomas E.; Tucker, James D.

Measurements performed on a population of electronic devices reveal part-to-part variation due to manufacturing process variation. Corner models are a useful tool for designers to bound the effect of this variation on circuit performance. To accurately simulate circuit-level behavior, compact model parameters for devices within a circuit must be calibrated to experimental data. However, determining the bounding data for corner model calibration is difficult, primarily because available tolerance bound calculation methods only consider variability along one dimension and do not adequately consider the variabilities across both the current and voltage axes. This paper demonstrates a novel functional data analysis approach that generates tolerance bounds for these two types of variability separately; these bounds are then transformed for use in corner model calibration.
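
For reference, the classical one-dimensional tolerance bound that the functional approach generalizes can be computed with Howe's approximation to the two-sided normal tolerance factor. A minimal sketch; the data and variable names are assumptions:

```python
import numpy as np
from scipy import stats

def tolerance_factor(n, coverage=0.99, conf=0.95):
    """Howe's approximation: mean +/- k * sd covers `coverage` of the
    population with confidence `conf`, for a normal sample of size n."""
    df = n - 1
    z = stats.norm.ppf(0.5 + coverage / 2.0)
    chi2 = stats.chi2.ppf(1.0 - conf, df)
    return z * np.sqrt(df * (1.0 + 1.0 / n) / chi2)

x = np.random.default_rng(2).normal(10.0, 0.5, size=25)  # e.g. device currents
k = tolerance_factor(len(x))
bounds = (x.mean() - k * x.std(ddof=1), x.mean() + k * x.std(ddof=1))
```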

A geometric approach for computing tolerance bounds for elastic functional data

Journal of Applied Statistics

Tucker, James D.; Lewis, John R.; King, Caleb; Kurtek, Sebastian

We develop a method for constructing tolerance bounds for functional data with random warping variability. In particular, we define a generative, probabilistic model for the amplitude and phase components of such observations, which parsimoniously characterizes variability in the baseline data. Based on the proposed model, we define two different types of tolerance bounds that are able to measure both types of variability and, as a result, identify when the data have gone beyond the bounds of amplitude and/or phase. The first functional tolerance bounds are computed via a bootstrap procedure on the geometric space of amplitude and phase functions. The second functional tolerance bounds utilize functional principal component analysis to construct a tolerance factor. This work is motivated by two main applications: process control and disease monitoring. The problem of statistical analysis and modeling of functional data in process control is important in determining when a production process has moved beyond a baseline. Similarly, in biomedical applications, doctors use long, approximately periodic signals (such as the electrocardiogram) to diagnose and monitor diseases. In this context, it is desirable to identify abnormalities in these signals. We additionally consider a simulated example to assess our approach and compare it to two existing methods.
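
A stripped-down sketch of the bootstrap flavor of such bounds on a common grid. The actual method operates on the geometric space of amplitude and phase functions after elastic separation, which this toy version omits; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative baseline sample: 40 curves on a common grid.
t = np.linspace(0, 1, 100)
F = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, (40, t.size))

# Bootstrap pointwise tolerance band: resample curves, track extreme quantiles.
B = 500
lowers, uppers = [], []
for _ in range(B):
    sample = F[rng.integers(0, len(F), len(F))]
    lowers.append(np.quantile(sample, 0.025, axis=0))
    uppers.append(np.quantile(sample, 0.975, axis=0))

# Outer envelopes of the bootstrap quantiles give a conservative band.
lower_band = np.quantile(lowers, 0.05, axis=0)
upper_band = np.quantile(uppers, 0.95, axis=0)
```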

Bounding uncertainty in functional data: A case study

Quality Engineering

King, Caleb; Martin, Nevin; Tucker, James D.

Functional data are fast becoming a preeminent source of information across a wide range of industries. A particularly challenging aspect of functional data is bounding uncertainty. In this unique case study, we present our attempts at creating bounding functions for selected applications at Sandia National Laboratories (SNL). The first attempt involved a simple extension of functional principal component analysis (fPCA) to incorporate covariates. Though this method was straightforward, the extension was plagued by poor coverage accuracy for the bounding curve. This led to a second attempt utilizing elastic methodology which yielded more accurate coverage at the cost of more complexity.
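
On a discretized grid, the fPCA step mentioned here reduces to ordinary PCA of the curve matrix. A minimal sketch with synthetic data, omitting the covariate extension the case study describes:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 100)
F = np.cos(2 * np.pi * t) * rng.uniform(0.5, 1.5, (30, 1)) \
    + rng.normal(0, 0.05, (30, t.size))

mean = F.mean(axis=0)
U, s, Vt = np.linalg.svd(F - mean, full_matrices=False)

# Principal component functions and per-curve scores.
n_pc = 2
pcs = Vt[:n_pc]                # eigenfunctions (discretized)
scores = (F - mean) @ pcs.T    # coefficients of each curve
explained = s[:n_pc] ** 2 / (s ** 2).sum()
```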

Computer Model Calibration Based on Image Warping Metrics: An Application for Sea Ice Deformation

Journal of Agricultural, Biological, and Environmental Statistics

Guan, Yawen; Sampson, Christian; Tucker, James D.; Chang, Won; Mondal, Anirban; Haran, Murali; Sulsky, Deborah

Arctic sea ice plays an important role in the global climate. Sea ice models governed by physical equations have been used to simulate the state of the ice, including characteristics such as ice thickness, concentration, and motion. More recent models also attempt to capture features such as fractures or leads in the ice. These simulated features can be partially misaligned or misshapen when compared to observational data, whether due to numerical approximation or incomplete physics. In order to make realistic forecasts and improve understanding of the underlying processes, it is necessary to calibrate the numerical model to field data. Traditional calibration methods based on generalized least-squares metrics are flawed for linear features such as sea ice cracks. We develop a statistical emulation and calibration framework that accounts for feature misalignment and misshapenness, which involves optimally aligning model output with observed features using cutting-edge image registration techniques. This work can also apply to other physical models that produce coherent structures. Supplementary materials accompanying this paper appear online.

Elastic functional principal component regression

Statistical Analysis and Data Mining

Tucker, James D.; Lewis, John R.; Srivastava, Anuj

We study regression using functional predictors in situations where these functions contain both phase and amplitude variability. In other words, the functions are misaligned due to errors in time measurements, and these errors can significantly degrade both model estimation and prediction performance. Current techniques either ignore the phase variability or handle it via preprocessing, that is, use an off-the-shelf technique for functional alignment and phase removal. We develop a functional principal component regression model that handles phase and amplitude variability in a unified manner. The model utilizes a mathematical representation of the data known as the square-root slope function. These functions preserve the L2 norm under warping and are ideally suited for simultaneous estimation of regression and warping parameters. Using both simulated and real-world data sets, we demonstrate our approach and evaluate its prediction performance relative to current models. In addition, we propose an extension to functional logistic and multinomial logistic regression.
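
The square-root slope function named here has, in its standard form,

```latex
q(t) = \operatorname{sign}\bigl(\dot{f}(t)\bigr)\sqrt{\lvert \dot{f}(t) \rvert},
\qquad
\bigl\| (q \circ \gamma)\sqrt{\dot{\gamma}} \bigr\|_2 = \| q \|_2,
```

and the identity on the right is the norm preservation under warping $f \mapsto f \circ \gamma$ that makes simultaneous estimation of regression and warping parameters well-behaved.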

Statistically normalized coherent change detection for synthetic aperture sonar imagery

Proceedings of SPIE - The International Society for Optical Engineering

G-Michael, Tesfaye; Tucker, James D.; Roberts, Rodney G.

Coherent Change Detection (CCD) highlights areas of activity in a scene (e.g., the seafloor) under survey and is generated from pairs of synthetic aperture sonar (SAS) images of approximately the same location observed at two different times. The problem of CCD and subsequent anomaly feature extraction/detection is complicated by several factors, such as the presence of random speckle patterns in the images, changing environmental conditions, and platform instabilities. These complications make the detection of weak target activity even more difficult. Typically, the degree of similarity between two images, measured at each pixel location, is the coherence between the complex pixel values in the two images. Higher coherence indicates little change in the scene represented by the pixel, and lower coherence indicates change activity. Such a coherence estimation scheme based on pixel intensity correlation is an ad hoc procedure in which the effectiveness of the change detection is determined by the choice of threshold, which can lead to high false-alarm rates. In this paper, we propose a novel approach to anomalous change pattern detection using statistically normalized coherence and multi-pass coherent processing. This method may be used to mitigate shadows by reducing the false alarms in the coherence map due to speckle and shadows. Test results of the proposed methods on a data set of SAS images are presented, illustrating the effectiveness of the normalized coherence in terms of statistics from multi-pass surveys of the same scene.
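
The classical coherence estimate that this work normalizes is computed over a local window $W$ of the co-registered complex images $f$ and $g$:

```latex
\hat{\gamma} = \frac{\Bigl|\sum_{(k,l)\in W} f(k,l)\, g^{*}(k,l)\Bigr|}
{\sqrt{\sum_{(k,l)\in W} \lvert f(k,l)\rvert^{2} \;\sum_{(k,l)\in W} \lvert g(k,l)\rvert^{2}}},
```

with values near 1 indicating an unchanged scene and low values indicating change. The "statistically normalized" variant proposed in the paper modifies this standard estimate; its details are the paper's contribution.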

Analysis of signals under compositional noise with applications to SONAR data

IEEE Journal of Oceanic Engineering

Tucker, James D.

In this paper, we consider the problem of denoising and classification of SONAR signals observed under compositional noise, i.e., signals that have been warped randomly along the x-axis. Traditional techniques do not account for such noise and, consequently, cannot provide a robust classification of signals. We apply a recent framework that: 1) uses a distance-based objective function for data alignment and noise reduction; and 2) leads to warping-invariant distances between signals for robust clustering and classification. We use this framework to introduce two distances that can be used for signal classification: a) a y-distance, which is the distance between the aligned signals; and b) an x-distance, which measures the amount of warping needed to align the signals. We focus on the task of clustering and classifying objects using acoustic spectra (acoustic color), a task complicated by uncertainties in aspect angles during data collection. Small changes in the aspect angles corrupt signals in a way that amounts to compositional noise. As a result, we demonstrate the use of the developed metrics in the classification of acoustic color data and highlight improvements in signal classification over current methods.
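
In the elastic framework these two distances commonly take the forms

```latex
d_{y}(f_1, f_2) = \inf_{\gamma \in \Gamma}
  \bigl\| q_1 - (q_2 \circ \gamma)\sqrt{\dot{\gamma}} \bigr\|_2,
\qquad
d_{x} = \cos^{-1}\!\Bigl( \int_0^1 \sqrt{\dot{\gamma}^{*}(t)}\, dt \Bigr),
```

where the $q_i$ are square-root slope functions of the signals and $\gamma^{*}$ is the optimal warping. The x-distance is zero when $\gamma^{*}$ is the identity (the integral equals 1) and grows with the amount of warping required; the paper's exact definitions may differ in detail.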
