While the use of machine learning (ML) classifiers is widespread, their output is often not part of any follow-on decision-making process. To illustrate, consider the scenario where we have developed and trained an ML classifier to find malicious URL links. In this scenario, network administrators must decide whether to allow a computer user to visit a particular website, or to instead block access because the site is deemed malicious. It would be very beneficial if decisions such as these could be made automatically using a trained ML classifier. Unfortunately, due to a variety of reasons discussed herein, the output from these classifiers can be uncertain, rendering downstream decisions difficult. Herein, we provide a framework for: (1) quantifying and propagating uncertainty in ML classifiers; (2) formally linking ML outputs with the decision-making process; and (3) making optimal decisions for classification under uncertainty with single or multiple objectives.
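One way to picture the link between a classifier output and a downstream decision is expected-cost minimization. The sketch below is illustrative only and is not the framework of the report: it assumes the classifier returns a single probability that a URL is malicious, and the cost table `COST`, the action names, and the function names are all hypothetical.

```python
# Minimal sketch: pick the action with the lowest expected cost.
# COST[action][true_state] values are hypothetical, chosen only for illustration.
COST = {
    "allow": {"benign": 0.0, "malicious": 10.0},  # allowing a malicious site is expensive
    "block": {"benign": 1.0, "malicious": 0.0},   # blocking a benign site annoys the user
}

def expected_costs(p_malicious, cost):
    """Expected cost of each action given P(malicious) from the classifier."""
    return {
        action: (1.0 - p_malicious) * c["benign"] + p_malicious * c["malicious"]
        for action, c in cost.items()
    }

def best_action(p_malicious, cost):
    """Return the action that minimizes expected cost."""
    costs = expected_costs(p_malicious, cost)
    return min(costs, key=costs.get)

low_risk = best_action(0.05, COST)   # allow: 0.5 expected cost vs. 0.95 for block
high_risk = best_action(0.50, COST)  # block: 0.5 expected cost vs. 5.0 for allow
```

With these assumed costs the decision flips from allow to block once the predicted probability exceeds 1/11, which shows how uncertainty in that probability carries over directly into the decision.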
This report details the results of a three-fold investigation of sensitivity analysis (SA) for machine learning (ML) explainability (MLE): (1) the mathematical assessment of the fidelity of an explanation with respect to a learned ML model, (2) quantifying the trustworthiness of a prediction, and (3) the impact of MLE on the efficiency of end users through multiple user studies. We focused on the cybersecurity domain because its data is inherently non-intuitive. As ML is being used in an increasing number of domains, including domains where being wrong can have high consequences, MLE has been proposed as a means of generating trust in learned ML models by end users. However, little analysis beyond subjective inspection has been performed to determine whether the explanations accurately represent the target model and whether they themselves should be trusted. Current state-of-the-art MLE techniques only provide a list of important features based on heuristic measures and/or make assumptions about the data and the model that are not representative of real-world data and models. Further, most are designed without considering their usefulness to an end user in a broader context. To address these issues, we present a notion of explanation fidelity based on Shapley values from cooperative game theory. We find that all of the investigated MLE methods produce explanations that are incongruent with the ML model being explained. This is because they make critical assumptions about feature independence and linear feature interactions for computational reasons. We also find that, in deployed settings, explanations are rarely used for a variety of reasons, including that several other tools are trusted more than the explanations and that there is little incentive to use the explanations.
In the cases when the explanations are used, we found that there is the danger that explanations persuade the end users to wrongly accept false positives and false negatives. However, ML model developers and maintainers find the explanations more useful to help ensure that the ML model does not have obvious biases. In light of these findings, we suggest a number of future directions including developing MLE methods that directly model non-linear model interactions and including design principles that take into account the usefulness of explanations to the end user. We also augment explanations with a set of trustworthiness measures that measure geometric aspects of the data to determine if the model output should be trusted.
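For readers unfamiliar with the Shapley values mentioned above, the following sketch computes them exactly for a tiny cooperative game. It is not the fidelity measure of the report: `toy_model` is a hypothetical stand-in for a model output evaluated on a subset of present features, and exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(n, value):
    """Exact Shapley values for n players.

    `value` maps a frozenset of player indices to a payoff.
    Cost grows exponentially in n, so this is for small examples only.
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_model(present):
    """Hypothetical model output when only the features in `present` are known."""
    out = 0.0
    if 0 in present:
        out += 1.0            # main effect of feature 0
    if 1 in present:
        out += 2.0            # main effect of feature 1
    if 0 in present and 1 in present:
        out += 4.0            # interaction between features 0 and 1
    return out                # feature 2 never matters

phi = shapley_values(3, toy_model)  # approximately [3.0, 4.0, 0.0]
```

In this toy game the interaction term of 4 is split evenly between features 0 and 1, so the per-feature attributions sum to the full output but do not by themselves reveal that an interaction exists.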
The flexibility of network communication within Internet protocols is fundamental to network function, yet this same flexibility permits the possibility of malicious use. In particular, malicious behavior can masquerade as benign traffic, thus evading systems designed to catch misuse of network resources. However, perfect imitation of benign traffic is difficult, meaning that small unintentional deviations from normal can occur. Identifying these deviations requires that the defenders know what features reveal malicious behavior. Herein, we present an application of compression-based analytics to network communication that can reduce the need for defenders to know a priori what features they need to examine. Motivating the approach is the idea that compression relies on the ability to discover and make use of predictable elements in information, thereby highlighting any deviations between expected and received content. We introduce a so-called 'slice compression' score to identify malicious or anomalous communication in two ways. First, we apply normalized compression distances to classification problems and discuss methods for reducing the noise by excising application content (as opposed to protocol features) using slice compression. Second, we present a new technique for anomaly detection, referred to as slice compression for anomaly detection. A diverse collection of datasets is analyzed to illustrate the efficacy of the proposed approaches. While our focus is network communication, other types of data are also considered to illustrate the generality of the method.
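The normalized compression distance (NCD) used for classification above can be sketched with an off-the-shelf compressor. This is a generic nearest-neighbor illustration, assuming zlib as the compressor; it does not implement the slice compression score, whose definition is not given here, and the labeled samples are made up for the example.

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size in bytes, used as a stand-in for information content."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: small when x and y share structure."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)

def classify(sample: bytes, labeled):
    """Return the label of the labeled example closest to `sample` under NCD."""
    return min(labeled, key=lambda item: ncd(sample, item[1]))[0]

# Hypothetical labeled messages, for illustration only.
LABELED = [
    ("http", b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo\r\n\r\n"),
    ("smtp", b"HELO mail.example.com\r\nMAIL FROM:<a@example.com>\r\nRCPT TO:<b@example.com>\r\n"),
]

query = b"GET /about.html HTTP/1.1\r\nHost: example.org\r\nUser-Agent: demo\r\n\r\n"
label = classify(query, LABELED)
```

No features are hand-picked: the compressor discovers shared substrings on its own, which is the property the abstract relies on. On inputs this short the distances are noisy, so real use would need longer samples or many references per class.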
We present a new method for boundary detection within sequential data using compression-based analytics. Our approach is to approximate the information distance between two adjacent sliding windows within the sequence. Large values in the distance metric are indicative of boundary locations. A new algorithm is developed, referred to as sliding information distance (SLID), that provides a fast, accurate, and robust approximation to the normalized information distance. A modified smoothed z-score algorithm is used to locate peaks in the distance metric, indicating boundary locations. A variety of data sources are considered, including text and audio, to demonstrate the efficacy of our approach.
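The sliding-window idea can be sketched as follows. This is not the SLID algorithm, whose details are not given here: it assumes plain zlib-based normalized compression distance as the approximation to information distance, and it uses a simple maximum instead of the modified smoothed z-score peak detector. The function names and the synthetic sequence are hypothetical.

```python
import zlib

def csize(data: bytes) -> int:
    """Compressed size in bytes."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)

def sliding_distances(seq: bytes, window: int, step: int = 1):
    """Distance between the windows immediately left and right of each position."""
    scores = []
    for pos in range(window, len(seq) - window + 1, step):
        left = seq[pos - window:pos]
        right = seq[pos:pos + window]
        scores.append((pos, ncd(left, right)))
    return scores

# Synthetic sequence: two segments with different content, joined at `boundary`.
segment_a = b"the quick brown fox jumps over the lazy dog. " * 20
segment_b = b"0123456789 9876543210 5550100 4242 " * 26
seq = segment_a + segment_b
boundary = len(segment_a)

scores = sliding_distances(seq, window=200, step=25)
peak_pos = max(scores, key=lambda item: item[1])[0]  # position of the largest distance
```

Positions where both windows fall inside the same segment score low, because the right window is predictable from the left one; positions near `boundary` score high. The window length trades localization against noise and would need tuning per data source.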
We apply transfer learning techniques to create topically and/or stylistically biased natural language models from small data samples, given generic long short-term memory (LSTM) language models trained on larger data sets. Although LSTM language models are powerful tools with wide-ranging applications, they require enormous amounts of data and time to train. Thus, we build general purpose language models that take advantage of large standing corpora and computational resources proactively, allowing us to build more specialized analytical tools from smaller data sets on demand. We show that it is possible to construct a language model from a small, focused corpus by first training an LSTM language model on a large corpus (e.g., the text from English Wikipedia) and then retraining only the internal transition model parameters on the smaller corpus. We also show that a single general language model can be reused through transfer learning to create many distinct special purpose language models quickly with modest amounts of data.
Social network graph models are data structures representing entities (often people, corporations, or accounts) as "vertices" and their interactions as "edges" between pairs of vertices. These graphs are most often total-graph models: the overall structure of edges and vertices in a bidirectional or directional graph is described in global terms and the network is generated algorithmically. We are interested in "egocentric" or "agent-based" models of social networks where the behavior of the individual participants is described and the graph itself is an emergent phenomenon. Our hope is that such graph models will allow us to ultimately reason from observations back to estimated properties of the individuals and populations, and result in not only more accurate algorithms for link prediction and friend recommendation, but also a more intuitive understanding of human behavior in such systems than is revealed by previous approaches. This report documents our preliminary work in this area; we describe several past graph models, two egocentric models of our own design, and our thoughts about the future direction of this research.
Microstructural variabilities are among the predominant sources of uncertainty in structural performance and reliability. We seek to develop efficient algorithms for multiscale calculations for polycrystalline alloys such as aluminum alloy 6061-T6 in environments where ductile fracture is the dominant failure mode. Our approach employs concurrent multiscale methods, but does not focus on their development. They are a necessary but not sufficient ingredient to multiscale reliability predictions. We have focused on how to efficiently use concurrent models for forward propagation because practical applications cannot include fine-scale details throughout the problem domain due to exorbitant computational demand. Our approach begins with a low-fidelity prediction at the engineering scale that is subsequently refined with multiscale simulation. The results presented in this report focus on plasticity and damage at the meso-scale, efforts to expedite Monte Carlo simulation with microstructural considerations, modeling aspects regarding geometric representation of grains and second-phase particles, and contrasting algorithms for scale coupling.
In this work, we approach topic tracking and meme trending in social media with a temporal focus; rather than analyzing topics, we aim to identify time periods whose content differs significantly from normal. We detail two approaches. The first is an information-theoretic analysis of the distributions of terms emitted during each time period. In the second, we cluster the documents from each time period and analyze the tightness of each clustering. We also discuss a method of combining the scores created by each technique, and we provide ample empirical analysis of our methodology on various Twitter datasets.
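As an illustration of the first approach, the sketch below scores a time period by how far its term distribution lies from a baseline period. The abstract does not say which information-theoretic measure is used; Jensen-Shannon divergence is an assumption made here, and the whitespace tokenizer and sample documents are hypothetical.

```python
from collections import Counter
from math import log2

def term_distribution(docs):
    """Relative frequency of each whitespace-delimited term across the documents."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    vocab = set(p) | set(q)
    mix = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mix(a):
        return sum(prob * log2(prob / mix[w]) for w, prob in a.items() if prob > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)

# Made-up documents for three time periods.
baseline = ["coffee this morning", "traffic on the way to work", "lunch with friends"]
similar = ["coffee on the way to work", "lunch this morning with friends"]
unusual = ["earthquake downtown", "earthquake felt downtown just now",
           "did anyone feel the earthquake"]

base_dist = term_distribution(baseline)
score_similar = js_divergence(base_dist, term_distribution(similar))
score_unusual = js_divergence(base_dist, term_distribution(unusual))
```

A period whose vocabulary shifts sharply, like `unusual`, receives a higher score than one that reuses the baseline terms, without any topic having to be named in advance.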
We present a temporal model of individual-scale social media user behavior, comprising modal activity levels and mode switching patterns. We show that this model can be effectively and easily learned from available social media data, and that our model is sufficiently flexible to capture diverse users’ daily activity patterns. In applications such as electric power load prediction, computer network traffic analysis, disease spread modeling, and disease outbreak forecasting, it is useful to have a model of individual-scale patterns of human behavior. Our user model is intended to be suitable for integration into such population models, for future applications of prediction, change detection, or agent-based simulation.
The Data Inferencing on Semantic Graphs project (DISeG) was a two-year investigation of inferencing techniques (focusing on belief propagation) applied to social graphs, with a focus on semantic graphs (also called multi-layer graphs). While working on this problem, we developed a new directed version of inferencing that we call Directed Propagation (Chapters 2 and 4) and identified new semantic graph sampling problems (Chapter 3).
A Bayesian framework is developed for characterizing the unknown parameters of probabilistic models for material properties. In this framework, the unknown parameters are viewed as random and described by their posterior distributions obtained from prior information and measurements of quantities of interest that are observable and depend on the unknown parameters. The proposed Bayesian method is applied to characterize an unknown spatial correlation of the conductivity field in the definition of a stochastic transport equation and to solve this equation by Monte Carlo simulation and stochastic reduced order models (SROMs). The Bayesian method is also employed to characterize unknown parameters of material properties for laser welds from measurements of peak forces sustained by these welds.
The stochastic collocation (SC) and stochastic Galerkin (SG) methods are two well-established and successful approaches for solving general stochastic problems. A recently developed method based on stochastic reduced order models (SROMs) can also be used. Herein we provide a comparison of the three methods for some numerical examples; our evaluation only holds for the examples considered in the paper. The purpose of the comparisons is not to criticize the SC or SG methods, which have proven very useful for a broad range of applications, nor is it to provide overall ratings of these methods as compared to the SROM method. Rather, our objectives are to present the SROM method as an alternative approach to solving stochastic problems and provide information on the computational effort required by the implementation of each method, while simultaneously assessing their performance for a collection of specific problems.
Laser welds are prevalent in complex engineering systems and they frequently govern failure. The weld process often results in partial penetration of the base metals, leaving sharp crack-like features with a high degree of variability in the geometry and material properties of the welded structure. Furthermore, accurate finite element predictions of the structural reliability of components containing laser welds require the analysis of a large number of finite element meshes with very fine spatial resolution, where each mesh has different geometry and/or material properties in the welded region to address variability. We found that traditional modeling approaches could not be efficiently employed. Consequently, a method is presented for constructing a surrogate model, based on stochastic reduced-order models, to represent the laser welds within the component. Here, the uncertainty in weld microstructure and geometry is captured by calibrating plasticity parameters to experimental observations of necking because, given the ductility of the welds, necking (and thus peak load) plays the pivotal role in structural failure. The proposed method is exercised for a simplified verification problem and compared with traditional Monte Carlo simulation, with rather remarkable results.