As machine learning (ML) models are deployed into an ever-diversifying set of application spaces, ranging from self-driving cars to cybersecurity to climate modeling, the need to carefully evaluate model credibility becomes increasingly important. Uncertainty quantification (UQ) provides important information about the ability of a learned model to make sound predictions, often with respect to individual test cases. However, most UQ methods for ML are themselves data-driven and therefore susceptible to the same knowledge gaps as the models themselves. Specifically, UQ helps to identify points near decision boundaries where the models fit the data poorly, yet predictions can score as certain for points that are under-represented by the training data and thus out-of-distribution (OOD). One method for evaluating the quality of both ML models and their associated uncertainty estimates is out-of-distribution detection (OODD). We combine OODD with UQ to provide insights into the reliability of the individual predictions made by an ML model.
A forensics investigation after a breach often uncovers network and host indicators of compromise (IOCs) that can be deployed to sensors to allow early detection of the adversary in the future. Over time, the adversary will change tactics, techniques, and procedures (TTPs), which will also change the data generated. If the IOCs are not kept up-to-date with the adversary's new TTPs, the adversary will no longer be detected once all of the IOCs become invalid. Tracking the Known (TTK) is the problem of keeping IOCs, in this case regular expression (regexes), up-to-date with a dynamic adversary. Our framework solves the TTK problem in an automated, cyclic fashion to bracket a previously discovered adversary. This tracking is accomplished through a data-driven approach of self-adapting a given model based on its own detection capabilities.In our initial experiments, we found that the true positive rate (TPR) of the adaptive solution degrades much less significantly over time than the naïve solution, suggesting that self-updating the model allows the continued detection of positives (i.e., adversaries). The cost for this performance is in the false positive rate (FPR), which increases over time for the adaptive solution, but remains constant for the naïve solution. However, the difference in overall detection performance, as measured by the area under the curve (AUC), between the two methods is negligible. This result suggests that self-updating the model over time should be done in practice to continue to detect known, evolving adversaries.
Rigorous characterization of the performance and generalization ability of cyber defense systems is extremely difficult, making it hard to gauge uncertainty, and thus, confidence. This difficulty largely stems from a lack of labeled attack data that fully explores the potential adversarial space. Currently, performance of cyber defense systems is typically evaluated in a qualitative manner by manually inspecting the results of the system on live data and adjusting as needed. Additionally, machine learning has shown promise in deriving models that automatically learn indicators of compromise that are more robust than analyst-derived detectors. However, to generate these models, most algorithms require large amounts of labeled data (i.e., examples of attacks). Algorithms that do not require annotated data to derive models are similarly at a disadvantage, because labeled data is still necessary when evaluating performance. In this work, we explore the use of temporal generative models to learn cyber attack graph representations and automatically generate data for experimentation and evaluation. Training and evaluating cyber systems and machine learning models requires significant, annotated data, which is typically collected and labeled by hand for one-off experiments. Automatically generating such data helps derive/evaluate detection models and ensures reproducibility of results. Experimentally, we demonstrate the efficacy of generative sequence analysis techniques on learning the structure of attack graphs, based on a realistic example. These derived models can then be used to generate more data. Additionally, we provide a roadmap for future research efforts in this area.
On September 5th and 6th, 2012, the Dynamic Defense Workshop: From Research to Practice brought together researchers from academia, industry, and Sandia with the goals of increasing collaboration between Sandia National Laboratories and external organizations, de ning and un- derstanding dynamic, or moving target, defense concepts and directions, and gaining a greater understanding of the state of the art for dynamic defense. Through the workshop, we broadened and re ned our de nition and understanding, identi ed new approaches to inherent challenges, and de ned principles of dynamic defense. Half of the workshop was devoted to presentations of current state-of-the-art work. Presentation topics included areas such as the failure of current defenses, threats, techniques, goals of dynamic defense, theory, foundations of dynamic defense, future directions and open research questions related to dynamic defense. The remainder of the workshop was discussion, which was broken down into sessions on de ning challenges, applications to host or mobile environments, applications to enterprise network environments, exploring research and operational taxonomies, and determining how to apply scienti c rigor to and investigating the eld of dynamic defense.