Publications

Results 1–50 of 75
Skip to search filters

SAGE Intrusion Detection System: Sensitivity Analysis Guided Explainability for Machine Learning

Smith, Michael R.; Acquesta, Erin A.; Ames, Arlo L.; Carey, Alycia N.; Cueller, Christopher R.; Field, Richard V.; Maxfield, Trevor M.; Mitchell, Scott A.; Morris, Elizabeth S.; Moss, Blake C.; Nyre-Yu, Megan N.; Rushdi, Ahmad R.; Stites, Mallory C.; Smutz, Charles S.; Zhou, Xin Z.

This report details the results of a three-fold investigation of sensitivity analysis (SA) for machine learning (ML) explainability (MLE): (1) the mathematical assessment of the fidelity of an explanation with respect to a learned ML model, (2) quantifying the trustworthiness of a prediction, and (3) the impact of MLE on the efficiency of end-users through multiple users studies. We focused on the cybersecurity domain as the data is inherently non-intuitive. As ML is being using in an increasing number of domains, including domains where being wrong can elicit high consequences, MLE has been proposed as a means of generating trust in a learned ML models by end users. However, little analysis has been performed to determine if the explanations accurately represent the target model and they themselves should be trusted beyond subjective inspection. Current state-of-the-art MLE techniques only provide a list of important features based on heuristic measures and/or make certain assumptions about the data and the model which are not representative of the real-world data and models. Further, most are designed without considering the usefulness by an end-user in a broader context. To address these issues, we present a notion of explanation fidelity based on Shapley values from cooperative game theory. We find that all of the investigated MLE explainability methods produce explanations that are incongruent with the ML model that is being explained. This is because they make critical assumptions about feature independence and linear feature interactions for computational reasons. We also find that in deployed, explanations are rarely used due to a variety of reason including that there are several other tools which are trusted more than the explanations and there is little incentive to use the explanations. In the cases when the explanations are used, we found that there is the danger that explanations persuade the end users to wrongly accept false positives and false negatives. However, ML model developers and maintainers find the explanations more useful to help ensure that the ML model does not have obvious biases. In light of these findings, we suggest a number of future directions including developing MLE methods that directly model non-linear model interactions and including design principles that take into account the usefulness of explanations to the end user. We also augment explanations with a set of trustworthiness measures that measure geometric aspects of the data to determine if the model output should be trusted.

More Details

An Agile Design-to-Simulation Workflow Using a New Conforming Moving Least Squares Method

Koester, Jacob K.; Tupek, Michael R.; Mitchell, Scott A.

This report summarizes the accomplishments and challenges of a two year LDRD effort focused on improving design-to-simulation agility. The central bottleneck in most solid mechanics simulations is the process of taking CAD geometry and creating a discretization of suitable quality, i.e., the "meshine effort. This report revisits meshfree methods and documents some key advancements that allow their use on problems with complex geometries, low quality meshes, nearly incompressible materials or that involve fracture. The resulting capability was demonstrated to be an effective part of an agile simulation process by enabling rapid discretization techniques without increasing the time to obtain a solution of a given accuracy. The first enhancement addressed boundary-related challenges associated with meshfree methods. When using point clouds and Euclidean metrics to construct approximation spaces, the boundary information is lost, which results in low accuracy solutions for non-convex geometries and mate rial interfaces. This also complicates the application of essential boundary conditions. The solution involved the development of conforming window functions which use graph and boundary information to directly incorporate boundaries into the approximation space. The next enhancement was a procedure for producing a quality approximation with a low quality mesh. Unlike, the finite element method, meshfree approximation spaces do not require a mesh. However, meshes can be useful in providing domain boundary information and performing domain integration. A process was developed which aggregates low quality elements to create polyhedra of agreeable quality for domain integration. Stable time increments for transient dynamic simulations were observed to be up to 1000x larger than finite element simulations and solution quality and robustness were vastly superior. Obtaining a solution which is free of nonphysical displacement or pressure oscillations is a challenge for many methods when simulating nearly incompressible materials. Existing nodally integrated meshfree methods suffer from this limitation as well. New techniques were developed that combine B / F methods and the strain smoothing technique used in nodal integration to provide agreeable solutions for problems with nearly incompressible materials. The last major contribution enabled efficient simulations of material fracture with mass conservation. An inter-particle connectivity degradation approach was developed using ideas from peridynamics and cohesive zone modeling to disassociate nodes when fracture conditions are met. The method can, in principal, be applied to any material model with a specified failure criterion. For a mode-I ductile crack propagation problem, the method demonstrates mesh-size independent behavior without the particle instabilities near the fracture surface that are common to other particle methods. Addressing the aforementioned challenges of meshfree methods opens the approach to a broader class of problems and enables an agile simulation development process for problems of interest to Sandia.

More Details

Fast Approximate Union Volume in High Dimensions with Line Samples

Mitchell, Scott A.; Awad, Muhammad A.; Ebeida, Mohamed S.; Swiler, Laura P.

The classical problem of calculating the volume of the union of d-dimensional balls is known as "Union Volume." We present line-sampling approximation algorithms for Union Volume. Our methods may be extended to other Boolean operations, such as setminus; or to other shapes, such as hyper-rectangles. The deterministic, exact approaches for Union Volume do not scale well to high dimensions. However, we adapt several of these exact approaches to approximation algorithms based on sampling. We perform local sampling within each ball using lines. We have several variations, depending on how the overlapping volume is partitioned, and depending on whether radial, axis-aligned, or other line patterns are used. Our variations fall within the family of Monte Carlo sampling, and hence have about the same theoretical convergence rate, 1 /$\sqrt{M}$, where M is the number of samples. In our limited experiments, line-sampling proved more accurate per unit work than point samples, because a line sample provides more information, and the analytic equation for a sphere makes the calculation almost as fast. We performed a limited empirical study of the efficiency of these variations. We suggest a more extensive study for future work. We speculate that different ball arrangements, differentiated by the distribution of overlaps in terms of volume and degree, will benefit the most from patterns of line samples that preferentially capture those overlaps. Acknowledgement We thank Karl Bringman for explaining his BF-ApproxUnion (ApproxUnion) algorithm [3] to us. We thank Josiah Manson for pointing out that spoke darts oversample the center and we might get a better answer by uniform sampling. We thank Vijay Natarajan for suggesting random chord sampling. The authors are grateful to Brian Adams, Keith Dalbey, and Vicente Romero for useful technical discussions. This work was sponsored by the Laboratory Directed Research and Development (LDRD) Program at Sandia National Laboratories. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), Applied Mathematics Program. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.

More Details

VoroCrust illustrated: Theory and challenges

Leibniz International Proceedings in Informatics, LIPIcs

Abdelkader, Ahmed; Bajaj, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

Over the past decade, polyhedral meshing has been gaining popularity as a better alternative to tetrahedral meshing in certain applications. Within the class of polyhedral elements, Voronoi cells are particularly attractive thanks to their special geometric structure. What has been missing so far is a Voronoi mesher that is sufficiently robust to run automatically on complex models. In this video, we illustrate the main ideas behind the VoroCrust algorithm, highlighting both the theoretical guarantees and the practical challenges imposed by realistic inputs.

More Details

Sampling conditions for conforming voronoi meshing by the vorocrust algorithm

Leibniz International Proceedings in Informatics, LIPIcs

Abdelkader, Ahmed; Bajaj, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

We study the problem of decomposing a volume bounded by a smooth surface into a collection of Voronoi cells. Unlike the dual problem of conforming Delaunay meshing, a principled solution to this problem for generic smooth surfaces remained elusive. VoroCrust leverages ideas from α-shapes and the power crust algorithm to produce unweighted Voronoi cells conforming to the surface, yielding the first provably-correct algorithm for this problem. Given an ϵ-sample on the bounding surface, with a weak σ-sparsity condition, we work with the balls of radius δ times the local feature size centered at each sample. The corners of this union of balls are the Voronoi sites, on both sides of the surface. The facets common to cells on opposite sides reconstruct the surface. For appropriate values of ϵ, σ and δ, we prove that the surface reconstruction is isotopic to the bounding surface. With the surface protected, the enclosed volume can be further decomposed into an isotopic volume mesh of fat Voronoi cells by generating a bounded number of sites in its interior. Compared to state-of-the-art methods based on clipping, VoroCrust cells are full Voronoi cells, with convexity and fatness guarantees. Compared to the power crust algorithm, VoroCrust cells are not filtered, are unweighted, and offer greater flexibility in meshing the enclosed volume by either structured grids or random samples.

More Details

Sampling Conditions for Conforming Voronoi Meshing by the VoroCrust Algorithm

LIPIcs-Leibniz International Proceedings in Informatics

Abdelkader, Ahmed A.; Bajaja, Chandrajit L.; Ebeida, Mohamed S.; Mahmoud, Ahmed H.; Mitchell, Scott A.; Owens, John D.; Rushdi, Ahmad A.

© Ahmed Abdelkader, Chandrajit L. Bajaj, Mohamed S. Ebeida, Ahmed H. Mahmoud, Scott A. Mitchell, John D. Owens and Ahmad A. Rushdi; licensed under Creative Commons License CC-BY 34th Symposium on Computational Geometry (SoCG 2018). We study the problem of decomposing a volume bounded by a smooth surface into a collection of Voronoi cells. Unlike the dual problem of conforming Delaunay meshing, a principled solution to this problem for generic smooth surfaces remained elusive. VoroCrust leverages ideas from α-shapes and the power crust algorithm to produce unweighted Voronoi cells conforming to the surface, yielding the first provably-correct algorithm for this problem. Given an ϵ-sample on the bounding surface, with a weak σ-sparsity condition, we work with the balls of radius δ times the local feature size centered at each sample. The corners of this union of balls are the Voronoi sites, on both sides of the surface. The facets common to cells on opposite sides reconstruct the surface. For appropriate values of ϵ, σ and δ, we prove that the surface reconstruction is isotopic to the bounding surface. With the surface protected, the enclosed volume can be further decomposed into an isotopic volume mesh of fat Voronoi cells by generating a bounded number of sites in its interior. Compared to state-of-the-art methods based on clipping, VoroCrust cells are full Voronoi cells, with convexity and fatness guarantees. Compared to the power crust algorithm, VoroCrust cells are not filtered, are unweighted, and offer greater flexibility in meshing the enclosed volume by either structured grids or random samples.

More Details

Footprint placement for mosaic imaging by sampling and optimization

Proceedings International Conference on Automated Planning and Scheduling, ICAPS

Mitchell, Scott A.; Valicka, Christopher G.; Rowe, Stephen R.; Zou, Simon Z.

We consider the problem of selecting a small set (mosaic) of sensor images (footprints) whose union covers a two-dimensional Region Of Interest (ROI) on Earth. We take the approach of modeling the mosaic problem as a Mixed-Integer Linear Program (MILP). This allows solutions to this subproblem to feed into a larger remote-sensor collection-scheduling MILP. This enables the scheduler to dynamically consider alternative mosaics, without having to perform any new geometric computations. Our approach to set up the optimization problem uses maximal disk sampling and point-in-polygon geometric calculations. Footprints may be of any shape, even non-convex, and we show examples using a variety of shapes that may occur in practice. The general integer optimization problem can become computationally expensive for large problems. In practice, the number of placed footprints is within an order of magnitude of ten, making the time to solve to optimality on the order of minutes. This is fast enough to make the approach relevant for near real-time mission applications. We provide open source software for all our methods, "GeoPlace."

More Details

Meshes optimized for discrete exterior calculus (DEC)

Mitchell, Scott A.; Mousley, Sarah C.; Deakin, Michael D.; Knupp, Patrick M.; Mitchell, Scott A.

We study the optimization of an energy function used by the meshing community to measure and improve mesh quality. This energy is non-traditional because it is dependent on both the primal triangulation and its dual Voronoi (power) diagram. The energy is a measure of the mesh's quality for usage in Discrete Exterior Calculus (DEC), a method for numerically solving PDEs. In DEC, the PDE domain is triangulated and this mesh is used to obtain discrete approximations of the continuous operators in the PDE. The energy of a mesh gives an upper bound on the error of the discrete diagonal approximation of the Hodge star operator. In practice, one begins with an initial mesh and then makes adjustments to produce a mesh of lower energy. However, we have discovered several shortcomings in directly optimizing this energy, e.g. its non-convexity, and we show that the search for an optimized mesh may lead to mesh inversion (malformed triangles). We propose a new energy function to address some of these issues.

More Details

Statistical Inference for Porous Materials using Persistent Homology

Moon, Chul M.; Heath, Jason; Mitchell, Scott A.

We propose a porous materials analysis pipeline using persistent homology. We rst compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We t a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistence homology. Thus we provide a capability for making a statistical inference of the uid ow and transport properties of porous materials based on their geometry and connectivity.

More Details

All-quad meshing without cleanup

CAD Computer Aided Design

Rushdi, Ahmad A.; Mitchell, Scott A.; Mahmoud, Ahmed H.; Bajaj, Chandrajit C.; Ebeida, Mohamed S.

We present an all-quad meshing algorithm for general domains. We start with a strongly balanced quadtree. In contrast to snapping the quadtree corners onto the geometric domain boundaries, we move them away from the geometry. Then we intersect the moved grid with the geometry. The resulting polygons are converted into quads with midpoint subdivision. Moving away avoids creating any flat angles, either at a quadtree corner or at a geometry–quadtree intersection. We are able to handle two-sided domains, and more complex topologies than prior methods. The algorithm is provably correct and robust in practice. It is cleanup-free, meaning we have angle and edge length bounds without the use of any pillowing, swapping, or smoothing. Thus, our simple algorithm is fast and predictable. This paper has better quality bounds, and the algorithm is demonstrated over more complex domains, than our prior version.

More Details

POF-Darts: Geometric adaptive sampling for probability of failure

Reliability Engineering and System Safety

Ebeida, Mohamed S.; Mitchell, Scott A.; Swiler, Laura P.; Romero, Vicente J.; Rushdi, Ahmad A.

We introduce a novel technique, POF-Darts, to estimate the Probability Of Failure based on random disk-packing in the uncertain parameter space. POF-Darts uses hyperplane sampling to explore the unexplored part of the uncertain space. We use the function evaluation at a sample point to determine whether it belongs to failure or non-failure regions, and surround it with a protection sphere region to avoid clustering. We decompose the domain into Voronoi cells around the function evaluations as seeds and choose the radius of the protection sphere depending on the local Lipschitz continuity. As sampling proceeds, regions uncovered with spheres will shrink, improving the estimation accuracy. After exhausting the function evaluation budget, we build a surrogate model using the function evaluations associated with the sample points and estimate the probability of failure by exhaustive sampling of that surrogate. In comparison to other similar methods, our algorithm has the advantages of decoupling the sampling step from the surrogate construction one, the ability to reach target POF values with fewer samples, and the capability of estimating the number and locations of disconnected failure regions, not just the POF value. We present various examples to demonstrate the efficiency of our novel approach.

More Details

Dynamic Multi-Sensor Multi-Mission Optimal Planning Tool

Valicka, Christopher G.; Rowe, Stephen R.; Zou, Simon Z.; Mitchell, Scott A.; Irelan, William R.; Pollard, Eric L.; Garcia, Deanna G.; Hackebeil, Gabriel A.; Staid, Andrea S.; Rintoul, Mark D.; Watson, Jean-Paul W.; Hart, William E.; Rathinam, Sivakumar R.; Ntaimo, Lewis N.

Remote sensing systems have firmly established a role in providing immense value to commercial industry, scientific exploration, and national security. Continued maturation of sensing technology has reduced the cost of deploying highly-capable sensors while at the same time increased reliance on the information these sensors can provide. The demand for time on these sensors is unlikely to diminish. Coordination of next-generation sensor systems, larger constellations of satellites, unmanned aerial vehicles, ground telescopes, etc. is prohibitively complex for existing heuristics- based scheduling techniques. The project was a two-year collaboration spanning multiple Sandia centers and included a partnership with Texas A&M University. We have developed algorithms and software for collection scheduling, remote sensor field-of-view pointing models, and bandwidth- constrained prioritization of sensor data. Our approach followed best practices from the operations research and computational geometry communities. These models provide several advantages over state of the art techniques. In particular, our approach is more flexible compared to heuristics that tightly couple models and solution techniques. First, our mixed-integer linear models afford a rig- orous analysis so that sensor planners can quantitatively describe a schedule relative to the best possible. Optimal or near-optimal schedules can be produced with commercial solvers in opera- tional run-times. These models can be modified and extended to incorporate different scheduling and resource constraints and objective function definitions. Further, we have extended these mod- els to proactively schedule sensors under weather and ad hoc collection uncertainty. This approach stands in contrast to existing deterministic schedulers which assume a single future weather or ad hoc collection scenario. The field-of-view pointing algorithm produces a mosaic with the fewest number of images required to fully cover a region of interest. The bandwidth-constrained al- gorithms find the highest priority information that can be transmitted. All of these are based on mixed-integer linear programs so that, in the future, collection scheduling, field-of-view, and band- width prioritization can be combined into a single problem. Experiments conducted using the de- veloped models, commercial solvers, and benchmark datasets have demonstrated that proactively scheduling against uncertainty regularly and significantly outperforms deterministic schedulers. Acknowledgement We would like to acknowledge John T. Feddema, Brian N. Post, John H. Ganter, and Swaroop Darbha for providing critical project stewardship and fruitful remote sensing utilization discus- sions. A special thanks to Mohamed S. Ebeida for his contributions to the development of the Maximal Poisson Sampling technique. We would also like to thank Kaarthik Sundar and Jianglei Qin for their significant scheduling algorithm and model development contributions to the project. The authors would like to acknowledge the Sandia LDRD program for their support of this work. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Cor- poration, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

More Details

Curve Reconstruction with Many Fewer Samples

Computer Graphics Forum

Ohrhallinger, S.; Mitchell, Scott A.; Wimmer, M.

We consider the problem of sampling points from a collection of smooth curves in the plane, such that the Crust family of proximity-based reconstruction algorithms can rebuild the curves. Reconstruction requires a dense sampling of local features, i.e., parts of the curve that are close in Euclidean distance but far apart geodesically. We show that ε < 0.47-sampling is sufficient for our proposed HNN-Crust variant, improving upon the state-of-the-art requirement of ε < -sampling. Thus we may reconstruct curves with many fewer samples. We also present a new sampling scheme that reduces the required density even further than ε < 0.47-sampling. We achieve this by better controlling the spacing between geodesically consecutive points. Our novel sampling condition is based on the reach, the minimum local feature size along intervals between samples. This is mathematically closer to the reconstruction density requirements, particularly near sharp-angled features. We prove lower and upper bounds on reach ρ-sampling density in terms of lfs ε-sampling and demonstrate that we typically reduce the required number of samples for reconstruction by more than half.

More Details

Disk Density Tuning of a Maximal Random Packing

Computer Graphics Forum

Ebeida, Mohamed S.; Rushdi, Ahmad A.; Awad, Muhammad A.; Mahmoud, Ahmed H.; Yan, Dong M.; English, Shawn A.; Owens, John D.; Bajaj, Chandrajit L.; Mitchell, Scott A.

We introduce an algorithmic framework for tuning the spatial density of disks in a maximal random packing, without changing the sizing function or radii of disks. Starting from any maximal random packing such as a Maximal Poisson-disk Sampling (MPS), we iteratively relocate, inject (add), or eject (remove) disks, using a set of three successively more-aggressive local operations. We may achieve a user-defined density, either more dense or more sparse, almost up to the theoretical structured limits. The tuned samples are conflict-free, retain coverage maximality, and, except in the extremes, retain the blue noise randomness properties of the input. We change the density of the packing one disk at a time, maintaining the minimum disk separation distance and the maximum domain coverage distance required of any maximal packing. These properties are local, and we can handle spatially-varying sizing functions. Using fewer points to satisfy a sizing function improves the efficiency of some applications. We apply the framework to improve the quality of meshes, removing non-obtuse angles; and to more accurately model fiber reinforced polymers for elastic and failure simulations.

More Details

Visco-TTI-elastic FWI using discontinuous galerkin

SEG Technical Program Expanded Abstracts

Ober, Curtis C.; Smith, Thomas M.; Overfelt, James R.; Collis, Samuel S.; von Winckel, Gregory J.; van Bloemen Waanders, Bart G.; Downey, Nathan J.; Mitchell, Scott A.; Bond, Stephen D.; Aldridge, David F.; Krebs, Jerome R.

The need to better represent the material properties within the earth's interior has driven the development of higherfidelity physics, e.g., visco-tilted-transversely-isotropic (visco- TTI) elastic media and material interfaces, such as the ocean bottom and salt boundaries. This is especially true for full waveform inversion (FWI), where one would like to reproduce the real-world effects and invert on unprocessed raw data. Here we present a numerical formulation using a Discontinuous Galerkin (DG) finite-element (FE) method, which incorporates the desired high-fidelity physics and material interfaces. To offset the additional costs of this material representation, we include a variety of techniques (e.g., non-conformal meshing, and local polynomial refinement), which reduce the overall costs with little effect on the solution accuracy.

More Details

Efficient Probability of Failure Calculations for QMU using Computational Geometry LDRD 13-0144 Final Report

Mitchell, Scott A.; Ebeida, Mohamed S.; Romero, Vicente J.; Swiler, Laura P.; Rushdi, Ahmad A.; Abdelkader, Ahmad A.

This SAND report summarizes our work on the Sandia National Laboratory LDRD project titled "Efficient Probability of Failure Calculations for QMU using Computational Geometry" which was project #165617 and proposal #13-0144. This report merely summarizes our work. Those interested in the technical details are encouraged to read the full published results, and contact the report authors for the status of the software and follow-on projects.

More Details
Results 1–50 of 75
Results 1–50 of 75