Deep neural networks (DNNs) have achieved state-of-the-art performance across a variety of traditional machine learning tasks, e.g., speech recognition, image classification, and segmentation. The ability of DNNs to efficiently approximate high-dimensional functions has also motivated their use in scientific applications, e.g., to solve partial differential equations and to generate surrogate models. In this paper, we consider the supervised training of DNNs, which arises in many of the above applications. We focus on the central problem of optimizing the weights of the given DNN such that it accurately approximates the relation between observed input and target data. Devising effective solvers for this optimization problem is notoriously challenging due to the large number of weights, nonconvexity, data sparsity, and nontrivial choice of hyperparameters. To solve the optimization problem more efficiently, we propose the use of variable projection (VarPro), a method originally designed for separable nonlinear least-squares problems. Our main contribution is the Gauss--Newton VarPro method (GNvpro), which extends the reach of the VarPro idea to nonquadratic objective functions, most notably cross-entropy loss functions arising in classification. These extensions make GNvpro applicable to all training problems that involve a DNN whose last layer is an affine mapping, which is common in many state-of-the-art architectures. In our four numerical experiments from surrogate modeling, segmentation, and classification, GNvpro solves the optimization problem more efficiently than commonly used stochastic gradient descent (SGD) schemes. Finally, GNvpro finds solutions that generalize well to unseen data points and, in all but one example, generalize better than those found by well-tuned SGD methods.
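To make the separable structure that VarPro exploits concrete, the sketch below writes the training problem for a network whose last layer is affine and shows how the linear weights are eliminated; the notation (feature transformation f_theta, last-layer weights W, loss ell) is generic and not necessarily that of the paper.

```latex
% Separable training problem: theta collects the nonlinear (hidden-layer) weights
% and W the affine last-layer weights; ell is, e.g., a least-squares or
% cross-entropy loss over the training pairs (x_i, y_i).
\min_{\theta, W} \; \Phi(\theta, W) \;=\; \sum_{i=1}^{n} \ell\bigl( W f_{\theta}(x_i), \, y_i \bigr).
% Variable projection eliminates W by solving the inner problem (exactly for
% least squares, iteratively for other convex losses),
\widehat{W}(\theta) \;=\; \operatorname*{arg\,min}_{W} \; \Phi(\theta, W),
% and then optimizes the reduced objective over theta alone, e.g., with a
% Gauss--Newton scheme as in GNvpro:
\min_{\theta} \; \Phi\bigl( \theta, \widehat{W}(\theta) \bigr).
```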
The U.S. Army Research Office (ARO), in partnership with IARPA, is investigating innovative, efficient, and scalable computer architectures that are capable of executing next-generation, large-scale data-analytic applications. These applications are increasingly sparse, unstructured, non-local, and heterogeneous. Under the Advanced Graphic Intelligence Logical computing Environment (AGILE) program, Performer teams will be asked to design computer architectures to meet the future needs of the DoD and the Intelligence Community (IC). This design effort will require flexible, scalable, and detailed simulation to assess the performance, efficiency, and validity of their designs. To support AGILE, Sandia National Labs will provide the AGILE-enhanced Structural Simulation Toolkit (A-SST). This toolkit is a computer architecture simulation framework designed to support fast, parallel, and multi-scale simulation of novel architectures. This document describes the A-SST framework, some of its library of simulation models, and how it may be used by AGILE Performers.
This report provides detailed documentation of the algorithms that were developed and implemented in the Plato software over the course of the Optimization-based Design for Manufacturing LDRD project.
This Laboratory Directed Research and Development project developed and applied closely coupled experimental and computational tools to investigate powder compaction across multiple length scales. The primary motivation for this work is to provide connections between powder feedstock characteristics, processing conditions, and powder pellet properties in the context of powder-based energetic components manufacturing. We have focused our efforts on microcrystalline cellulose, a molecular crystalline surrogate material that is mechanically similar to several energetic materials of interest, but provides several advantages for fundamental investigations. We report extensive experimental characterization ranging in length scale from nanometers to macroscopic, bulk behavior. Experiments included nanoindentation of well-controlled, micron-scale pillar geometries milled into the surface of individual particles, single-particle crushing experiments, in-situ optical and computed tomography imaging of the compaction of multiple particles in different geometries, and bulk powder compaction. To capture the large plastic deformation and fracture of particles in computational models, we have advanced two distinct meshfree Lagrangian simulation techniques: (1) bonded particle methods, which extend existing discrete element method capabilities in the Sandia-developed, open-source LAMMPS code to capture particle deformation and fracture, and (2) extensions of peridynamics for application to mesoscale powder compaction, including a novel material model that includes plasticity and creep. We have demonstrated both methods for simulations of single-particle crushing as well as mesoscale multi-particle compaction, with favorable comparisons to experimental data. We have used small-scale mechanical characterization data to inform material models, and in-situ imaging of mesoscale particle structures to provide initial conditions for simulations. Both mesostructure porosity characteristics and overall stress-strain behavior were found to be in good agreement between simulations and experiments. We have thus demonstrated a novel multi-scale, closely coupled experimental and computational approach to the study of powder compaction. This enables a wide range of possible investigations into feedstock-process-structure relationships in powder-based materials, with immediate applications in energetic component manufacturing, as well as other particle-based components and processes.
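As a toy illustration of the bonded-particle idea described above (pairwise bonds that carry force until a critical strain is exceeded, at which point they break and the material can fracture), here is a minimal Python sketch; it is not the LAMMPS or peridynamics implementation, and all parameters and function names are illustrative.

```python
import numpy as np

def bonded_particle_step(x, v, bonds, k, r0, eps_crit, dt, mass):
    """One explicit time step of a toy bonded-particle model.

    x, v     : (N, 3) arrays of particle positions and velocities
    bonds    : set of (i, j) index pairs that are currently bonded
    k, r0    : bond stiffness and rest length (illustrative scalars)
    eps_crit : critical bond strain; bonds stretched beyond it break
    dt, mass : time step and particle mass
    """
    f = np.zeros_like(x)
    broken = set()
    for (i, j) in bonds:
        d = x[j] - x[i]
        r = np.linalg.norm(d)
        strain = (r - r0) / r0
        if abs(strain) > eps_crit:      # irreversible bond breakage -> fracture
            broken.add((i, j))
            continue
        fij = k * (r - r0) * d / r      # linear spring force along the bond axis
        f[i] += fij
        f[j] -= fij
    bonds -= broken
    v = v + dt * f / mass               # explicit (semi-implicit Euler) update
    x = x + dt * v
    return x, v, bonds
```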
In this paper, we develop a method, which we call OnlineGCP, for computing the Generalized Canonical Polyadic (GCP) tensor decomposition of streaming data. GCP differs from traditional canonical polyadic (CP) tensor decompositions in that it allows arbitrary objective functions to be minimized when fitting the low-rank model. This approach can provide better fits and more interpretable models when the observed tensor data is strongly non-Gaussian. In the streaming case, tensor data is gradually observed over time, and the algorithm must incrementally update a GCP factorization with limited access to prior data. In this work, we extend the GCP formalism to the streaming context: we derive a GCP optimization problem to be solved as new tensor data is observed, formulate a tunable history term to balance reconstruction of recently observed data against data observed in the past, develop a scalable solution strategy based on segregated solves using stochastic gradient descent methods, describe a software implementation that provides performance and portability to contemporary CPU and GPU architectures and integrates with Matlab for enhanced usability, and demonstrate the utility and performance of the approach and software on several synthetic and real tensor data sets.
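To give a sense of the kind of optimization problem involved, the sketch below writes a generic streaming GCP objective with a quadratic history term; this is one plausible form in my own notation, and the paper's actual formulation of the history term and solver may differ.

```latex
% Generic GCP model: the tensor entry x_i is approximated by the low-rank model
% entry m_i({A_k}) built from factor matrices A_1, ..., A_d, under an
% elementwise loss f chosen to match the data (e.g., Poisson, Bernoulli).
% At time t, with Omega_t the indices of newly observed entries, one can fit
\min_{\{A_k\}} \;\; \sum_{i \in \Omega_t} f\bigl( x_i, \, m_i(\{A_k\}) \bigr)
  \;+\; \mu \sum_{k=1}^{d} \bigl\| A_k - A_k^{\mathrm{prev}} \bigr\|_F^2 ,
% where the second term is a history penalty that discourages the factors from
% drifting away from those fitted to past data, and mu trades off recent
% reconstruction against fidelity to history.
```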