Projects

The CCR enterprise is closely tied to the laboratories’ broader set of missions and strategies. We share responsibility within Sandia as stewards of important capabilities for the nation in high-strain-rate physics, scientific visualization, optimization, uncertainty quantification, scalable solvers, inverse methods, and computational materials. We also leverage our core technologies to execute projects through various partnerships, such as Cooperative Research and Development Agreements (CRADAs) and Strategic Partnerships, as well as partnerships with universities.

Advanced Tri-lab Software Environment (ATSE)

The Advanced Tri-lab Software Environment (ATSE) is an effort led by Sandia in partnership with Lawrence Livermore National Laboratory and Los Alamos National Laboratory to build an open, modular, extensible, community-engaged, vendor-adaptable software ecosystem that enables the prototyping of new technologies for improving the ASC computing environment.

Project Website

Departments
Scalable System Software
Contact
Matthew Curry, mlcurry@sandia.gov
Albany

Albany is an implicit, unstructured grid, finite element code for the solution and analysis of partial differential equations. Albany is the main demonstration application of the AgileComponents software development strategy at Sandia. It is a PDE code that strives to be built almost entirely from functionality contained within reusable libraries (such as Trilinos/STK/Dakota/PUMI). Albany plays a large role in demonstrating and maturing the functionality of new libraries, as well as the interfaces and interoperability between them. It also serves to expose gaps in our coverage of capabilities and interface design.

In addition to the component-based code design strategy, Albany also attempts to showcase the concept of Analysis Beyond Simulation, where an application code is developed from the start for a design and analysis mission. All Albany applications are born with the ability to perform sensitivity analysis, stability analysis, optimization, and uncertainty quantification, with clean interfaces for exposing design parameters for manipulation by analysis algorithms.
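
A minimal sketch of what such a parameter-exposure interface can look like, using invented class and parameter names rather than Albany's actual APIs: the application registers named design parameters and returns a response, so an outer analysis driver can compute sensitivities or run optimization and UQ without touching the PDE assembly.

```cpp
#include <cmath>
#include <iostream>
#include <map>
#include <string>

// Hypothetical illustration (not Albany's actual interface): an application
// exposes named design parameters so that an outer analysis driver
// (optimization, UQ, sensitivity analysis) can manipulate them without
// knowing anything about the underlying PDE assembly.
class ParametrizedApp {
public:
  void setParameter(const std::string& name, double value) { p_[name] = value; }
  double getParameter(const std::string& name) const { return p_.at(name); }

  // Stand-in for a forward solve: returns a response the analysis layer
  // can optimize or propagate uncertainty through.
  double solveAndEvaluateResponse() const {
    const double k = p_.at("conductivity");
    const double q = p_.at("source");
    return q / (k + 1.0);  // placeholder physics
  }

private:
  std::map<std::string, double> p_;
};

int main() {
  ParametrizedApp app;
  app.setParameter("conductivity", 2.0);
  app.setParameter("source", 5.0);

  // A crude finite-difference sensitivity, standing in for the embedded
  // sensitivities an analysis-ready code would provide natively.
  const double base = app.solveAndEvaluateResponse();
  const double h = 1.0e-6;
  app.setParameter("conductivity", 2.0 + h);
  const double dRdk = (app.solveAndEvaluateResponse() - base) / h;
  std::cout << "response = " << base << ", dR/dk ~ " << dRdk << "\n";
  return 0;
}
```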

Albany also attempts to be a model for software engineering tools and processes, so that new research codes can adopt the Albany infrastructure as a good starting point. This effort involves a close collaboration with the 1400 SEMS team.

The Albany code base is host to several application projects, notably:

  • LCM (Laboratory for Computational Mechanics) [PI J. Ostien]: A platform for research in finite deformation mechanics, including algorithms for failure and fracture modeling, discretizations, implicit solution algorithms, advanced material models, coupling to particle-based methods, and efficient implementations on new architectures.
  • QCAD (Quantum Computer Aided Design) [PI Nielsen]: A code to aid in the design of quantum dots built in doped silicon devices. QCAD solves the coupled Schrödinger-Poisson system. When wrapped in Dakota, optimal operating conditions can be found.
  • FELIX (Finite Element for Land Ice eXperiments) [PI Salinger]: This application solves variants of nonlinear Stokes flow to simulate the evolution of ice sheets. In particular, we are modeling the Greenland and Antarctic ice sheets to study the effects of climate change, particularly their influence on sea-level rise. FELIX will be linked into ACME.
  • Aeras [PI Spotz]: A component-based approach to atmospheric modeling, where advanced analysis algorithms and design for efficient code on new architectures are built into the code.

In addition, Albany is used as a platform for algorithmic research:

  • FASTMath SciDAC project: We are developing a capability for adaptive mesh refinement within an unstructured grid application, in collaboration with Mark Shephard’s group at the SCOREC center at RPI.
  • Embedded UQ: Research into embedded UQ algorithms led by Eric Phipps often uses Albany as a demonstration platform.
  • Performance Portable Kernels for new architectures: Albany is serving as a research vehicle for programming finite element assembly kernels using the Trilinos/Kokkos programming model and library.
Alegra

The ALEGRA application is targeted at the simulation of high strain rate, magnetohydrodynamic, electromechanical, and high energy density physics phenomena for the U.S. defense and energy programs. Research and development in advanced methods, including code frameworks, large-scale inline meshing, multiscale Lagrangian hydrodynamics, resistive magnetohydrodynamic methods, material interface reconstruction, and code verification and validation, keeps the software on the cutting edge of high performance computing.

COINFLIPS

COINFLIPS – CO-designed Improved Neural Foundations Leveraging Inherent Physics Stochasticity – is a DOE Office of Science Co-Design in Microelectronics project that aims to develop a novel computing paradigm for probabilistic computing that leverages stochastic devices and brain-inspired architecture and algorithm principles.

The brain has proven to be a powerful inspiration for the development of computing architectures in which processing is tightly integrated with memory, communication is event-driven, and analog computation can be performed at scale. These neuromorphic systems increasingly show an ability to improve the efficiency and speed of scientific computing and artificial intelligence applications. We propose that the brain’s ubiquitous stochasticity represents an additional source of inspiration for expanding the reach of neuromorphic computing to probabilistic applications. To date, many efforts exploring probabilistic computing have focused primarily on one scale of the microelectronics stack, such as implementing probabilistic algorithms on deterministic hardware or developing probabilistic devices and circuits with the expectation that they will be leveraged by eventual probabilistic architectures.

In COINFLIPS, we are exploring a co-design vision by which large numbers of devices, such as magnetic tunnel junctions and tunnel diodes, can be operated in a stochastic regime and incorporated into a scalable neuromorphic architecture that can impact a number of probabilistic computing applications, such as Monte Carlo simulations and Bayesian neural networks.  
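
As a purely conceptual illustration of that vision (simulated in software with invented names, not drawn from the project's codebase), a stochastic device such as a biased magnetic tunnel junction can be viewed as a hardware Bernoulli sampler whose cheap random outputs are aggregated by a probabilistic application:

```cpp
#include <iostream>
#include <random>

// Conceptual sketch only: a software stand-in for a stochastic device
// (e.g., a magnetic tunnel junction biased to switch with probability p).
// In the co-designed hardware envisioned here, the sample would come from
// device physics rather than a pseudo-random number generator.
class StochasticBit {
public:
  explicit StochasticBit(double p) : dist_(p) {}
  bool sample(std::mt19937& rng) { return dist_(rng); }
private:
  std::bernoulli_distribution dist_;
};

int main() {
  std::mt19937 rng(42);
  StochasticBit device(0.3);  // device tuned to a 30% switching probability

  // Aggregate many cheap stochastic samples, as a Monte Carlo or Bayesian
  // inference workload might, to estimate the underlying probability.
  const int n = 100000;
  long ones = 0;
  for (int i = 0; i < n; ++i) ones += device.sample(rng) ? 1 : 0;
  std::cout << "estimated p = " << static_cast<double>(ones) / n << "\n";
  return 0;
}
```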

COINFLIPS has partners at New York University, University of Texas at Austin, University of Tennessee, Oak Ridge National Laboratory, and Temple University.

Project Website

Departments
Cognitive & Emerging Computing
Focus Areas
Cognitive Science
Computer Architecture
Contact
Aimone, James Bradley, jbaimon@sandia.gov

Keywords:

  • artificial intelligence
  • neural networks
  • neuromorphic
  • probabilistic
  • stochastic
E3SM – Energy Exascale Earth System Model

E3SM is an unprecedented collaboration among seven National Laboratories, the National Center for Atmospheric Research, four academic institutions and one private-sector company to develop and apply the most complete leading-edge climate and Earth system models to the most challenging and demanding climate change research imperatives. It is the only major national modeling project designed to address U.S. Department of Energy (DOE) mission needs and specifically targets DOE Leadership Computing Facility resources now and in the future, because DOE researchers do not have access to other major climate computing centers. A major motivation for the E3SM project is the coming paradigm shift in computing architectures and their related programming models as capability moves into the Exascale era.  DOE, through its science programs and early adoption of new computing architectures, traditionally leads many scientific communities, including climate and Earth system simulation, through these disruptive changes in computing. 

Software
Kokkos
E3SM
Departments
Computational Mathematics
Focus Areas
Scalable Algorithms
Contact
Taylor, Mark A, mataylo@sandia.gov
ECP Supercontainers

Container computing has revolutionized how many industries and enterprises develop and deploy software and services. Recently, this model has gained traction in the High Performance Computing (HPC) community through enabling technologies including Charliecloud, Shifter, Singularity, and Docker. In line with this trend, container-based computing paradigms have gained significant interest within the DOE/NNSA Exascale Computing Project (ECP). While containers provide greater software flexibility, reliability, ease of deployment, and portability for users, there are still several challenges in this area for Exascale.

The goal of the ECP Supercomputing Containers Project (called Supercontainers) is to use a multi-level approach to accelerate adoption of container technologies for Exascale, ensuring that HPC container runtimes will be scalable, interoperable, and integrated into Exascale supercomputing across DOE. The core components of the Supercontainers project focus on foundational system software research needed to ensure containers can be deployed at scale, enhanced user and developer support for ECP Application Development (AD) and Software Technology (ST) projects looking to utilize containers, validated container runtime interoperability, and both vendor and E6 facilities system software integration with containers.

Contact
Younge, Andrew J, ajyoung@sandia.gov
FASTMath

The FASTMath SciDAC Institute develops and deploys scalable mathematical algorithms and software tools for reliable simulation of complex physical phenomena and collaborates with application scientists to ensure the usefulness and applicability of FASTMath technologies.

FASTMath is a collaboration between Argonne National Laboratory, Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, Massachusetts Institute of Technology, Rensselaer Polytechnic Institute, Sandia National Laboratories, Southern Methodist University, and University of Southern California.  Dr. Esmond Ng, LBNL, leads the FASTMath project.

Hobbes – Extreme-Scale Operating Systems Project

Hobbes was a Sandia-led collaboration between four national laboratories and eight universities supported by the DOE Office of Science Advanced Scientific Computing Research program office. The goal of this three-year project was to deliver an operating system for future extreme-scale parallel computing platforms that will address the major technical challenges of energy efficiency, managing massive parallelism and deep memory hierarchies, and providing resilience in the presence of increasing failures. Our approach was to enable application composition through lightweight virtualization. Application composition is a critical capability that will be the foundation of the way extreme-scale systems must be used in the future. The tighter integration of modeling and simulation capability with analysis and the increasing complexity of application workflows demand more sophisticated machine usage models and new system-level services. Ensemble calculations for uncertainty quantification, large graph analytics, multi-materials and multi-physics applications are just a few examples that are driving the need for these new system software interfaces and mechanisms for managing memory, network, and computational resources. Rather than providing a single unified operating system and runtime system that supports several parallel programming models, Hobbes leveraged lightweight virtualization to provide the flexibility to construct and efficiently execute custom OS/R environments. Hobbes extended previous work on the Kitten lightweight operating system and the Palacios lightweight virtual machine monitor.

Contact
Brightwell, Ronald B. (Ron), rbbrigh@sandia.gov
HPC Resource Allocation

HPC resource allocation consists of a pipeline of methods by which distributed-memory work is assigned to distributed-memory resources to complete that work. This pipeline spans both system and application level software. At the system level, it consists of scheduling and allocation. At the application level, broadly speaking, it consists of discretization (meshing), partitioning, and task mapping. Scheduling, given requests for resources and available resources, decides which request will be assigned resources next or when a request will be assigned resources. When a request is granted, allocation decides which specific resources will be assigned to that request. For the application, HPC resource allocation begins with the discretization and partitioning of the work into a distributed-memory model and ends with the task mapping that matches the allocated resources to the partitioned work. Additionally, network architecture and routing have a strong impact on HPC resource allocation. Each of these problems is solved independently and makes assumptions about how the other problems are solved. We have worked in all of these areas and have recently begun work to combine some of them, in particular allocation and task mapping.  We have used analysis, simulation, and real system experiments in this work. Techniques specific to any particular application have not been considered in this work.
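
As a hedged illustration of the task-mapping step (not the algorithms actually developed in this work), the sketch below greedily places each partitioned task on the free node that minimizes weighted communication distance to already-placed tasks, with nodes simplified to a 1-D line so distance is just the index difference.

```cpp
#include <cstdlib>
#include <iostream>
#include <vector>

// Hypothetical sketch of task mapping: given an allocated set of nodes and a
// communication matrix between partitioned tasks, greedily place each task on
// the free node minimizing communication cost to already-placed neighbors.
int main() {
  const int n = 4;
  // comm[i][j] = communication volume between tasks i and j.
  std::vector<std::vector<double>> comm = {
      {0, 8, 1, 0}, {8, 0, 2, 1}, {1, 2, 0, 9}, {0, 1, 9, 0}};

  std::vector<int> taskToNode(n, -1);
  std::vector<bool> nodeUsed(n, false);

  for (int t = 0; t < n; ++t) {
    int best = -1;
    double bestCost = 1e300;
    for (int node = 0; node < n; ++node) {
      if (nodeUsed[node]) continue;
      double cost = 0.0;
      for (int u = 0; u < t; ++u)  // only already-placed tasks contribute
        cost += comm[t][u] * std::abs(node - taskToNode[u]);
      if (cost < bestCost) { bestCost = cost; best = node; }
    }
    taskToNode[t] = best;
    nodeUsed[best] = true;
  }

  for (int t = 0; t < n; ++t)
    std::cout << "task " << t << " -> node " << taskToNode[t] << "\n";
  return 0;
}
```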

Software
Zoltan
Focus Areas
Scalable Algorithms
System Software
Contact
Leung, Vitus J., vjleung@sandia.gov
Institute for the Design of Advanced Energy Systems (IDAES)

Transforming and decarbonizing the world’s energy systems to make them environmentally sustainable while maintaining high reliability and low cost is a task that requires the very best computational and simulation capabilities to examine a complete range of technology options, ensure that the best choices are made, and support their rapid and effective implementation.

The Institute for Design of Advanced Energy Systems (IDAES) was created to bring the most advanced modeling and optimization capabilities to these challenges. The resulting IDAES integrated platform utilizes the most advanced computational algorithms to enable the design and optimization of complex, interacting energy and process systems, from individual plant components to the entire electrical grid.

IDAES is led by the National Energy Technology Laboratory (NETL) with participants from Lawrence Berkeley National Laboratory (LBNL), Sandia National Laboratories (SNL), Carnegie Mellon University, West Virginia University, University of Notre Dame, and Georgia Institute of Technology.

The IDAES leadership team is:

  • David C. Miller, Technical Director
  • Anthony Burgard, NETL PI
  • Deb Agarwal, LBNL PI
  • John Siirola, SNL PI
Kitten Lightweight Kernel

Kitten is a current-generation lightweight kernel (LWK) compute node operating system designed for large-scale parallel computing systems. Kitten is the latest in a long line of successful LWKs, including SUNMOS, Puma, Cougar, and Catamount. Kitten distinguishes itself from these prior LWKs by providing a Linux-compatible user environment, a more modern and extendable open-source codebase, and a virtual machine monitor capability via Palacios that allows full-featured guest operating systems to be loaded on-demand.

Project Website

Departments
Scalable System Software
Contact
Pedretti, Kevin, ktpedre@sandia.gov
Kokkos

Modern high performance computing (HPC) nodes have diverse and heterogeneous types of cores and memory.  For applications and domain-specific libraries/languages to scale, port, and perform well on these next generation architectures, their on-node algorithms must be re-engineered for thread scalability and performance portability.  The Kokkos programming model and its C++ library implementation helps HPC applications and domain libraries implement intra-node thread-scalable algorithms that are performance portable across diverse manycore architectures such as multicore CPUs, Intel Xeon Phi, NVIDIA GPU, and AMD GPU.

This research, development, and deployment project advances the Kokkos programming model with new intra-node parallel algorithm abstractions, implements these abstractions in the Kokkos library, and supports applications’ and domain libraries’ effective use of Kokkos through consulting and tutorials. The project fosters numerous internal and external collaborations, especially with the ISO/C++ language standard committee to promote Kokkos abstractions into future ISO/C++ language standards. Kokkos is part of the DOE Exascale Computing Project. 
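
A minimal sketch of the programming model, not taken from any particular application: the kernels below are written once against Kokkos views and parallel patterns and compile to whichever backend (Serial, OpenMP, CUDA, HIP, ...) Kokkos was configured with at build time.

```cpp
#include <Kokkos_Core.hpp>
#include <cstdio>

// Minimal sketch of performance-portable code in the Kokkos model: a single
// parallel kernel, written once, runs on the default execution space chosen
// when Kokkos was built.
int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1000000;
    Kokkos::View<double*> x("x", n), y("y", n);

    // Fill and update the views in parallel on the default execution space.
    Kokkos::parallel_for("fill", n, KOKKOS_LAMBDA(const int i) {
      x(i) = 1.0;
      y(i) = 2.0;
    });
    Kokkos::parallel_for("axpy", n, KOKKOS_LAMBDA(const int i) {
      y(i) = 3.0 * x(i) + y(i);
    });

    // Reduce across the same execution space.
    double sum = 0.0;
    Kokkos::parallel_reduce("sum", n,
        KOKKOS_LAMBDA(const int i, double& local) { local += y(i); }, sum);
    std::printf("sum = %f\n", sum);
  }
  Kokkos::finalize();
  return 0;
}
```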

Mesquite

MESQUITE is a linkable software library that applies a variety of node-movement algorithms to improve the quality of a given mesh and/or adapt it. Mesquite uses advanced smoothing and optimization to:

  • Untangle meshes,
  • Provide local size control,
  • Improve angles, orthogonality, and skew,
  • Increase minimum edge-lengths for increased time-steps,
  • Improve mesh smoothness,
  • Perform anisotropic smoothing,
  • Improve surface meshes, adapt to surface curvature,
  • Improve hybrid meshes (including pyramids & wedges),
  • Smooth meshes with hanging nodes,
  • Maintain quality of moving and/or deforming meshes,
  • Perform ALE rezoning,
  • Improve mesh quality on and near boundaries,
  • Improve transitions across internal boundaries,
  • Align meshes with vector fields, and
  • R-adapt meshes to solutions using error estimates.

Mesquite improves surface or volume meshes that are structured, unstructured, hybrid, or non-conformal. A variety of element types are permitted. Mesquite is designed to be as efficient as possible so that large meshes can be improved.
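
As a rough illustration of the node-movement idea, the sketch below is plain Laplacian smoothing with invented data structures; it is not Mesquite's API, whose algorithms instead optimize explicit quality metrics.

```cpp
#include <array>
#include <cstdio>
#include <vector>

// Illustrative sketch only: the simplest node-movement smoother of the kind
// Mesquite generalizes. Each free (interior) vertex moves to the average of
// its neighbors' positions; boundary vertices stay fixed.
using Point = std::array<double, 2>;

void laplacianSmooth(std::vector<Point>& verts,
                     const std::vector<std::vector<int>>& neighbors,
                     const std::vector<bool>& isFixed, int sweeps) {
  for (int s = 0; s < sweeps; ++s) {
    for (std::size_t v = 0; v < verts.size(); ++v) {
      if (isFixed[v] || neighbors[v].empty()) continue;
      Point avg = {0.0, 0.0};
      for (int nb : neighbors[v]) {
        avg[0] += verts[nb][0];
        avg[1] += verts[nb][1];
      }
      verts[v] = {avg[0] / neighbors[v].size(), avg[1] / neighbors[v].size()};
    }
  }
}

int main() {
  // A distorted interior vertex (index 4) surrounded by four fixed corners.
  std::vector<Point> verts = {{0, 0}, {1, 0}, {1, 1}, {0, 1}, {0.9, 0.1}};
  std::vector<std::vector<int>> neighbors = {{}, {}, {}, {}, {0, 1, 2, 3}};
  std::vector<bool> isFixed = {true, true, true, true, false};
  laplacianSmooth(verts, neighbors, isFixed, 10);
  std::printf("smoothed vertex: (%g, %g)\n", verts[4][0], verts[4][1]);
  return 0;
}
```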

Portals Interconnect API

Portals is an interconnect API intended to allow scalable, high-performance network communication between nodes of a large-scale parallel computing system. Portals is based on a building blocks approach that enables multiple upper-level protocols, such as MPI and SHMEM, to be used simultaneously within a process. This approach also encapsulates important semantics that can be offloaded to a network interface controller (NIC) to optimize performance-critical functionality. Previous generations of Portals have been deployed on large-scale production systems, including the Intel ASCI Red machine and Cray’s SeaStar interconnect for their XT product line. The current generation API is being used to enable advanced NIC architecture research for future extreme-scale systems.
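
The following is a single-process, conceptual model of the matching semantics described above; the types and functions are invented for illustration and are not the Portals API. It shows how a put targeted at a portal table index is matched against posted buffers and delivered without further involvement from the target process's CPU.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <list>
#include <vector>

// Conceptual model only -- not the Portals API. A target posts buffers with
// match bits at a portal table index; an incoming one-sided put is matched
// and delivered, semantics that can be offloaded to the NIC.
struct MatchEntry {
  std::uint64_t matchBits;
  std::vector<char> buffer;
};

struct PortalTableEntry {
  std::list<MatchEntry> matchList;

  // Model of an incoming put: find the first entry whose match bits agree
  // and copy the payload into its buffer.
  bool deliverPut(std::uint64_t matchBits, const char* data, std::size_t len) {
    for (auto& me : matchList) {
      if (me.matchBits == matchBits && me.buffer.size() >= len) {
        std::memcpy(me.buffer.data(), data, len);
        return true;
      }
    }
    return false;  // a real implementation would buffer or drop per policy
  }
};

int main() {
  PortalTableEntry pte;
  pte.matchList.push_back({/*matchBits=*/42, std::vector<char>(64)});

  const char msg[] = "hello";
  bool ok = pte.deliverPut(42, msg, sizeof(msg));  // tag 42 matches
  std::printf("delivered=%d payload=%s\n", ok ? 1 : 0,
              pte.matchList.front().buffer.data());
  return 0;
}
```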

Project Website

Departments
Scalable System Software
Focus Areas
Computer Architecture
Contact
Brightwell, Ronald B. (Ron), rbbrigh@sandia.gov
Power API

Achieving practical exascale supercomputing will require massive increases in energy efficiency. The bulk of this improvement will likely be derived from hardware advances such as improved semiconductor device technologies and tighter integration, hopefully resulting in more energy efficient computer architectures. Still, software will have an important role to play. With every generation of new hardware, more power measurement and control capabilities are exposed. Many of these features require software involvement to maximize feature benefits. This trend will allow algorithm designers to add power and energy efficiency to their optimization criteria. Similarly, at the system level, opportunities now exist for energy-aware scheduling to meet external utility constraints such as time of day cost charging and power ramp rate limitations. Finally, future architectures might not be able to operate all components at full capability for a range of reasons including temperature considerations or power delivery limitations. Software will need to make appropriate choices about how to allocate the available power budget given many, sometimes conflicting considerations. For these reasons, we have developed a portable API for power measurement and control.
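
A hypothetical usage sketch, with invented function names that are not the Power API specification, showing the measurement-and-control pattern such an API is meant to standardize: software reads a node-level power measurement and applies a cap before an energy-intensive phase.

```cpp
#include <cstdio>

// Hypothetical interface for illustration only -- these names are invented
// and are not the Power API specification.
namespace pwr_sketch {
double readNodePowerWatts() { return 412.0; }  // stand-in measurement
void setNodePowerCapWatts(double watts) {      // stand-in control knob
  std::printf("power cap set to %.0f W\n", watts);
}
}  // namespace pwr_sketch

int main() {
  const double budget = 350.0;  // budget handed down by an energy-aware scheduler
  const double current = pwr_sketch::readNodePowerWatts();
  std::printf("node power: %.0f W, budget: %.0f W\n", current, budget);

  // An energy-aware runtime could trade performance for power here, e.g.
  // lowering the cap (and implicitly clock frequency) to stay within budget.
  if (current > budget) pwr_sketch::setNodePowerCapWatts(budget);
  return 0;
}
```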

Project Website

Departments
Scalable System Software
Contact
Ron Brightwell, rbbrigh@sandia.gov
Stitch – IO Library for highly localized simulations

IO libraries typically focus on writing the entire simulation domain for each output. For many computation classes, this is the correct choice. However, there are some cases where this approach is wasteful in time and space.

The Stitch library was developed initially for use with the SPPARKS kinetic Monte Carlo simulation to handle IO tasks for a welding simulation. In this type of simulation, computational activity is concentrated in a small part of the simulation domain while the rest is idle. Writing only the area that changes is therefore far more space efficient than writing the entire simulation domain for each output. Further, the computation can be focused strictly on the area where the data will change rather than the whole domain. This can yield a reduction from 1024 to 16 processes and 1/64th the data written. Combined, these can reduce computation time with no loss in data quality. If anything, by reducing the amount written each time, more output is possible.
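
A rough sketch of the underlying idea, using invented data structures rather than Stitch's API: track the bounding box of cells that changed during a step and write only that subregion plus its offset, so outputs can be stitched back into the full domain later.

```cpp
#include <cstdio>
#include <vector>

// Illustration of the localized-output idea, not Stitch's actual API.
struct Box { int lo[2], hi[2]; };

int main() {
  const int nx = 64, ny = 64;
  std::vector<int> grid(nx * ny, 0), prev(nx * ny, 0);

  // Pretend one step of the simulation only touched a small active region
  // (e.g., near a moving weld pool).
  for (int j = 20; j < 24; ++j)
    for (int i = 30; i < 38; ++i) grid[j * nx + i] = 1;

  // Find the bounding box of the cells that actually changed.
  Box b = {{nx, ny}, {-1, -1}};
  for (int j = 0; j < ny; ++j)
    for (int i = 0; i < nx; ++i)
      if (grid[j * nx + i] != prev[j * nx + i]) {
        if (i < b.lo[0]) b.lo[0] = i;
        if (j < b.lo[1]) b.lo[1] = j;
        if (i > b.hi[0]) b.hi[0] = i;
        if (j > b.hi[1]) b.hi[1] = j;
      }

  const long full = static_cast<long>(nx) * ny;
  const long partial =
      static_cast<long>(b.hi[0] - b.lo[0] + 1) * (b.hi[1] - b.lo[1] + 1);
  std::printf("wrote %ld of %ld cells (offset %d,%d)\n", partial, full,
              b.lo[0], b.lo[1]);
  return 0;
}
```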

This approach is also applicable for finite element codes that share the same localized physics.

The code is in the final stages of copyright review and will be released on github.com. A work-in-progress paper was presented at PDSW-DISCS @ SC18; a full CS conference paper is planned for H1 2019, followed by a materials science journal paper.

Focus Areas
Computational Materials
System Software
Contact
Lofstead, Gerald Fredrick (Jay), gflofst@sandia.gov
Structural Simulation Toolkit (SST)

High performance computing architectures are undergoing a marked transformation. Increasing the performance of the largest parallel machines at the same exponential rate will require that applications expose more parallelism at an accelerated pace, due to the advent of multi-core processors at relatively flat clock rates. The extreme number of hardware components in these machines, along with I/O bottlenecks, will necessitate looking beyond the traditional checkpoint/restart mechanism for dealing with machine failures. Additionally, as a result of the high power requirements of these machines, the energy required to obtain a result will become as important as the time to solution. These changes mean that a new approach to the development of extreme-scale hardware and software is needed, relying on the simultaneous exploration of both the hardware and software design space, a process referred to as co-design.

The Structural Simulation Toolkit (SST) enables co-design of extreme-scale architectures by allowing simulation of diverse aspects of hardware and software relevant to such environments. Innovations in instruction set architecture, memory systems, the network interface, and full system network can be explored in the context of design choices for the programming model and algorithms. The package provides two novel capabilities. The first is a fully modular design that enables extensive exploration of an individual system parameter without the need for intrusive changes to the simulator. The second is a parallel simulation environment based on MPI. This provides a high level of performance and the ability to look at large systems. The framework has been successfully used to model concepts ranging from processing in memory to conventional processors connected by conventional network interfaces and running MPI.
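
As a conceptual illustration only (not the SST component API), the heart of such a simulator is an event-driven loop in which components exchange timestamped events; SST layers modular component loading and MPI-parallel execution on top of this basic pattern.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// Conceptual sketch of component-based discrete-event simulation: the
// simulator always handles the earliest pending event, and component
// handlers schedule follow-on events in response.
struct Event {
  long time;
  int targetComponent;
  bool operator>(const Event& o) const { return time > o.time; }
};

int main() {
  std::priority_queue<Event, std::vector<Event>, std::greater<Event>> queue;
  queue.push({10, 0});  // e.g., a processor issues a memory request
  queue.push({35, 1});  // e.g., the NIC receives a packet

  long now = 0;
  while (!queue.empty()) {
    Event e = queue.top();
    queue.pop();
    now = e.time;
    std::printf("t=%ld: component %d handles event\n", now, e.targetComponent);
    // A component handler would typically schedule follow-on events here,
    // e.g. queue.push({now + latency, otherComponent});
  }
  return 0;
}
```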

Project Website

Contact
Gwen Voskuilen, grvosku@sandia.gov
Trilinos

The Trilinos Project is an effort to develop algorithms and enabling technologies within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific problems. A unique design feature of Trilinos is its focus on packages.

Vanguard

The Vanguard project is expanding the high-performance computing ecosystem by evaluating and accelerating the development of emerging technologies in order to increase their viability for future large-scale production platforms. The goal of the project is to reduce the risk in deploying unproven technologies by identifying gaps in the hardware and software ecosystem and making focused investments to address them. The approach is to make early investments that identify the essential capabilities needed to move technologies from small-scale testbeds to large-scale production use.

Project Website

Departments
Scalable System Software
Focus Areas
Computer Architecture
Contact
Laros, James H., jhlaros@sandia.gov
XPRESS – eXascale Programming Environment and System Software

The XPRESS Project was one of four major projects of the DOE Office of Science Advanced Scientific Computing Research X-stack Program initiated in September 2012. The purpose of XPRESS was to devise an innovative system software stack to enable practical and useful exascale computing around the end of the decade, with near-term contributions to the efficient and scalable operation of trans-petaflops performance systems in the next two to three years, both for DOE mission-critical applications. To this end, XPRESS directly addressed critical challenges in computing of efficiency, scalability, and programmability through introspective methods of dynamic adaptive resource management and task scheduling.

Contact
Brightwell, Ronald B. (Ron), rbbrigh@sandia.gov
Zoltan

The Zoltan project focuses on parallel algorithms for parallel combinatorial scientific computing, including partitioning, load balancing, task placement, graph coloring, matrix ordering, distributed data directories, and unstructured communication plans.

The Zoltan toolkit is an open-source library of MPI-based distributed memory algorithms.  It includes geometric and hypergraph partitioners, global graph coloring, distributed data directories using rendezvous algorithms, primitives to simplify data movement and unstructured communication, and interfaces to the ParMETIS, Scotch and PaToH partitioning libraries.  It is written in C and can be used as a stand-alone library.

The Zoltan2 toolkit is the next-generation toolkit for multicore architectures.  It includes MPI+OpenMP algorithms for geometric partitioning, architecture-aware task placement, and local matrix ordering.  It is written in templated C++ and is tightly integrated with the Trilinos toolkit.
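
A compact sketch of recursive coordinate bisection (RCB), the idea behind the geometric partitioners mentioned above; the real Zoltan and Zoltan2 interfaces operate on distributed data through user callbacks and adapters rather than a simple in-memory array like this.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Sketch of recursive coordinate bisection: split the points in half along
// one coordinate axis, then recurse on each half with the axis alternated,
// producing 2^levels roughly balanced, spatially compact parts.
struct Pt { double x, y; };

void rcb(std::vector<Pt>& pts, std::vector<int>& part, int lo, int hi,
         int level, int id, int axis) {
  if (level == 0 || hi - lo <= 1) {
    for (int i = lo; i < hi; ++i) part[i] = id;
    return;
  }
  const int mid = (lo + hi) / 2;
  std::nth_element(pts.begin() + lo, pts.begin() + mid, pts.begin() + hi,
                   [axis](const Pt& a, const Pt& b) {
                     return axis == 0 ? a.x < b.x : a.y < b.y;
                   });
  rcb(pts, part, lo, mid, level - 1, 2 * id, 1 - axis);
  rcb(pts, part, mid, hi, level - 1, 2 * id + 1, 1 - axis);
}

int main() {
  std::vector<Pt> pts = {{0.1, 0.2}, {0.9, 0.8}, {0.4, 0.7}, {0.8, 0.1},
                         {0.2, 0.9}, {0.6, 0.3}, {0.3, 0.4}, {0.7, 0.6}};
  std::vector<int> part(pts.size(), 0);
  rcb(pts, part, 0, static_cast<int>(pts.size()), 2, 0, 0);  // 4 parts
  for (std::size_t i = 0; i < pts.size(); ++i)
    std::printf("(%.1f, %.1f) -> part %d\n", pts[i].x, pts[i].y, part[i]);
  return 0;
}
```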

Project Website

Software
Zoltan
Departments
Scalable Algorithms
Focus Areas
Scalable Algorithms
Contact
Boman, Erik Gunnar, egboman@sandia.gov