Publications Search

Finite Element Tools for Performance Portability of Implicit and IMEX Simulations on Next Generation Architectures

Pawlowski, Roger P.; Phipps, Eric T.; Trott, Christian R.; Cyr, Eric C.; Shadid, John N.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Adaptive Computational Plasticity with a Composite Tetrahedral Element

Granzow, Brian N.; Foulk, James W.; Ibanez-Granados, Daniel A.; Mota, Alejandro M.; Ostien, Jakob O.; Talamini, Brandon T.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Triethyl Aluminium - A Precursor for Atomically-Precise Acceptor Dopant Placement?

Owen, James H.G.; Campbell, Quinn C.; Santini, Robin S.; Ivie, Jeffrey A.; Baczewski, Andrew D.; Schmucker, Scott W.; Bussmann, Ezra B.; Misra, Shashank M.; Randall, John R.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Distributed Memory Graph Coloring Algorithms for Multiple GPUs

Proceedings of IA3 2020: 10th Workshop on Irregular Applications: Architectures and Algorithms, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Bogle, Ian; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Slota, George M.

Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring on a single GPU or in distributed memory, but hybrid MPI+GPU algorithms have been unexplored until this work, to the best of our knowledge. We present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs. We further extend our approaches to solve for distance-2 coloring, giving the first known distributed and multi-GPU algorithm for this problem. In addition, we propose novel methods to reduce communication in distributed graph coloring. Our experiments show that our approaches operate efficiently on inputs too large to fit on a single GPU and scale up to graphs with 76.7 billion edges running on 128 GPUs.

TYPE Conference Paper YEAR 2020

Scopus OSTI
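
As a rough illustration of the idea in the abstract above (a coloring identifies sets of independent data that can be updated in parallel), here is a minimal serial sketch. It uses a plain greedy heuristic, not the distributed algorithms of Gebremedhin et al. or the KokkosKernels implementations the paper describes. The function names and the example graph are hypothetical.

```python
def greedy_coloring(adjacency):
    """Give each vertex the smallest color not already used by a neighbor."""
    colors = {}
    for vertex in sorted(adjacency):
        taken = {colors[n] for n in adjacency[vertex] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[vertex] = color
    return colors


def independent_sets(colors):
    """Group vertices by color; no two vertices in a group share an edge."""
    groups = {}
    for vertex, color in colors.items():
        groups.setdefault(color, []).append(vertex)
    return groups


# Hypothetical 4-cycle 0-1-2-3-0 (illustrative data, not from the paper).
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
coloring = greedy_coloring(graph)
batches = independent_sets(coloring)
```

Each entry in `batches` is a set of vertices that could be updated concurrently, because no edge joins two vertices of the same color.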

Solving the multiscale modeling problem of plasma physics with heterogeneous methods

Bettencourt, Matthew T.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

CSRI Summer Proceedings 2020

Rushdi, Ahmad R.

The Computer Science Research Institute (CSRI) brings university faculty and students to Sandia for focused collaborative research on Department of Energy (DOE) computer and computational science problems. The institute provides an opportunity for university researchers to learn about problems in computer and computational science at DOE laboratories. Participants conduct leading-edge research, interact with scientists and engineers at the laboratories, and help transfer results of their research to programs at the labs. Some specific CSRI research interest areas are: scalable solvers, optimization, adaptivity and mesh refinement, graph-based, discrete, and combinatorial algorithms, uncertainty estimation, mesh generation, dynamic load-balancing, virus and other malicious-code defense, visualization, scalable cluster computers, data-intensive computing, environments for scalable computing, parallel input/output, advanced architectures, and theoretical computer science. The CSRI Summer Program is organized by CSRI and typically includes the organization of a weekly seminar series and the publication of a summer proceedings. In 2020, the CSRI summer program was executed completely virtually; all student interns worked from home, due to the COVID-19 pandemic.

TYPE Other Report YEAR 2020

OSTI DOI

Mixed-Precision GMRES in Trilinos

Loe, Jennifer A.; Glusa, Christian A.; Yamazaki, Ichitaro Y.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Linear Solvers and Computing: Behind the Scenes

Loe, Jennifer A.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Promotion to the Next Level - Industry and Labs

Phillips, Cynthia A.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Hierarchical Parallelism for Transient Solid Mechanics Simulations

Littlewood, David J.; Plews, Julia A.; Morales, Nicolas M.; Jones, Reese E.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Conservative Remap and Interpolation for the Discrete Element for Sea Ice Model

Peterson, Kara J.; Bolintineanu, Dan S.; Turner, Adrian T.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Towards generic parallel programming in computer science education with Kokkos

Proceedings of EduHPC 2020: Workshop on Education for High Performance Computing, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Ciesko, Jan; Poliakoff, David; Hollman, Daisy S.; Trott, Christian R.; Lebrun-Grandie, Damien

Parallel patterns, views, and spaces are promising abstractions to capture the programmer's intent as well as the contextual information that can be used by an underlying runtime to efficiently map software to parallel hardware. These abstractions can be valuable in cases where an algorithm must accommodate requirements of code and performance portability across hardware architectures and vendor programming models. Kokkos is a parallel programming model for host and accelerator architectures that relies on these abstractions and targets these requirements. It consists of a pure C++ interface, a specification, and a programming library. The programming library exposes patterns and types and maps them to an underlying abstract machine model. The abstract machine model offers a generic view of parallel hardware. While Kokkos is gaining popularity in large-scale HPC applications at some DOE laboratories, we believe that the implemented concepts are of interest to a broader audience including academia, as they may contribute to a generic, vendor- and architecture-independent education in parallel programming. In this work, we give insight into the design considerations of this programming model and list important abstractions. Further, we document best practices obtained from giving virtual classes on Kokkos and give pointers to resources that the reader may consider valuable for a lecture on generic parallel programming for students with preexisting knowledge on this matter.

TYPE Conference Presentation YEAR 2020

Scopus OSTI DOI

CANOPIE-HPC Workshop at SC20

Younge, Andrew J.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Learning continuum-scale models from micro-scale dynamics via Operator Regression

Patel, Ravi G.; Trask, Nathaniel A.; Wood, Mitchell A.; Cyr, Eric C.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Meshless Methods for Manifolds: GMLS Approximations of Hydrodynamic Responses in Curved Fluid Interfaces

Gross, Ben G.; Kuberry, Paul A.; Trask, Nathaniel A.; Atzberger, Paul J.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

RaDD Runtimes: Radical and Different Distributed Runtimes with SmartNICs

Proceedings of IPDRM 2020: 4th Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Grant, Ryan E.; Schonbein, William W.; Levy, Scott

As network speeds increase, the overhead of processing incoming messages is becoming onerous enough that many manufacturers now provide network interface cards (NICs) with offload capabilities to handle these overheads. This increase in NIC capabilities creates an opportunity to enable computation on data in-situ on the NIC. These enhanced NICs can be classified into several different categories of SmartNICs. SmartNICs present an interesting opportunity for future runtime software designs. Designing runtime software to be located in the network as opposed to the host level leads to new radical distributed runtime possibilities that were not practical prior to SmartNICs. In the process of transitioning to a radically different runtime software design for SmartNICs there are intermediary steps of migrating current runtime software to be offloaded onto a SmartNIC that also present interesting possibilities. This paper will describe SmartNIC design and how SmartNICs can be leveraged to offload current generation runtime software and lead to future radically different in-network distributed runtime systems.

TYPE Conference Presentation YEAR 2020

Scopus OSTI DOI

B- and Al-Doped Delta Layers in Si Using Halogen-Based Precursors

Baek, S.B.; Farzani, A.F.; Radue, M.S.R.; Campbell, Quinn C.; Dwyer, K.J.D.; Baczewski, Andrew D.; Mo, Y.M.; Misra, Shashank M.; Butera, R.E.B.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

ML for Trajectories: Bridging the Gap from Computer to Analyst with Tracktable

Rintoul, Mark D.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Models of Models: Recognizing and Managing the Uncertainties of Machine Learning in Engineering Applications

Stracuzzi, David J.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

GALINI: An extensible mixed-integer quadratically-constrained optimization solver

Optimization Online Repository

Ceccon, Francesco C.; Baltean-Lugojan, Radu B.; Bynum, Michael L.; Li, C.K.; Misener, Ruth M.

Abstract not provided.

TYPE Journal Article YEAR 2020

OSTI

Intelligent Networks for High Performance Computing

Schonbein, William W.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Low-cost MPI Multithreaded Message Matching Benchmarking

Schonbein, William W.; Grant, Ryan E.; Levy, Scott L.; Dosanjh, Matthew D.; Marts, William P.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

A Five-Moment Multifluid Model for Partially Ionized Plasmas With Arbitrarily Many Species

Crockatt, Michael M.; Shadid, John N.; Conde, Sidafa C.; Pawlowski, Roger P.; Mabuza, Sibu M.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

Neural Network Approaches for Enabling Automatic Target Recognition

Vineyard, Craig M.; Melzer, Ryan D.; Musuvathy, Srideep M.; Richards, John R.; Severa, William M.; Smith, John D.

Abstract not provided.

TYPE Conference Presentation YEAR 2020

OSTI DOI

A performance-portable nonhydrostatic atmospheric dycore for the energy exascale earth system model running at cloud-resolving resolutions

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Bertagna, Luca B.; Guba, Oksana G.; Taylor, Mark A.; Foucar, James G.; Larkin, Jeff; Bradley, Andrew M.; Rajamanickam, Sivasankaran R.; Salinger, Andrew G.

We present an effort to port the nonhydrostatic atmosphere dynamical core of the Energy Exascale Earth System Model (E3SM) to efficiently run on a variety of architectures, including conventional CPU, many-core CPU, and GPU. We specifically target cloud-resolving resolutions of 3 km and 1 km. To express on-node parallelism we use the C++ library Kokkos, which allows us to achieve a performance portable code in a largely architecture-independent way. Our C++ implementation is at least as fast as the original Fortran implementation on IBM Power9 and Intel Knights Landing processors, proving that the code refactor did not compromise the efficiency on CPU architectures. On the other hand, when using the GPUs, our implementation is able to achieve 0.97 Simulated Years Per Day, running on the full Summit supercomputer. To the best of our knowledge, this is the highest throughput achieved to date by any global atmosphere dynamical core running at such resolutions.

TYPE Conference Presentation YEAR 2020

Scopus OSTI DOI