Publications Search

Distributed Memory Graph Coloring Algorithms for Multiple GPUs

Bogle, Ian A.; Boman, Erik G.; Devine, Karen D.; Rajamanickam, Sivasankaran R.; Slota, George M.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures

ACM International Conference Proceeding Series

Yamazaki, Ichitaro Y.; Rajamanickam, Sivasankaran R.; Ellingwood, Nathan D.

The sparse triangular solver is an important kernel in many computational applications. However, a fast, parallel sparse triangular solver on a manycore architecture such as a GPU has been an open issue in the field for several years. In this paper, we develop a sparse triangular solver that takes advantage of the supernodal structures of the triangular matrices that come from the direct factorization of a sparse matrix. We implemented our solver using Kokkos and Kokkos Kernels so that it is portable to different manycore architectures. This has the additional benefit of allowing our triangular solver to use the team-level kernels and take advantage of the hierarchical parallelism available on the GPU. We compare the effects of different scheduling schemes on performance and also investigate an algorithmic variant called the partitioned inverse. Our performance results on an NVIDIA V100 or P100 GPU demonstrate that our implementation can be 12.4× or 19.5× faster than the vendor-optimized implementation in NVIDIA's cuSPARSE library.

TYPE Conference Poster YEAR 2020

Scopus OSTI

Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architecture

Yamazaki, Ichitaro Y.; Rajamanickam, Sivasankaran R.; Ellingwood, Nathan D.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

A performance-portable nonhydrostatic atmospheric dycore for the Energy Exascale Earth System Model running at cloud-resolving resolutions

Bertagna, Luca B.; Guba, Oksana G.; Taylor, Mark A.; Foucar, James G.; Larkin, Jeff L.; Bradley, Andrew M.; Rajamanickam, Sivasankaran R.; Salinger, Andrew G.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Recent experiences with Machine Learning Perspectives from Algorithms Architectures and Applications

Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Supernode-based Sparse Triangular Solver using Kokkos

Yamazaki, Ichitaro Y.; Rajamanickam, Sivasankaran R.; Ellingwood, Nathan D.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

Accelerating Multiscale Materials Modeling with Machine Learning

Ellis, John E.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Presentation YEAR 2020

OSTI

Scalable Inference for Sparse Deep Neural Networks using Kokkos Kernels

Ellis, John E.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI DOI

Practices and Challenges of Software Development for a Performance Portable Ecosystem

Ellingwood, Nathan D.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

ADELUS: A Performance-Portable Dense LU Solver for Distributed-Memory Hardware-Accelerated Systems

Kotulski, J.D.; Dang, Vinh Q.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

SPHYNX: Spectral partitioning for HYbrid and aXelerator-enabled systems

Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Acer, Seher A.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Graph partitioning has been an important tool for dividing work among several processors so as to minimize communication cost and balance the workload. As accelerator-based supercomputers emerge as the standard, graph partitioning becomes even more important because applications are rapidly moving to these architectures. However, there is no scalable, distributed-memory, multi-GPU graph partitioner available for applications. We developed a spectral graph partitioner, Sphynx, using the portable, accelerator-friendly stack of the Trilinos framework. We use Sphynx to systematically evaluate the various algorithmic choices in spectral partitioning with a focus on GPU performance. We perform those evaluations on irregular graphs, because state-of-the-art partitioners have the most difficulty on them. We demonstrate that Sphynx is up to 17x faster on GPUs than on CPUs, and up to 580x faster than a state-of-the-art multilevel partitioner. Sphynx provides a robust alternative for applications looking for a GPU-based partitioner.

TYPE Conference Poster YEAR 2020

Scopus OSTI

Utilizing Spatial Accelerators for Machine Learning and Linear Algebra Kernels

Moon, Gordon E.; Rajamanickam, Sivasankaran R.; Krishna, Tushar K.; Kwon, Hyoukjun K.; Chatarasi, Prasanth C.; Qin, Eric Q.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

MINT: Microarchitecture for Efficient and Interchangeable CompressioN Formats on Tensor Algebra

Qin, Eric Q.; Jeong, Geonhwa J.; Won, William W.; Kao, Sheng-Chun K.; Kwon, Hyoukjun K.; Srinivasan, Sudarshan S.; Das, Dipankar D.; Moon, Gordon E.; Rajamanickam, Sivasankaran R.; Krishna, Tushar K.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

SPHYNX: Spectral Partitioning for HYbrid aNd aXelerator-based systems

Acer, Seher A.; Boman, Erik G.; Rajamanickam, Sivasankaran R.

Abstract not provided.

TYPE Conference Poster YEAR 2020

OSTI

ECP Report: Update on Proxy Applications and Vendor Interactions

Ang, Jim A.; Sweeney, Christine S.; Wolf, Michael W.; Ellis, John E.; Ghosh, Sayan G.; Kagawa, Ai K.; Huang, Yunzhi H.; Rajamanickam, Sivasankaran R.; Ramakrishnaiah, Vinay R.; Schram, Malachi S.; Yoo, Shinjae Y.

The ExaLearn miniGAN team (Ellis and Rajamanickam) has released miniGAN, a generative adversarial network (GAN) proxy application, through the ECP proxy application suite. miniGAN is the first machine learning proxy application in the suite (note: the ECP CANDLE project did previously release some benchmarks) and models the performance of training generator and discriminator networks. The GAN's generator and discriminator generate plausible 2D/3D maps and identify fake maps, respectively. miniGAN aims to be a proxy application for related applications in cosmology (CosmoFlow, ExaGAN) and wind energy (ExaWind). miniGAN has been developed so that optimized mathematical kernels (e.g., kernels provided by Kokkos Kernels) can be plugged into the proxy application to explore potential performance improvements. miniGAN has been released as open source software and is available through the ECP proxy application website (https://proxyapps.exascaleproject.org/ecp-proxy-apps-suite/) and on GitHub (https://github.com/SandiaMLMiniApps/miniGAN). As part of this release, a generator is provided to produce a data set (a series of images) that serves as input to the proxy application.

TYPE Other Report YEAR 2020

OSTI DOI