Publications Search

Proceedings of ExaMPI 2014: Exascale MPI 2014 - held in conjunction with SC 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis

Stark, Dylan S.; Barrett, Richard F.; Grant, Ryan E.; Olivier, Stephen L.; Pedretti, Kevin P.; Vaughan, Courtenay T.

Advances in node-level architecture and interconnect technology needed to reach extreme scale necessitate a reevaluation of long-standing models of computation, in particular bulk synchronous processing. The end of Dennard scaling and the subsequent increases in CPU core counts with each successive generation of general-purpose processors have made the ability to leverage parallelism for communication increasingly critical to future extreme-scale application performance. But the use of massive multithreading in combination with MPI is an open research area, with many proposed approaches requiring code changes that can be infeasible for important large legacy applications already written in MPI. This paper covers the design and initial evaluation of an extension of a massive multithreading runtime system supporting dynamic parallelism to interface with MPI to handle fine-grain parallel communication and communication-computation overlap. Our initial evaluation of the approach uses the ubiquitous stencil computation, in three dimensions, with the halo exchange as the driving example that has a demonstrated tie to real code bases. The preliminary results suggest that even for a very well-studied and balanced workload and message exchange pattern, co-scheduling work and communication tasks is effective at significant levels of decomposition using up to 131,072 cores. Furthermore, we demonstrate useful communication-computation overlap when handling blocking send and receive calls, and show evidence suggesting that we can decrease the burstiness of network traffic, with a corresponding decrease in the rate of stalls (congestion) seen on the host link and network.

TYPE Conference Poster YEAR 2014

Scopus OSTI DOI

An Evaluation of BitTorrent's Performance In HPC Environments

Dosanjh, Matthew D.; Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2014

OSTI DOI

Reducing the bulk of the bulk synchronous parallel model

Parallel Processing Letters

Barrett, Richard F.; Vaughan, Courtenay T.; Hammond, Simon D.

For over two decades the dominant means for enabling portable performance of computational science and engineering applications on parallel processing architectures has been the bulk-synchronous parallel programming (BSP) model. Code developers, motivated by performance considerations to minimize the number of messages transmitted, have typically pursued a strategy of aggregating message data into fewer, larger messages. Emerging and future high-performance architectures, especially those seen as targeting Exascale capabilities, provide motivation and capabilities for revisiting this approach. In this paper we explore alternative configurations within the context of a large-scale complex multi-physics application and a proxy that represents its behavior, presenting results that demonstrate some important advantages as the number of processors increases in scale.

TYPE Journal Article YEAR 2013

OSTI DOI

An Evaluation of BitTorrent's Performance In HPC Environments

Kelly, Suzanne M.; Laros, James H.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

NNSA/ASC Test Bed Update

Hammond, Simon D.; Barrett, Richard F.; Vaughan, Courtenay T.; Trott, Christian R.; Laros, James H.; Kelly, Suzanne M.; Ang, James A.

Abstract not provided.

TYPE Presentation YEAR 2013

OSTI

A first look at miniAMR

Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Application Explorations for Future Interconnects

Barrett, Richard F.; Vaughan, Courtenay T.; Hammond, Simon D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Using the Cray Gemini Performance Counters

Pedretti, Kevin P.; Vaughan, Courtenay T.; Barrett, Richard F.; Devine, Karen D.; Hemmert, Karl S.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Ensuring Continued Scalability of Mesh Based Hydrocodes

Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Application Explorations for Future Interconnects

Barrett, Richard F.; Vaughan, Courtenay T.; Hammond, Simon D.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Using the Cray Gemini Performance Counters

Pedretti, Kevin P.; Vaughan, Courtenay T.; Hemmert, Karl S.; Barrett, Richard F.

Abstract not provided.

TYPE Conference YEAR 2013

OSTI

Navigating an Evolutionary Fast Path to Exascale

Barrett, Richard F.; Hammond, Simon D.; Vaughan, Courtenay T.; Doerfler, Douglas W.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Assessing the predictive capabilities of mini-applications

Barrett, Richard F.; Crozier, Paul C.; Doerfler, Douglas W.; Hammond, Simon D.; Heroux, Michael A.; Lin, Paul L.; Trucano, Timothy G.; Vaughan, Courtenay T.; Williams, Alan B.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Unprecedented Scalability and Performance of the new NNSA Tri-Lab Capacity Cluster 2 (TLCC2)

Rajan, Mahesh R.; Doerfler, Douglas W.; Lin, Paul L.; Hammond, Simon D.; Barrett, Richard F.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Unprecedented Scalability and Performance of the new NNSA Tri-Lab Capacity Cluster 2 (TLCC2)

Doerfler, Douglas W.; Lin, Paul L.; Hammond, Simon D.; Barrett, Richard F.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Navigating An Evolutionary Fast Path to Exascale

Barrett, Richard F.; Hammond, Simon D.; Vaughan, Courtenay T.; Doerfler, Douglas W.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Characterize the Role of the Mini-Applications in Predicting Key Performance Characteristics of Real Applications

Barrett, Richard F.; Doerfler, Douglas W.; Crozier, Paul C.; Heroux, Michael A.; Lin, Paul L.; Thornquist, Heidi K.; Trucano, Timothy G.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Energy Based Performance Tuning for Large Scale High Performance Computing Systems

Laros, James H.; Pedretti, Kevin P.; Kelly, Suzanne M.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

MiniGhost: a miniapp for exploring boundary exchange strategies using stencil computations in scientific parallel computing

Barrett, Richard F.; Vaughan, Courtenay T.; Heroux, Michael A.

A broad range of scientific computation involves the use of difference stencils. In a parallel computing environment, this computation is typically implemented by decomposing the spatial domain, inducing a 'halo exchange' of process-owned boundary data. This approach adheres to the Bulk Synchronous Parallel (BSP) model. Because commonly available architectures provide strong inter-node bandwidth relative to latency costs, many codes 'bulk up' these messages by aggregating data into a message as a means of reducing the number of messages. A renewed focus on non-traditional architectures and architecture features provides new opportunities for exploring alternatives to this programming approach. In this report we describe miniGhost, a 'miniapp' designed for exploration of the capabilities of current as well as emerging and future architectures within the context of these sorts of applications. MiniGhost joins the suite of miniapps developed as part of the Mantevo project.

TYPE SAND Report YEAR 2012

OSTI DOI