Publications Search

Characterize the Role of the Mini-Applications in Predicting Key Performance Characteristics of Real Applications

Barrett, Richard F.; Doerfler, Douglas W.; Crozier, Paul C.; Heroux, Michael A.; Lin, Paul L.; Thornquist, Heidi K.; Trucano, Timothy G.; Vaughan, Courtenay T.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Exascale Design Space Exploration and Co-design

Proposed for publication in Future Generation Computer Systems.

Barrett, Richard F.; Trucano, Timothy G.; Doerfler, Douglas W.; Dosanjh, Sudip S.; Hammond, Simon D.; Hemmert, Karl S.; Heroux, Michael A.; Lin, Paul L.; Pedretti, Kevin P.; Rodrigues, Arun

Abstract not provided.

TYPE Journal Article YEAR 2012

OSTI

MiniGhost : a miniapp for exploring boundary exchange strategies using stencil computations in scientific parallel computing

Barrett, Richard F.; Vaughan, Courtenay T.; Heroux, Michael A.

A broad range of scientific computation involves the use of difference stencils. In a parallel computing environment, this computation is typically implemented by decomposing the spatial domain, inducing a 'halo exchange' of process-owned boundary data. This approach adheres to the Bulk Synchronous Parallel (BSP) model. Because commonly available architectures provide strong inter-node bandwidth relative to latency costs, many codes 'bulk up' these messages by aggregating data into a message as a means of reducing the number of messages. A renewed focus on non-traditional architectures and architecture features provides new opportunities for exploring alternatives to this programming approach. In this report we describe miniGhost, a 'miniapp' designed for exploration of the capabilities of current as well as emerging and future architectures within the context of these sorts of applications. MiniGhost joins the suite of miniapps developed as part of the Mantevo project.
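The halo-exchange pattern named in this abstract can be illustrated with a minimal, single-process sketch. It is not miniGhost code and uses no MPI: the "processes" are simply list chunks, and the function names (`split`, `stencil_global`, `stencil_decomposed`), the 3-point averaging stencil, and the one-cell ghost width are all assumptions chosen for illustration. The sketch also does not model the message aggregation the abstract mentions, since each neighbor here receives a single value.

```python
def stencil_global(u):
    """Apply a 3-point average to interior points of the whole 1-D array."""
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return out


def split(u, nparts):
    """Split u into nparts contiguous chunks of near-equal size."""
    base, extra = divmod(len(u), nparts)
    chunks, start = [], 0
    for p in range(nparts):
        size = base + (1 if p < extra else 0)
        chunks.append(u[start:start + size])
        start += size
    return chunks


def stencil_decomposed(u, nparts):
    """Same stencil, computed chunk by chunk after a halo exchange."""
    assert 1 <= nparts <= len(u), "every chunk must own at least one point"
    chunks = split(u, nparts)

    # Each simulated process stores its owned points plus one ghost cell
    # on each side; None marks a physical (non-neighbor) boundary.
    local = [[None] + list(c) + [None] for c in chunks]

    # Halo exchange: copy each neighbor's boundary value into the ghost cell.
    # In a real code this step would be a pair of messages per neighbor.
    for p in range(nparts):
        if p > 0:
            local[p][0] = chunks[p - 1][-1]
        if p < nparts - 1:
            local[p][-1] = chunks[p + 1][0]

    # Local computation: each process updates only the points it owns.
    result = []
    for loc in local:
        for i in range(1, len(loc) - 1):
            if loc[i - 1] is None or loc[i + 1] is None:
                result.append(loc[i])  # physical boundary point, left unchanged
            else:
                result.append((loc[i - 1] + loc[i] + loc[i + 1]) / 3.0)
    return result
```

Under these assumptions, the decomposed result matches the undecomposed one for any number of chunks up to the array length, which is the property a halo exchange is meant to preserve.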

TYPE SAND Report YEAR 2012

OSTI DOI

Enabling Extreme-Scale Computation for Emerging Discretizations

Parks, Michael L.; Heroux, Michael A.; Day, David M.; Littlewood, David J.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Evaluation Optimization and Application of Execution Models for Exascale Computing

Hendry, Gilbert H.; Heroux, Michael A.; Clay, Robert L.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

The Trilinos Project - Enabling predictive science and engineering through software libraries for scalable computing

Willenbring, James M.; Heroux, Michael A.; Devine, Karen D.; Boman, Erik G.

Abstract not provided.

TYPE Presentation YEAR 2012

OSTI

Fault-tolerant iterative methods via selective reliability

Ferreira, Kurt; Heroux, Michael A.; Hoemmen, Mark F.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

Towards Efficient Preconditioning in Manycore Architectures

Rajamanickam, Sivasankaran R.; Heroux, Michael A.; Boman, Erik G.

Abstract not provided.

TYPE Conference YEAR 2012

OSTI

TriBITS lifecycle model. Version 1.0, a lean/agile software lifecycle model for research-based computational science and engineering and applied mathematical software

Willenbring, James M.; Heroux, Michael A.

Software lifecycles are becoming an increasingly important issue for computational science and engineering (CSE) software. The process by which a piece of CSE software begins life as a set of research requirements and then matures into a trusted high-quality capability is both commonplace and extremely challenging. Although an implicit lifecycle is obviously being used in any effort, the challenges of this process - respecting the competing needs of research vs. production - cannot be overstated. Here we describe a proposal for a well-defined software lifecycle process based on modern Lean/Agile software engineering principles. What we propose is appropriate for many CSE software projects that are initially heavily focused on research but also are expected to eventually produce usable high-quality capabilities. The model is related to TriBITS, a build, integration and testing system, which serves as a strong foundation for this lifecycle model, and aspects of this lifecycle model are ingrained in the TriBITS system. Here, we advocate three to four phases or maturity levels that address the appropriate handling of many issues associated with the transition from research to production software. The goals of this lifecycle model are to better communicate maturity levels with customers and to help to identify and promote Software Engineering (SE) practices that will help to improve productivity and produce better software. An important collection of software in this domain is Trilinos, which is used as the motivation and the initial target for this lifecycle model. However, many other related and similar CSE (and non-CSE) software projects can also make good use of this lifecycle model, especially those that use the TriBITS system. Indeed this lifecycle process, if followed, will enable large-scale sustainable integration of many complex CSE software efforts across several institutions.

TYPE SAND Report YEAR 2012

OSTI DOI

Precision Neutral Computation Enables Efficient Robust Algorithms

Parks, Michael L.; Heroux, Michael A.; Day, David M.; Frischknecht, Amalie F.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Fault-tolerant iterative methods via selective reliability

Hoemmen, Mark F.; Heroux, Michael A.; Ferreira, Kurt

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

A High-Level View of the Trilinos Project

Scientific Programming

Willenbring, James M.; Heroux, Michael A.

Abstract not provided.

TYPE Journal Article YEAR 2011

OSTI

Copy of Mini-applications: Vehicles for Co-Design

Barrett, Richard F.; Heroux, Michael A.; Lin, Paul L.; Vaughan, Courtenay T.; Williams, Alan B.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Enabling Tools for Extreme Scale Computation of Nanoscale Fluids

Parks, Michael L.; Heroux, Michael A.; Frischknecht, Amalie F.; Day, David M.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

A Tutorial on Anasazi and Belos

Thornquist, Heidi K.; Hoemmen, Mark F.; Heroux, Michael A.; Lehoucq, Richard B.; Parks, Michael L.; Day, David M.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Enabling Next-Generation Parallel Circuit Simulation with Trilinos

Boman, Erik G.; Heroux, Michael A.; Keiter, Eric R.; Rajamanickam, Sivasankaran R.; Schiek, Richard S.; Thornquist, Heidi K.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

LDRD final report : autotuning for scalable linear algebra

Heroux, Michael A.

This report summarizes the progress made as part of a one year lab-directed research and development (LDRD) project to fund the research efforts of Bryan Marker at the University of Texas at Austin. The goal of the project was to develop new techniques for automatically tuning the performance of dense linear algebra kernels. These kernels often represent the majority of computational time in an application. The primary outcome from this work is a demonstration of the value of model driven engineering as an approach to accurately predict and study performance trade-offs for dense linear algebra computations.

TYPE SAND Report YEAR 2011

OSTI DOI

Multicore/GPGPU Portable Computational Kernels via Multidimensional Arrays

Sunderland, Daniel S.; Porter, V.L.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Preparing for Tomorrow's Systems: Manycore Resilience Patterns and Transition

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Mini-applications: Vehicles for Co-Design

Barrett, Richard F.; Heroux, Michael A.; Lin, Paul L.; Vaughan, Courtenay T.; Williams, Alan B.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Fault-tolerant iterative methods via selective reliability

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Cooperative Application/OS DRAM Fault Recovery

Hoemmen, Mark F.; Ferreira, Kurt; Heroux, Michael A.; Brightwell, Ronald B.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

A Hybrid solver for general sparse linear systems

Rajamanickam, Sivasankaran R.; Boman, Erik G.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Toward portable programming of numerical linear algebra on manycore nodes

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

A Hybrid-Hybrid Solver for Manycore Platforms

Rajamanickam, Sivasankaran R.; Boman, Erik G.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Fault-tolerant iterative methods

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Building the Next Generation of Parallel Applications: Co-Design Opportunities and Challenges

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Factors impacting performance of multithreaded sparse triangular solve

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Wolf, Michael M.; Heroux, Michael A.; Boman, Erik G.

As computational science applications grow more parallel with multi-core supercomputers having hundreds of thousands of computational cores, it will become increasingly difficult for solvers to scale. Our approach is to use hybrid MPI/threaded numerical algorithms to solve these systems in order to reduce the number of MPI tasks and increase the parallel efficiency of the algorithm. However, we need efficient threaded numerical kernels to run on the multi-core nodes in order to achieve good parallel efficiency. In this paper, we focus on improving the performance of a multithreaded triangular solver, an important kernel for preconditioning. We analyze three factors that affect the parallel performance of this threaded kernel and obtain good scalability on the multi-core nodes for a range of matrix sizes. © 2011 Springer-Verlag Berlin Heidelberg.

TYPE Conference YEAR 2011

Scopus OSTI

Self-similarity of parallel machines

Parallel Computing

Heroux, Michael A.

Abstract not provided.

TYPE Journal Article YEAR 2011

OSTI

Supercomputer and Cluster Application Performance Analysis using Python and MySQL

Barnette, Daniel W.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Building the next generation of scalable manycore applications and libraries

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Improving CSE Software through Reproducibility Requirements

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Miniapplications: Vehicles for Co-design

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

A Hybrid Parallel Sparse Solver

Boman, Erik G.; Rajamanickam, Sivasankaran R.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2011

OSTI

Building the Next Generation of Parallel Applications

Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2011

OSTI

Recent developments in sparse direct methods in trilinos

Rajamanickam, Sivasankaran R.; Boman, Erik G.; Heroux, Michael A.; Day, David M.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Expanding the Trilinos developer community

Heroux, Michael A.

The Trilinos Project started approximately nine years ago as a small effort to enable research, development and ongoing support of small, related solver software efforts. The 'Tri' in Trilinos was intended to indicate the eventual three packages we planned to develop. In 2007 the project expanded its scope to include any package that was an enabling technology for technical computing. Presently the Trilinos repository contains over 55 packages covering a broad spectrum of reusable tools for constructing full-featured scalable scientific and engineering applications. Trilinos usage is now worldwide, and many applications have an explicit dependence on Trilinos for essential capabilities. Users come from other US laboratories, universities, industry and international research groups. Awareness and use of Trilinos is growing rapidly outside of Sandia. Members of the external research community are becoming more familiar with Trilinos, its design and collaborative nature. As a result, the Trilinos project is receiving an increasing number of requests from external community members who want to contribute to Trilinos as developers. To date we have worked with external developers in an ad hoc fashion. Going forward, we want to develop a set of policies, procedures, tools and infrastructure to simplify interactions with external developers. As we go forward with multi-laboratory efforts such as CASL and X-Stack, and international projects such as IESP, we will need a more streamlined and explicit process for making external developers 'first-class citizens' in the Trilinos development community. This document is intended to frame the discussion for expanding the Trilinos community to all strategically important external members, while at the same time preserving Sandia's primary leadership role in the project.

TYPE SAND Report YEAR 2010

OSTI DOI

Obtaining Parallelism on Multicore and GPU Architectures in a Painless Manner

Wolf, Michael W.; Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Trilinos for emerging parallel computing systems

Heroux, Michael A.

Trilinos is an object-oriented software framework that enables the solution of large-scale, complex multiphysics engineering and scientific problems. Different Trilinos packages build on each other to create a stack providing the necessary capability: (1) Non-linear solver; (2) Linear solver/preconditioner; (3) Distributed linear algebra; and (4) Local linear algebra.

TYPE Conference YEAR 2010

OSTI

Factors impacting performance of multithreaded triangular solve

Wolf, Michael W.; Heroux, Michael A.; Boman, Erik G.

As computational science applications grow more parallel with multi-core supercomputers having hundreds of thousands of computational cores, it will become increasingly difficult for solvers to scale. Our approach is to use hybrid MPI/threaded numerical algorithms to solve these systems in order to reduce the number of MPI tasks and increase the parallel efficiency of the algorithm. However, we need efficient threaded numerical kernels to run on the multi-core nodes in order to achieve good parallel efficiency. In this paper, we focus on improving the performance of a multithreaded triangular solver, an important kernel for preconditioning. We analyze three factors that affect the parallel performance of this threaded kernel and obtain good scalability on the multi-core nodes for a range of matrix sizes.

TYPE Conference YEAR 2010

OSTI

Building the next generation of parallel applications

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Extreme Algorithms and Software Co-Design: It's EASI!

Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

Inexact Krylov Subspace Methods for Fluid Density Functional Theories

Parks, Michael L.; Frischknecht, Amalie F.; Day, David M.; Heroux, Michael A.

Abstract not provided.

TYPE Presentation YEAR 2010

OSTI

Enabling Architectures for Large-Scale Applications (Presentation)

Heroux, Michael A.

Abstract not provided.

TYPE Conference YEAR 2010

OSTI

Parallel phase model: A programming model for high-end parallel machines with manycores

Proceedings of the International Conference on Parallel Processing

Brightwell, Ronald B.; Heroux, Michael A.; Wen, Zhaofang W.; Wu, Junfeng

This paper presents a parallel programming model, Parallel Phase Model (PPM), for next-generation high-end parallel machines based on a distributed memory architecture consisting of a networked cluster of nodes with a large number of cores on each node. PPM has a unified high-level programming abstraction that facilitates the design and implementation of parallel algorithms to exploit both the parallelism of the many cores and the parallelism at the cluster level. The programming abstraction will be suitable for expressing both fine-grained and coarse-grained parallelism. It includes a few high-level parallel programming language constructs that can be added as an extension to an existing (sequential or parallel) programming language such as C; and the implementation of PPM also includes a light-weight runtime library that runs on top of an existing network communication software layer (e.g. MPI). Design philosophy of PPM and details of the programming abstraction are also presented. Several unstructured applications that inherently require high-volume random fine-grained data accesses have been implemented in PPM with very promising results. © 2009 IEEE.

TYPE SAND Report YEAR 2009

Scopus OSTI DOI