From 2009 to 2011, I was a postdoctoral researcher at Sandia National Laboratories (NM), working primarily on the following three projects:
Extreme-scale Algorithms and Software Institute (EASI)
As a postdoc, I worked to develop architecture-aware algorithms for scalable performance as part of the Extreme-scale Algorithms and Software Institute (EASI). The project is motivated by the significant gap between the theoretical peak performance of a supercomputer and the performance actually realized by important large-scale computational science applications. Its thrust is to improve the parallel performance of these applications so that they realize a higher percentage of peak performance. My early work focused on developing parallel triangular solvers for multi-core and many-core architectures. I also researched and developed techniques for interfacing traditional MPI applications with hybrid MPI/multithreaded solvers.
CSCAPES
I continued some of my thesis work as part of the Institute for Combinatorial Scientific Computing and Petascale Simulations (CSCAPES), a SciDAC institute. In particular, I researched and developed new two-dimensional sparse matrix partitioning algorithms and worked on implementing 2D partitioning in Isorropia, so that these methods will be readily available to Trilinos users. I also worked on the early development of Zoltan2, a package for partitioning, load balancing, and other important combinatorial problems (and the successor to the Zoltan package). As part of my CSCAPES work, I also led Sandia's effort in sponsoring a Harvey Mudd College clinic project on sparse matrix partitioning.
Climate Modeling
I also worked on the NNSA BER Climate Research Project to migrate FV-MAS (finite volumes for modeling across scales) to high-performance computing environments such as leadership-class supercomputers. My efforts focused mostly on improving the parallel performance of the software and on interfacing the climate modeling code with the load-balancing and ordering tools in Zoltan. I was also involved in developing a communication abstraction that allows different parallel programming models to be used within the climate application.
Papers
Erik G. Boman and Michael M. Wolf, “A Nested Dissection Partitioning Method for Parallel Sparse Matrix-Vector Multiplication,” IEEE HPEC 2013, Waltham, MA, September 2013.
Michael M. Wolf, Michael A. Heroux, and Erik G. Boman, “Factors Impacting Performance of Multithreaded Sparse Triangular Solve,” High Performance Computing for Computational Science: VECPAR 2010, Berkeley, CA, June 22-25, 2010.
E.G. Boman, U.V. Catalyurek, C. Chevalier, K.D. Devine, I. Safro, and M.M. Wolf. “Advances in Parallel Partitioning, Load Balancing, and Matrix Ordering,” J. of Physics: Conference Series, vol. 180, 012008. (SciDAC09 Conference, San Diego, June 2009.)
Presentations
“Hybrid MPI/Multithreaded PCG: A Use Case for MPI Shared Memory Allocation,” Supercomputing 2010, New Orleans, November 13-19, 2010. (Poster presentation)
“Obtaining Parallelism on Multicore and GPU Architectures in a Painless Manner,” 2010 SEG Post-Convention Workshop on High Performance Implementation of Geophysical Applications, Denver, October 21, 2010. (Invited Talk)
“Recent Advances in Two-dimensional Sparse Matrix Partitioning,” SIAM Conference on Parallel Processing for Scientific Computing (PP10), Seattle, WA, February 24-26, 2010. (Minisymposium Presentation)
“Improved Data Partitioning by Nested Dissection with Applications to Information Retrieval,” SIAM Workshop on Combinatorial Scientific Computing (CSC09), Seaside, CA, October 29-31, 2009. (Refereed Presentation)