Publications

5 Results

Advances in Mixed Precision Algorithms: 2021 Edition

Abdelfattah, Ahmad A.; Anzt, Hartwig A.; Ayala, Alan A.; Boman, Erik G.; Carson, Erin C.; Cayrols, Sebastien C.; Cojean, Terry C.; Dongarra, Jack D.; Falgout, Rob F.; Gates, Mark G.; Grützmacher, Thomas G.; Higham, Nicholas J.; Kruger, Scott E.; Li, Sherry L.; Lindquist, Neil L.; Liu, Yang L.; Loe, Jennifer A.; Nayak, Pratik N.; Osei-Kuffuor, Daniel O.; Pranesh, Sri P.; Rajamanickam, Sivasankaran R.; Ribizel, Tobias R.; Smith, Bryce B.; Swirydowicz, Kasia S.; Thomas, Stephen T.; Tomov, Stanimire T.; Tsai, Yaohung M.; Yamazaki, Ichitaro Y.; Yang, Ulrike M.

Over the last year, the ECP xSDK-multiprecision effort has made tremendous progress in developing and deploying new mixed precision technology and in customizing the algorithms for the hardware deployed in the ECP flagship supercomputers. The effort has also succeeded in creating a cross-laboratory community of scientists interested in mixed precision technology who are now working together to deploy this technology for ECP applications. In this report, we highlight some of the most promising and impactful achievements of the last year. Among the highlights we present are: mixed precision iterative refinement (IR) using a dense LU factorization, achieving a 1.8× speedup on Spock; results and strategies for mixed precision IR using a sparse LU factorization; a mixed precision eigenvalue solver; mixed precision GMRES-IR being deployed in Trilinos and achieving a speedup of 1.4× over standard GMRES; Compressed Basis (CB) GMRES being deployed in Ginkgo and achieving an average 1.4× speedup over standard GMRES; preparing hypre for mixed precision execution; mixed precision sparse approximate inverse preconditioners achieving an average speedup of 1.2×; and a detailed description of the memory accessor, which separates the arithmetic precision from the memory precision and enables memory-bound low precision BLAS 1/2 operations to increase their accuracy by using high precision in the computations without degrading performance. We emphasize that many of the highlights presented here have also been submitted to peer-reviewed journals or established conferences, and are under peer review or have already been published.
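
The idea behind the memory accessor (storage precision decoupled from arithmetic precision) can be illustrated with a small, standard-library-only Python sketch. This is not the interface described in the report: the function names (`round_to_f32`, `dot_low_precision`, `dot_high_precision`, `dot_reference`) are hypothetical, and float32 arithmetic is emulated by rounding after every operation. Both dot products read the same float32-stored vectors, so the memory traffic is the same; only the precision of the products and partial sums differs.

```python
import math
import struct
from array import array


def round_to_f32(x: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def dot_low_precision(x: array, y: array) -> float:
    """Dot product with every product and partial sum rounded to float32
    (an emulation of computing entirely in the storage precision)."""
    acc = 0.0
    for a, b in zip(x, y):
        acc = round_to_f32(acc + round_to_f32(a * b))
    return acc


def dot_high_precision(x: array, y: array) -> float:
    """Same float32 storage, but products and partial sums stay in float64."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b  # elements are widened to double when read from the array
    return acc


def dot_reference(x: array, y: array) -> float:
    """Correctly rounded sum of the exact products of the stored values."""
    # The product of two float32 values is exactly representable in float64.
    return math.fsum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    n = 100_000
    x = array("f", [0.1] * n)  # typecode "f": 4 bytes per element in memory
    y = array("f", [0.3] * n)
    ref = dot_reference(x, y)
    print("error, float32 arithmetic:", abs(dot_low_precision(x, y) - ref))
    print("error, float64 arithmetic:", abs(dot_high_precision(x, y) - ref))
```

For a memory-bound kernel such as a dot product, the run time is dominated by reading the vectors, which is why the higher arithmetic precision is expected to come at little or no extra cost on real hardware; the pure-Python loop above only demonstrates the accuracy effect, not the performance claim.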
