Publications / SAND Report

Large-Scale Data Analytics and Its Relationship to Simulation

Leland, Robert

Large-Scale Data Analytics (LSDA) problems require finding meaningful patterns in data sets so large that they demand leading-edge processing and storage capability. LSDA problems are increasingly important for government mission work, industrial application, and scientific discovery. Effective solution of some important LSDA problems requires a computational workload that is substantially different from that of traditional High Performance Computing (HPC) simulations, which are intended to help understand physical phenomena or to conduct engineering. While traditional HPC application codes exploit structural regularity and data locality to improve performance, many analytics problems lead more naturally to very fine-grained communication between unpredictable sets of processors, resulting in less regular communication patterns that do not map efficiently onto typical HPC systems. In both the simulation and analytics domains, however, data movement increasingly dominates the performance, energy usage, and price of computing systems. It is therefore plausible that we could find a more synergistic technology path forward. Even though future machines may continue to be configured differently for the two domains, a more common technological roadmap, with a degree of convergence in the underlying componentry and design principles that address these shared technical challenges, could have substantial technical and economic benefits.

1 Senior Advisor, High Performance Computing, National Security and International Affairs Division, Office of Science and Technology Policy Institute
2 Senior Advanced Memory Systems Architect, DRAM Solutions Group, Micron Technologies, Inc.
3 Director, Computing Research, Sandia National Laboratories
4 Associate Laboratory Director for Computing Sciences, Lawrence Berkeley National Laboratory
5 Computational Sciences and Mathematics Division Manager, Pacific Northwest National Laboratory
6 Principal Member of Technical Staff, Sandia National Laboratories