Publications
Scalable hashing for shared memory supercomputers
Goodman, Eric G.; Lemaster, M.N.; Jimenez, Edward S.
Hashing is a fundamental technique in computer science that allows O(1) insertion and lookup of items in an associative array. Here we present several thread coordination and hashing strategies and compare their performance on large shared-memory symmetric multiprocessor machines, each with between half a terabyte and a full terabyte of memory. We show how our approach can serve as a key kernel for fundamental paradigms such as dynamic programming and MapReduce. We further show that a set of approaches yields close to linear speedup for both uniform random distributions and the more difficult power-law distributions. This scalable performance is achieved even though our set of approaches is not completely lock-free. Our experiments compare an SGI Altix UV with 4 Xeon processors (32 cores) and a Cray XMT with 128 processors. At the scale of data we addressed, on the order of 5 billion integers, the Altix UV far exceeds the performance of the Cray XMT for power-law distributions. However, the Cray XMT exhibits greater scalability. Copyright 2011 ACM.
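As a rough illustration of a shared hash table that coordinates threads without being lock-free, the sketch below partitions keys across stripes and guards each stripe with its own lock. It is not the paper's implementation: the names `StripedCounter` and `count_keys`, the stripe count, and the choice of lock striping are assumptions made for this example only. It also says nothing about performance, since standard CPython threads do not run bytecode in parallel; it shows only the coordination pattern.

```python
import threading


class StripedCounter:
    """Hypothetical hash-partitioned counter: one lock per stripe (not lock-free)."""

    def __init__(self, num_stripes=64):
        # Each stripe is an independent dict protected by its own lock.
        self._stripes = [dict() for _ in range(num_stripes)]
        self._locks = [threading.Lock() for _ in range(num_stripes)]

    def _index(self, key):
        # Map the key's hash to a stripe; Python's % is non-negative here.
        return hash(key) % len(self._stripes)

    def add(self, key, amount=1):
        i = self._index(key)
        with self._locks[i]:  # only threads hitting the same stripe contend
            stripe = self._stripes[i]
            stripe[key] = stripe.get(key, 0) + amount

    def get(self, key):
        i = self._index(key)
        with self._locks[i]:
            return self._stripes[i].get(key, 0)


def count_keys(keys, num_threads=4, num_stripes=64):
    """Count key occurrences from several threads, MapReduce word-count style."""
    table = StripedCounter(num_stripes)

    def worker(chunk):
        for k in chunk:
            table.add(k)

    # Give each thread an interleaved slice of the input.
    chunks = [keys[t::num_threads] for t in range(num_threads)]
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return table
```

With a skewed input, in which one key is far more frequent than the rest, most updates land on a single stripe and serialize on its lock. That is one intuition for why power-law distributions are the harder case.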