Publications

Results 76–100 of 210

Designing an analog crossbar based neuromorphic accelerator

2017 5th Berkeley Symposium on Energy Efficient Electronic Systems, E3S 2017 - Proceedings

Agarwal, Sapan A.; Hsia, Alexander W.; Jacobs-Gedrim, Robin B.; Hughart, David R.; Plimpton, Steven J.; James, Conrad D.; Marinella, Matthew J.

Resistive memory crossbars can dramatically reduce the energy required to perform computations in neural algorithms by three orders of magnitude when compared to an optimized digital ASIC [1]. For data intensive applications, the computational energy is dominated by moving data between the processor, SRAM, and DRAM. Analog crossbars overcome this by allowing data to be processed directly at each memory element. Analog crossbars accelerate three key operations that are the bulk of the computation in a neural network as illustrated in Fig. 1: vector matrix multiplies (VMM), matrix vector multiplies (MVM), and outer product rank 1 updates (OPU) [2]. For an N×N crossbar the energy for each operation scales as the number of memory elements, O(N²) [2]. This is because the crossbar performs its entire computation in one step, charging all the capacitances only once. Thus the CV² energy of the array scales as array size. This is fundamentally better than trying to read or write a digital memory. Each row of any N×N digital memory must be accessed one at a time, resulting in N columns of length O(N) being charged N times, requiring O(N³) energy to read a digital memory. Thus an analog crossbar has a fundamental O(N) energy scaling advantage over a digital system. Furthermore, if the read operation is done at low voltage and is therefore noise limited, the read energy can even be independent of the crossbar size, O(1) [2].
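The scaling argument in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the learning-rate parameter, and the unit-cost energy model (one unit per memory element charged, all constants ignored) are assumptions made for illustration only. The NumPy expressions are digital stand-ins for the three operations the crossbar performs in analog.

```python
import numpy as np

def crossbar_ops(W, x, d, lr=0.1):
    """Digital reference for the three operations named in the abstract.

    W is an N x N weight matrix; x and d are length-N vectors.
    The name and the lr parameter are hypothetical.
    """
    vmm = x @ W                         # vector matrix multiply (VMM)
    mvm = W @ d                         # matrix vector multiply (MVM)
    W_new = W + lr * np.outer(x, d)     # outer product rank 1 update (OPU)
    return vmm, mvm, W_new

def analog_energy_units(n):
    """Assumed unit-cost model: every element is charged once, O(N^2)."""
    return n * n

def digital_energy_units(n):
    """Assumed unit-cost model: N row accesses, each charging
    N columns of length O(N), giving O(N^3)."""
    return n * (n * n)

if __name__ == "__main__":
    for n in (64, 256, 1024):
        ratio = digital_energy_units(n) / analog_energy_units(n)
        print(f"N={n}: digital/analog energy ratio = {ratio:.0f}")
```

Under these assumptions the ratio of the two counts is N, which is the O(N) advantage the abstract states. Real devices add constant factors and peripheral-circuit costs that this count ignores.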

Molecular-Level Simulations of Turbulence and Its Decay

Physical Review Letters

Gallis, Michail A.; Bitter, Neal B.; Koehler, Timothy P.; Torczynski, J.R.; Plimpton, Steven J.; Papadakis, G.

We provide the first demonstration that molecular-level methods based on gas kinetic theory and molecular chaos can simulate turbulence and its decay. The direct simulation Monte Carlo (DSMC) method, a molecular-level technique for simulating gas flows that resolves phenomena from molecular to hydrodynamic (continuum) length scales, is applied to simulate the Taylor-Green vortex flow. The DSMC simulations reproduce the Kolmogorov -5/3 law and agree well with the turbulent kinetic energy and energy dissipation rate obtained from direct numerical simulation of the Navier-Stokes equations using a spectral method. This agreement provides strong evidence that molecular-level methods for gases can be used to investigate turbulent flows quantitatively.

A historical survey of algorithms and hardware architectures for neural-inspired and neuromorphic computing applications

Biologically Inspired Cognitive Architectures

James, Conrad D.; Aimone, James B.; Miner, Nadine E.; Vineyard, Craig M.; Rothganger, Fredrick R.; Carlson, Kristofor D.; Mulder, Samuel A.; Draelos, Timothy J.; Faust, Aleksandra; Marinella, Matthew J.; Naegle, John H.; Plimpton, Steven J.

Biological neural networks continue to inspire new developments in algorithms and microelectronic hardware to solve challenging data processing and classification problems. Here, we survey the history of neural-inspired and neuromorphic computing in order to examine the complex and intertwined trajectories of the mathematical theory and hardware developed in this field. Early research focused on adapting existing hardware to emulate the pattern recognition capabilities of living organisms. Contributions from psychologists, mathematicians, engineers, neuroscientists, and other professions were crucial to maturing the field from narrowly-tailored demonstrations to more generalizable systems capable of addressing difficult problem classes such as object detection and speech recognition. Algorithms that leverage fundamental principles found in neuroscience such as hierarchical structure, temporal integration, and robustness to error have been developed, and some of these approaches are achieving world-leading performance on particular data classification tasks. In addition, novel microelectronic hardware is being developed to perform logic and to serve as memory in neuromorphic computing systems with optimized system integration and improved energy efficiency. Key to such advancements was the incorporation of new discoveries in neuroscience research, the transition away from strict structural replication and towards the functional replication of neural systems, and the use of mathematical theory frameworks to guide algorithm and hardware developments.

Direct simulation Monte Carlo investigation of hydrodynamic instabilities in gases

AIP Conference Proceedings

Gallis, Michail A.; Koehler, Timothy P.; Torczynski, J.R.; Plimpton, Steven J.

The Rayleigh-Taylor instability (RTI) is investigated using the Direct Simulation Monte Carlo (DSMC) method of molecular gas dynamics. Here, two-dimensional and three-dimensional DSMC RTI simulations are performed to quantify the growth of flat and single-mode-perturbed interfaces between two atmospheric-pressure monatomic gases. The DSMC simulations reproduce all qualitative features of the RTI and are in reasonable quantitative agreement with existing theoretical and empirical models in the linear, nonlinear, and self-similar regimes. At late times, the instability is seen to exhibit a self-similar behavior, in agreement with experimental observations. For the conditions simulated, diffusion can influence the initial instability growth significantly.

Resistive memory device requirements for a neural algorithm accelerator

Proceedings of the International Joint Conference on Neural Networks

Agarwal, Sapan A.; Plimpton, Steven J.; Hughart, David R.; Hsia, Alexander W.; Richter, Isaac; Cox, Jonathan A.; James, Conrad D.; Marinella, Matthew J.

Resistive memories enable dramatic energy reductions for neural algorithms. We propose a general purpose neural architecture that can accelerate many different algorithms and determine the device properties that will be needed to run backpropagation on the neural architecture. To maintain high accuracy, the read noise standard deviation should be less than 5% of the weight range. The write noise standard deviation should be less than 0.4% of the weight range and up to 300% of a characteristic update (for the datasets tested). Asymmetric nonlinearities in the change in conductance vs pulse cause weight decay and significantly reduce the accuracy, while moderate symmetric nonlinearities do not have an effect. In order to allow for parallel reads and writes the write current should be less than 100 nA as well.
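The noise limits quoted in this abstract can be made concrete with a small sketch. It is an assumption-laden illustration, not the authors' simulator: the function names, the additive Gaussian noise model, and the definition of the weight range as max minus min are all hypothetical choices. Only the 5% read and 0.4% write thresholds come from the abstract.

```python
import numpy as np

def noisy_read(W, x, read_sigma_frac=0.05, rng=None):
    """Vector matrix multiply with additive Gaussian read noise.

    read_sigma_frac is the noise standard deviation as a fraction of the
    weight range (0.05 is the 5% limit from the abstract). Applying the
    noise per element is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    w_range = W.max() - W.min()
    noise = rng.normal(0.0, read_sigma_frac * w_range, size=W.shape)
    return x @ (W + noise)

def noisy_update(W, x, d, lr=0.01, write_sigma_frac=0.004, rng=None):
    """Rank 1 outer product update with additive Gaussian write noise.

    write_sigma_frac is the noise standard deviation as a fraction of the
    weight range (0.004 is the 0.4% limit from the abstract).
    """
    rng = np.random.default_rng() if rng is None else rng
    w_range = W.max() - W.min()
    noise = rng.normal(0.0, write_sigma_frac * w_range, size=W.shape)
    return W + lr * np.outer(x, d) + noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.uniform(-1.0, 1.0, size=(4, 4))
    x = np.ones(4)
    print("ideal read:", x @ W)
    print("noisy read:", noisy_read(W, x, rng=rng))
```

Setting either fraction to zero recovers the ideal operation, which makes it easy to sweep the noise level and observe where accuracy starts to degrade for a given network.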

Direct simulation Monte Carlo investigation of the Rayleigh-Taylor instability

Physical Review Fluids

Gallis, Michail A.; Koehler, Timothy P.; Torczynski, J.R.; Plimpton, Steven J.

In this paper, the Rayleigh-Taylor instability (RTI) is investigated using the direct simulation Monte Carlo (DSMC) method of molecular gas dynamics. Here, fully resolved two-dimensional DSMC RTI simulations are performed to quantify the growth of flat and single-mode perturbed interfaces between two atmospheric-pressure monatomic gases as a function of the Atwood number and the gravitational acceleration. The DSMC simulations reproduce many qualitative features of the growth of the mixing layer and are in reasonable quantitative agreement with theoretical and empirical models in the linear, nonlinear, and self-similar regimes. In some of the simulations at late times, the instability enters the self-similar regime, in agreement with experimental observations. Finally, for the conditions simulated, diffusion can influence the initial instability growth significantly.

Increasing Molecular Dynamics Simulation Rates with an 8-Fold Increase in Electrical Power Efficiency

International Conference for High Performance Computing, Networking, Storage and Analysis, SC

Brown, W.M.; Semin, Andrey; Hebenstreit, Michael; Khvostov, Sergey; Raman, Karthik; Plimpton, Steven J.

Electrical power efficiency is a primary concern in designing modern HPC systems. Common strategies to improve CPU power efficiency rely on increased parallelism within a processor that is enabled both by an increase in the vector capabilities within the core and also the number of cores within a processor. Although many-core processors have been available for some time, achieving power-efficient performance has been challenging due to the offload model. Here, we evaluate performance of the molecular dynamics code LAMMPS on two new Intel® processors including the second generation many-core Intel® Xeon Phi™ processor that is available as a bootable CPU. We describe our approach to measure power consumption out-of-band and software optimizations necessary to achieve energy efficiency. We analyze benefits from Intel® Advanced Vector Extensions 512 instructions and demonstrate increased simulation rates with over 9X the CPU+DRAM power efficiency when compared to the unoptimized code on previous generation processors.
