This project was broadly motivated by the need for new hardware that can process information such as images and sounds right at the point of where the information is sensed (e.g. edge computing). The project was further motivated by recent discoveries by group demonstrating that while certain organic polymer blends can be used to fabricate elements of such hardware, the need to mix ionic and electronic conducting phases imposed limits on performance, dimensional scalability and the degree of fundamental understanding of how such devices operated. As an alternative to blended polymers containing distinct ionic and electronic conducting phases, in this LDRD project we have discovered that a family of mixed valence coordination compounds called Prussian blue analogue (PBAs), with an open framework structure and ability to conduct both ionic and electronic charge, can be used for inkjet-printed flexible artificial synapses that reversibly switch conductance by more than four orders of magnitude based on electrochemically tunable oxidation state. Retention of programmed states is improved by nearly two orders of magnitude compared to the extensively studied organic polymers, thus enabling in-memory compute and avoiding energy costly off-chip access during training. We demonstrate dopamine detection using PBA synapses and biocompatibility with living neurons, evoking prospective application for brain - computer interfacing. By application of electron transfer theory to in-situ spectroscopic probing of intervalence charge transfer, we elucidate a switching mechanism whereby the degree of mixed valency between N-coordinated Ru sites controls the carrier concentration and mobility, as supported by density functional theory (DFT) .
This project evaluated the use of emerging spintronic memory devices for robust and efficient variational inference schemes. Variational inference (VI) schemes, which constrain the distribution for each weight to be a Gaussian distribution with a mean and standard deviation, are a tractable method for calculating posterior distributions of weights in a Bayesian neural network such that this neural network can also be trained using the powerful backpropagation algorithm. Our project focuses on domain-wall magnetic tunnel junctions (DW-MTJs), a powerful multi-functional spintronic synapse design that can achieve low power switching while also opening the pathway towards repeatable, analog operation using fabricated notches. Our initial efforts to employ DW-MTJs as an all-in-one stochastic synapse with both a mean and standard deviation didn’t end up meeting the quality metrics for hardware-friendly VI. In the future, new device stacks and methods for expressive anisotropy modification may make this idea still possible. However, as a fall back that immediately satisfies our requirements, we invented and detailed how the combination of a DW-MTJ synapse encoding the mean and a probabilistic Bayes-MTJ device, programmed via a ferroelectric or ionically modifiable layer, can robustly and expressively implement VI. This design includes a physics-informed small circuit model, that was scaled up to perform and demonstrate rigorous uncertainty quantification applications, up to and including small convolutional networks on a grayscale image classification task, and larger (Residual) networks implementing multi-channel image classification. Lastly, as these results and ideas all depend upon the idea of an inference application where weights (spintronic memory states) remain non-volatile, the retention of these synapses for the notched case was further interrogated. These investigations revealed and emphasized the importance of both notch geometry and anisotropy modification in order to further enhance the endurance of written spintronic states. In the near future, these results will be mapped to effective predictions for room temperature and elevated operation DW-MTJ memory retention, and experimentally verified when devices become available.
Neural networks are largely based on matrix computations. During forward inference, the most heavily used compute kernel is the matrix-vector multiplication (MVM): $W \vec{x} $. Inference is a first frontier for the deployment of next-generation hardware for neural network applications, as it is more readily deployed in edge devices, such as mobile devices or embedded processors with size, weight, and power constraints. Inference is also easier to implement in analog systems than training, which has more stringent device requirements. The main processing kernel used during inference is the MVM.
We demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-To-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a > 10× gain in energy efficiency over state-of-The-Art digital and analog inference accelerators.
The core function of many neural network algorithms is the dot product, or vector matrix multiply (VMM) operation. Crossbar arrays utilizing resistive memory elements can reduce computational energy in neural algorithms by up to five orders of magnitude compared to conventional CPUs. Moving data between a processor, SRAM, and DRAM dominates energy consumption. By utilizing analog operations to reduce data movement, resistive memory crossbars can enable processing of large amounts of data at lower energy than conventional memory architectures.
Neuromorphic computing promises revolutionary improvements over conventional systems for applications that process unstructured information. To fully realize this potential, neuromorphic systems should exploit the biomimetic behavior of emerging nanodevices. In particular, exceptional opportunities are provided by the non-volatility and analog capabilities of spintronic devices. While spintronic devices that emulate neurons have been previously proposed, they require complementary metal-oxide semiconductor (CMOS) technology to function. In turn, this significantly increases the power consumption, fabrication complexity, and device area of a single neuron. This work reviews three previously proposed CMOS-free spintronic neurons designed to resolve this issue.
To support the increasing demands for efficient deep neural network processing, accelerators based on analog in-memory computation of matrix multiplication have recently gained significant attention for reducing the energy of neural network inference. However, analog processing within memory arrays must contend with the issue of parasitic voltage drops across the metal interconnects, which distort the results of the computation and limit the array size. This work analyzes how parasitic resistance affects the end-to-end inference accuracy of state-of-the-art convolutional neural networks, and comprehensively studies how various design decisions at the device, circuit, architecture, and algorithm levels affect the system's sensitivity to parasitic resistance effects. A set of guidelines are provided for how to design analog accelerator hardware that is intrinsically robust to parasitic resistance, without any explicit compensation or re-training of the network parameters.
Inspired by the parallelism and efficiency of the brain, several candidates for artificial synapse devices have been developed for neuromorphic computing, yet a nonlinear and asymmetric synaptic response curve precludes their use for backpropagation, the foundation of modern supervised learning. Spintronic devices - which benefit from high endurance, low power consumption, low latency, and CMOS compatibility - are a promising technology for memory, and domain-wall magnetic tunnel junction (DW-MTJ) devices have been shown to implement synaptic functions such as long-term potentiation and spike-timing dependent plasticity. In this work, we propose a notched DW-MTJ synapse as a candidate for supervised learning. Using micromagnetic simulations at room temperature, we show that notched synapses ensure the non-volatility of the synaptic weight and allow for highly linear, symmetric, and reproducible weight updates using either spin transfer torque (STT) or spin-orbit torque (SOT) mechanisms of DW propagation. We use lookup tables constructed from micromagnetics simulations to model the training of neural networks built with DW-MTJ synapses on both the MNIST and Fashion-MNIST image classification tasks. Accounting for thermal noise and realistic process variations, the DW-MTJ devices achieve classification accuracy close to ideal floating-point updates using both STT and SOT devices at room temperature and at 400 K. Our work establishes the basis for a magnetic artificial synapse that can eventually lead to hardware neural networks with fully spintronic matrix operations implementing machine learning.
We evaluate the sensitivity of neuromorphic inference accelerators based on silicon-oxide-nitride-oxide-silicon (SONOS) charge trap memory arrays to total ionizing dose (TID) effects. Data retention statistics were collected for 16 Mbit of 40-nm SONOS digital memory exposed to ionizing radiation from a Co-60 source, showing good retention of the bits up to the maximum dose of 500 krad(Si). Using this data, we formulate a rate-equation-based model for the TID response of trapped charge carriers in the ONO stack and predict the effect of TID on intermediate device states between 'program' and 'erase.' This model is then used to simulate arrays of low-power, analog SONOS devices that store 8-bit neural network weights and support in situ matrix-vector multiplication. We evaluate the accuracy of the irradiated SONOS-based inference accelerator on two image recognition tasks - CIFAR-10 and the challenging ImageNet data set - using state-of-the-art convolutional neural networks, such as ResNet-50. We find that across the data sets and neural networks evaluated, the accelerator tolerates a maximum TID between 10 and 100 krad(Si), with deeper networks being more susceptible to accuracy losses due to TID.
We evaluate the resilience of CoFeB/MgO/CoFeB magnetic tunnel junctions (MTJs) with perpendicular magnetic anisotropy (PMA) to displacement damage induced by heavy-ion irradiation. MTJs were exposed to 3-MeV Ta2+ ions at different levels of ion beam fluence spanning five orders of magnitude. The devices remained insensitive to beam fluences up to $10^{11}$ ions/cm2, beyond which a gradual degradation in the device magnetoresistance, coercive magnetic field, and spin-transfer-torque (STT) switching voltage were observed, ending with a complete loss of magnetoresistance at very high levels of displacement damage (>0.035 displacements per atom). The loss of magnetoresistance is attributed to structural damage at the MgO interfaces, which allows electrons to scatter among the propagating modes within the tunnel barrier and reduces the net spin polarization. Ion-induced damage to the interface also reduces the PMA. This study clarifies the displacement damage thresholds that lead to significant irreversible changes in the characteristics of STT magnetic random access memory (STT-MRAM) and elucidates the physical mechanisms underlying the deterioration in device properties.
In-memory computing based on non-volatile resistive memory can significantly improve the energy efficiency of artificial neural networks. However, accurate in situ training has been challenging due to the nonlinear and stochastic switching of the resistive memory elements. One promising analog memory is the electrochemical random-access memory (ECRAM), also known as the redox transistor. Its low write currents and linear switching properties across hundreds of analog states enable accurate and massively parallel updates of a full crossbar array, which yield rapid and energy-efficient training. While simulations predict that ECRAM based neural networks achieve high training accuracy at significantly higher energy efficiency than digital implementations, these predictions have not been experimentally achieved. In this work, we train a 3 × 3 array of ECRAM devices that learns to discriminate several elementary logic gates (AND, OR, NAND). We record the evolution of the network’s synaptic weights during parallel in situ (on-line) training, with outer product updates. Due to linear and reproducible device switching characteristics, our crossbar simulations not only accurately simulate the epochs to convergence, but also quantitatively capture the evolution of weights in individual devices. The implementation of the first in situ parallel training together with strong agreement with simulation results provides a significant advance toward developing ECRAM into larger crossbar arrays for artificial neural network accelerators, which could enable orders of magnitude improvements in energy efficiency of deep neural networks.
Brigner, Wesley H.; Hassan, Naimul H.; Bennett, Christopher H.; Hu, Xuan H.; Velasquez, Alvaro V.; Garcia-Sanchez, Felipe G.; Marinella, Matthew J.; Incorvia, Jean A.; Friedman, Joseph S.
Neuromorphic computing with spintronic devices has been of interest due to the limitations of CMOS-driven von Neumann computing. Domain wall-magnetic tunnel junction (DW-MTJ) devices have been shown to be able to intrinsically capture biological neuron behavior. Edgy-relaxed behavior, where a frequently firing neuron experiences a lower action potential threshold, may provide additional artificial neuronal functionality when executing repeated tasks. In this letter, we demonstrate that this behavior can be implemented in DW-MTJ artificial neurons via three alternative mechanisms: shape anisotropy, magnetic field, and current-driven soft reset. Using micromagnetics and analytical device modeling to classify the Optdigits handwritten digit dataset, we show that edgy-relaxed behavior improves both classification accuracy and classification rate for ordered datasets while sacrificing little to no accuracy for a randomized dataset. This letter establishes methods by which artificial spintronic neurons can be flexibly adapted to datasets.
Digital computing is nearing its physical limits as computing needs and energy consumption rapidly increase. Analogue-memory-based neuromorphic computing can be orders of magnitude more energy efficient at data-intensive tasks like deep neural networks, but has been limited by the inaccurate and unpredictable switching of analogue resistive memory. Filamentary resistive random access memory (RRAM) suffers from stochastic switching due to the random kinetic motion of discrete defects in the nanometer-sized filament. In this work, this stochasticity is overcome by incorporating a solid electrolyte interlayer, in this case, yttria-stabilized zirconia (YSZ), toward eliminating filaments. Filament-free, bulk-RRAM cells instead store analogue states using the bulk point defect concentration, yielding predictable switching because the statistical ensemble behavior of oxygen vacancy defects is deterministic even when individual defects are stochastic. Both experiments and modeling show bulk-RRAM devices using TiO2-X switching layers and YSZ electrolytes yield deterministic and linear analogue switching for efficient inference and training. Bulk-RRAM solves many outstanding issues with memristor unpredictability that have inhibited commercialization, and can, therefore, enable unprecedented new applications for energy-efficient neuromorphic computing. Beyond RRAM, this work shows how harnessing bulk point defects in ionic materials can be used to engineer deterministic nanoelectronic materials and devices.
Brigner, Wesley H.; Hassan, Naimul H.; Friedman, Joseph S.; Velasquez, Alvaro V.; Garcia-Sanchez, Felipe G.; Akinola, Otitoaleke G.; Incorvia, Jean A.; Bennett, Christopher H.; Marinella, Matthew J.
Analog hardware accelerators, which perform computation within a dense memory array, have the potential to overcome the major bottlenecks faced by digital hardware for data-heavy workloads such as deep learning. Exploiting the intrinsic computational advantages of memory arrays, however, has proven to be challenging principally due to the overhead imposed by the peripheral circuitry and due to the non-ideal properties of memory devices that play the role of the synapse. We review the existing implementations of these accelerators for deep supervised learning, organizing our discussion around the different levels of the accelerator design hierarchy, with an emphasis on circuits and architecture. We explore and consolidate the various approaches that have been proposed to address the critical challenges faced by analog accelerators, for both neural network inference and training, and highlight the key design trade-offs underlying these techniques.
Brigner, Wesley H.; Hu, Xuan; Hassan, Naimul; Jiang-Wei, Lucian; Bennett, Christopher H.; Garcia-Sanchez, Felipe; Akinola, Otitoaleke; Pasquale, Massimo; Marinella, Matthew J.; Incorvia, Jean A.; Friedman, Joseph S.
Due to their nonvolatility and intrinsic current integration capabilities, spintronic devices that rely on domain wall (DW) motion through a free ferromagnetic track have garnered significant interest in the field of neuromorphic computing. Although a number of such devices have already been proposed, they require the use of external circuitry to implement several important neuronal behaviors. As such, they are likely to result in either a decrease in energy efficiency, an increase in fabrication complexity, or even both. To resolve this issue, we have proposed three individual neurons that are capable of performing these functionalities without the use of any external circuitry. To implement leaking, the first neuron uses a dipolar coupling field, the second uses an anisotropy gradient and the third uses shape variations of the DW track.
Lateral inhibition is an important functionality in neuromorphic computing, modeled after the biological neuron behavior that a firing neuron deactivates its neighbors belonging to the same layer and prevents them from firing. In most neuromorphic hardware platforms lateral inhibition is implemented by external circuitry, thereby decreasing the energy efficiency and increasing the area overhead of such systems. Recently, the domain wall - magnetic tunnel junction (DW-MTJ) artificial neuron is demonstrated in modeling to be intrinsically inhibitory. Without peripheral circuitry, lateral inhibition in DW-MTJ neurons results from magnetostatic interaction between neighboring neuron cells. However, the lateral inhibition mechanism in DW-MTJ neurons has not been studied thoroughly, leading to weak inhibition only in very closely-spaced devices. This work approaches these problems by modeling current- and field- driven DW motion in a pair of adjacent DW-MTJ neurons. We maximize the magnitude of lateral inhibition by tuning the magnetic interaction between the neurons. The results are explained by current-driven DW velocity characteristics in response to an external magnetic field and quantified by an analytical model. Dependence of lateral inhibition strength on device parameters is also studied. Finally, lateral inhibition behavior in an array of 1000 DW-MTJ neurons is demonstrated. Our results provide a guideline for the optimization of lateral inhibition implementation in DW-MTJ neurons. With strong lateral inhibition achieved, a path towards competitive learning algorithms such as the winner-take-all are made possible on such neuromorphic devices.
Neuromorphic computing captures the quintessential neural behaviors of the brain and is a promising candidate for the beyond-von Neumann computer architectures, featuring low power consumption and high parallelism. The neuronal lateral inhibition feature, closely associated with the biological receptive field, is crucial to neuronal competition in the nervous system as well as its neuromorphic hardware counterpart. The domain wall - magnetic tunnel junction (DW-MTJ) neuron is an emerging spintronic artificial neuron device exhibiting intrinsic lateral inhibition. This work discusses lateral inhibition mechanism of the DW-MTJ neuron and shows by micromagnetic simulation that lateral inhibition is efficiently enhanced by the Dzyaloshinskii-Moriya interaction (DMI).