Researchers Worldwide Come Together on “NeuRRAM” Neuromorphic Chip
The teams have developed the first compute-in-memory chip to tackle a range of AI applications at lower energy and higher accuracy than other platforms.
The holy grail of edge AI computing is a chip that offers high efficiency, performance, and versatility simultaneously. Obtaining all three has historically posed a significant challenge for designers, and as such, many have started considering new computing architectures altogether.
One of these new architectures is in-memory computing, which aims to eliminate the data movement bottleneck to achieve higher efficiency and better performance than conventional digital processing units. This week, a group of international researchers published a paper describing a new compute-in-memory chip based on resistive random-access memory (RRAM).
The NeuRRAM neuromorphic chip. Image used courtesy of UCSD
In this article, we’ll discuss RRAM for compute-in-memory (CIM), the historical shortcomings of these solutions, and the group’s new "NeuRRAM" neuromorphic chip.
Resistive RAM for Compute in Memory
Within the past 30 years, designers have been investigating the idea of compute in memory—and more recently, compute in memory based on resistive RAM.
RRAM CIM removes the von Neumann bottleneck, a consequence of keeping memory and compute separate, by merging the two. In this architecture, resistive RAM elements serve as memory storage, with binary digits stored in the resistive state of the RRAM material in each cell. Applying a programming voltage switches a cell between a high-resistance state and a low-resistance state, one representing a digital 0 and the other a 1. Reading a bit back out is achieved by applying a small read voltage to the cell and sensing the resulting current, which varies with the cell's resistive state.
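The read operation described above can be sketched in a few lines. This is an illustrative model only, not device data: the resistance values, read voltage, and sense threshold below are made-up placeholders, and the bit-to-state mapping is one common convention.

```python
# Illustrative sketch of an RRAM read: apply a voltage, sense the current,
# and infer the stored bit from the cell's resistive state.
# All numbers are placeholders, not measured device characteristics.
HIGH_RES = 1e6          # ohms, high-resistance state (here read as a 0)
LOW_RES = 1e3           # ohms, low-resistance state (here read as a 1)
READ_VOLTAGE = 0.2      # volts, small enough not to disturb the stored state
CURRENT_THRESHOLD = 1e-5  # amps, sense-amplifier decision point

def read_bit(resistance_ohms: float) -> int:
    """Infer the stored bit from the current sensed at a fixed read voltage."""
    current = READ_VOLTAGE / resistance_ohms  # Ohm's law: I = V / R
    return 1 if current > CURRENT_THRESHOLD else 0

print(read_bit(LOW_RES))   # low resistance -> large current -> 1
print(read_bit(HIGH_RES))  # high resistance -> tiny current -> 0
```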
Example of multiplication using an RRAM CIM cell. Image used courtesy of SemiWiki
RRAM is an extremely power-efficient, compact, and non-volatile form of memory. The architecture is also a natural fit for artificial intelligence workloads, which lean heavily on multiply-and-accumulate (MAC) operations that RRAM implements almost for free: because values are read out by sensing currents, multiplication follows from Ohm's law (input voltage times cell conductance), and accumulation follows from Kirchhoff's current law (currents summing at a junction or along a series of junctions).
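The analog MAC idea can be captured in a short sketch. This is a simplified mathematical model, not the chip's circuitry: weights are stored as conductances (G = 1/R), and the voltage and conductance values below are illustrative, with units exaggerated for readability.

```python
# Sketch of an analog multiply-accumulate along one RRAM bitline.
# Each row contributes a current I = V * G (Ohm's law), and the shared
# bitline sums those currents (Kirchhoff's current law), yielding a dot product.
def rram_mac(input_voltages, conductances):
    """Analog dot product: total bitline current = sum of V_i * G_i."""
    return sum(v * g for v, g in zip(input_voltages, conductances))

# Illustrative values: two weights stored as conductances [2.0, 3.0],
# driven by input voltages [1.0, 0.5].
result = rram_mac([1.0, 0.5], [2.0, 3.0])
print(result)  # 1.0*2.0 + 0.5*3.0 = 3.5
```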
The Shortcomings of RRAM CIM
Despite the benefits of RRAM CIM, the research and development of this technology have still been fraught with obstacles.
For one, much of the early research focused on performing the AI computation on the RRAM chip but still relied on off-chip resources for other essential functions, such as analog-to-digital conversion and neuron activations. Not only did this limit system performance, but it also skewed benchmarking: historically, results were based on software emulation of device characteristics, which is almost always optimistic.
Beyond this, there are inherent tradeoffs between energy efficiency, versatility, and accuracy within an RRAM CIM device. According to a group of researchers from the University of California San Diego (UCSD), Stanford University, Tsinghua University, and the University of Notre Dame, no previous work had attempted to optimize for all three criteria simultaneously.
NeuRRAM Hits on Efficiency, Accuracy, Flexibility
This week, those researchers from UCSD, Stanford, Tsinghua, and Notre Dame published a report in Nature describing an RRAM CIM chip they call NeuRRAM.
The NeuRRAM neuromorphic chip is said to achieve its combination of efficiency, accuracy, and flexibility largely through its output sensing method. Where conventional designs read out a current, NeuRRAM uses a neuron circuit that senses voltage and performs efficient analog-to-digital conversion entirely on-chip.
Block diagram of a CIM core and the NeuRRAM architecture. Image used courtesy of Nature and Wan et al
The architecture places CMOS neuron circuits alongside the RRAM bit cells. A neuromorphic AI chip, NeuRRAM consists of 48 neurosynaptic cores, each pairing 256 CMOS neurons with 65,536 RRAM cells. The cores compute in parallel and can support both data and model parallelism, allowing different model layers to be mapped to different cores for maximum versatility.
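The layer-to-core mapping can be sketched as follows. This is a hypothetical illustration of the idea, not the authors' actual scheduler: the round-robin assignment and the layer names are assumptions made here for the example.

```python
# Hypothetical sketch: spreading a network's layers across NeuRRAM's
# 48 neurosynaptic cores so that different layers occupy different cores
# and can be processed in parallel. Not the authors' actual mapping scheme.
NUM_CORES = 48

def map_layers_to_cores(layer_names):
    """Return a {layer_name: core_index} assignment, round-robin over cores."""
    return {name: i % NUM_CORES for i, name in enumerate(layer_names)}

layers = ["conv1", "conv2", "conv3", "fc1"]  # illustrative layer names
print(map_layers_to_cores(layers))
# each layer lands on its own core as long as there are <= 48 layers
```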
The team of international researchers claims the chip achieves an energy-delay product (EDP) up to 2.3x lower than that of conventional digital processors, along with a computational density up to 13x higher. NeuRRAM is also reported to reach 99% accuracy on handwritten digit recognition, 85.7% on image classification, and 84.7% on speech recognition.
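For readers unfamiliar with the metric, the energy-delay product simply multiplies energy per operation by time per operation, so a lower EDP rewards chips that are both efficient and fast. The numbers below are made-up placeholders chosen only to illustrate the claimed up-to-2.3x ratio, not measurements from the paper.

```python
# Sketch of the energy-delay product (EDP) metric; lower is better.
# Placeholder numbers only, not figures from the Nature paper.
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """EDP = energy per operation * time per operation."""
    return energy_joules * delay_seconds

digital_edp = energy_delay_product(1.0e-12, 1.0e-9)  # hypothetical baseline
neurram_edp = digital_edp / 2.3  # the claimed up-to-2.3x improvement

print(digital_edp / neurram_edp)  # ratio of the two EDPs
```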
Altogether, the study suggests that this chip matches the accuracy of conventional digital chips but with significantly less energy expenditure and higher density. The researchers, who designed this chip with edge computing in mind, claim that the low power and high performance of NeuRRAM may enable a new class of devices that are currently not feasible with existing technology.