IBM’s Analog AI Chip Takes New Approach to Mixed-signal Architecture
Combining the compute performance of analog with the flexibility of digital makes for a device uniquely suited to complex AI challenges.
IBM recently revealed a new analog in-memory compute chip incorporating novel analog electronics with scalable digital interfaces to realize AI building blocks. As AI models become more complex, implementing them on digital computers has become a considerable bottleneck for both speed and efficiency, a problem that IBM aims to address with its latest development.
The IBM AI analog chip includes 64 analog tiles, each of which can be used as a layer in a neural network. Image used courtesy of IBM
Matrix multiplication is an essential component of AI models, and compute-in-memory architectures perform it where the weights are stored, which avoids shuffling data to and from the processor. Despite the speed and efficiency benefits of this approach, designers must still incorporate digital electronics to tune and read the weights of the in-memory compute units, which gives the design enough versatility.
This article looks at IBM’s latest advancement to give readers a sense of how the chip's analog and digital electronics operate to improve the efficiency of AI deployments.
IBM's Chip Performs One-to-One Matrix Math
A critical limitation of existing computer systems is the von Neumann architecture, which caps performance because data must constantly move between the processor and memory. For matrix multiplication in generative AI, where thousands or millions of multiplication operations are performed, this data movement can cause significant latency and inefficiency.
Each bridge between neural network layers requires matrix multiplication, which can impose major compute requirements for large networks. IBM’s analog AI chip, however, accomplishes this nearly instantly. Image used courtesy of Towards Data Science
To circumvent this limitation, IBM has taken an analog compute-in-memory approach that eliminates expensive data transfers and discrete multiplication operations. Once the weights are programmed into a mesh of memristive elements, the chip performs matrix multiplication simply by applying the input signals and reading the analog output.
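The principle can be illustrated with a minimal sketch in plain Python. It assumes an idealized crossbar in which each weight is stored as a conductance, inputs are applied as row voltages, and each column wire sums the resulting currents (Ohm's law plus Kirchhoff's current law). The function name `crossbar_matvec` and all values are hypothetical, and the model ignores noise, wire resistance, and the encoding of negative weights that a real device has to handle.

```python
def crossbar_matvec(conductances, voltages):
    """Idealized analog crossbar: column current j = sum_i G[i][j] * V[i]."""
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("need one input voltage per crossbar row")
    currents = [0.0] * n_cols
    for i in range(n_rows):
        for j in range(n_cols):
            # Ohm's law gives each cell's current; adding it to the column
            # total models Kirchhoff's current law on the shared wire.
            currents[j] += conductances[i][j] * voltages[i]
    return currents


# Hypothetical 3x2 array of conductances (siemens) and 3 row voltages (volts)
G = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
V = [0.1, 0.2, 0.3]
I = crossbar_matvec(G, V)  # one output current (amperes) per column
```

The nested loops exist only because this is a software simulation. In the analog array, every cell contributes its current at the same time, which is why the result is available almost as soon as the inputs are applied.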
A New Kind of Mixed-signal AI Chip
Several groups have shown the feasibility of using analog computers instead of their digital counterparts to perform complex mathematical operations. Fundamentally, however, the computing world runs on digital. While analog computing can perform certain computations much faster, an analog-to-digital bridge is necessary for a product to be realistically usable.
A previous-generation analog AI chip from IBM uses a similar architecture, where an 8x8 array of PCM devices can accomplish matrix operations. Image used courtesy of arXiv
IBM aims to provide this bridge with its new chip, which consists of 64 “tiles,” each containing a 256-by-256 array of phase-change memory synaptic unit cells. Each tile also integrates analog-to-digital converters and digital processing units that perform the required scaling and activation.
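As a rough illustration of these figures, the sketch below counts the synaptic cells implied by 64 tiles of 256-by-256 arrays and models the digital step that follows an analog readout. The 8-bit converter resolution, the choice of ReLU as the activation, and the names `quantize` and `tile_postprocess` are assumptions made for this example, not details taken from IBM's design.

```python
TILES = 64
ROWS = COLS = 256

weights_per_tile = ROWS * COLS             # 65,536 unit cells per tile
total_weights = TILES * weights_per_tile   # 4,194,304 unit cells on the chip


def quantize(value, full_scale, bits=8):
    """Uniform ADC model: clip to +/- full_scale, then round to a signed code."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / full_scale * levels)


def tile_postprocess(column_outputs, full_scale, scale, bits=8):
    """Digitize each analog column output, rescale it, then apply ReLU."""
    codes = [quantize(v, full_scale, bits) for v in column_outputs]
    return [max(0.0, code * scale) for code in codes]


# Hypothetical analog column readings; the last one saturates the converter
activations = tile_postprocess([0.4, -0.25, 2.0], full_scale=1.0, scale=0.01)
```

Keeping this step digital is what makes the analog result usable by the rest of a conventional system: downstream logic only ever sees integer codes and scaled activations.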
The chip demonstrated 92.81% accuracy on the CIFAR-10 image dataset, showing that it can perform high-quality neural network tasks. In addition, throughput reached as high as 63.1 TOPS with an energy efficiency of up to 9.76 TOPS/W, so the device pairs strong compute performance with low energy use.
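To put these numbers in more familiar units, the short calculation below converts the quoted efficiency into energy per operation and estimates the power that the two peak figures would imply together. This is a back-of-the-envelope sketch: the article does not state that peak throughput and peak efficiency occur at the same operating point, so the power figure should be read as an assumption-laden estimate rather than a measured value.

```python
PEAK_THROUGHPUT_TOPS = 63.1        # tera-operations per second (from the article)
PEAK_EFFICIENCY_TOPS_PER_W = 9.76  # tera-operations per second per watt

# 1 TOPS/W equals 1e12 operations per joule, so the reciprocal of the
# efficiency figure is the energy per operation in picojoules.
energy_per_op_pj = 1.0 / PEAK_EFFICIENCY_TOPS_PER_W

# Assumption: both peak figures hold at once, which the article does not state.
implied_power_w = PEAK_THROUGHPUT_TOPS / PEAK_EFFICIENCY_TOPS_PER_W

print(f"Energy per operation: {energy_per_op_pj:.3f} pJ")
print(f"Implied power at peak: {implied_power_w:.2f} W")
```

Under these assumptions, each operation costs roughly a tenth of a picojoule, and the chip would draw on the order of a few watts at peak throughput.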
Leveraging Analog in a Digital Domain
As more complex computing requirements emerge, the most demanding of them (in this case, AI) may guide designers toward new architectures and techniques. And while analog computers are not poised to replace digital, their unique benefits combined with programmable compute-in-memory operations make them a useful tool for researchers moving forward.
The IBM analog AI chip demonstrates a critical step toward the adoption of analog computers since it narrows the gap between analog and digital. With the addition of a chip modeled after the IBM device, a digital computer could readily tune and use a complex AI model with better performance and efficiency.