Lightelligence Reports ‘World’s First’ Optical Network-on-Chip Processor
The new processor is said to use optical networking to overcome scaling challenges in conventional computing architecture.
Artificial intelligence (AI) and machine learning (ML) require real-time, parallel computations on huge amounts of data. These workloads exacerbate the memory bottleneck of classical, general-purpose CPUs from both a latency and an energy perspective.
To overcome these challenges, many new players in the industry are turning toward novel technologies for the future of AI/ML computing. Recently, Lightelligence made waves in the industry when it announced a new AI/ML accelerator that leverages an optical network-on-chip (NoC).
Lightelligence says its new Hummingbird oNoC processor is the first of its kind designed for domain-specific AI workloads. Image courtesy of Lightelligence
In this piece, we’ll look at challenges with conventional multicore AI/ML processors, the novel computing architecture developed by Lightelligence, and the company's newest ASIC: the Hummingbird.
NoCs and Multicore Challenges
AI/ML computation relies on specific mathematical operations, such as multiply-and-accumulate (MAC) functions and convolutions, applied to large amounts of data simultaneously. Because of this, standard AI/ML processing hardware tends to consist of multicore and heterogeneous systems.
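To make the workload concrete, here is a minimal sketch of the MAC operation mentioned above: each output is an accumulated sum of input-weight products, and AI accelerators perform huge numbers of these in parallel.

```python
def mac(inputs, weights, acc=0.0):
    """Multiply-and-accumulate: sum the products of paired inputs and weights."""
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc

# A single neuron's contribution is one MAC over its inputs.
print(mac([1.0, 2.0, 3.0], [0.5, 0.25, 0.125]))  # 1.375
```

A convolution is built from many such MACs, one per output element, which is why accelerators dedicate arrays of MAC units to the task.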
An example of a heterogeneous computing architecture. Image courtesy of Routledge Handbooks Online
In a multicore system, a single piece of hardware will consist of many cores to process data in parallel (such as a GPU). In a heterogeneous system, like an SoC, a single chip will feature a large number of different computing blocks, including accelerators for MAC functions, GPUs, and general-purpose CPUs. Here, different blocks on the SoC will handle different tasks to reduce power consumption and speed up overall computation for an ML model.
Regardless of which architecture is employed, the one constant between multicore and heterogeneous systems is the need for data movement. Whether data is moving between several processing cores or in and out of memory, high-speed computing applications tend to implement a network-on-chip to speed up data transfer between endpoints.
Different NoC architectures and configurations. Image courtesy of ResearchGate
However, electrical interconnects impose physical limits on bandwidth. As a result, NoCs are also limited in the topologies they can achieve, preventing ASICs from reaching their maximum performance.
Lightelligence’s oNoC Architecture
For Lightelligence, the key to enabling better-performing AI/ML accelerators is to enable new NoC topologies that maximize speed and decrease power consumption. Since conventional electrical NoCs won’t cut it, the company instead turned to optical NoCs (oNoCs) as the solution.
Lightelligence's computing architecture consists of three major components: an electronic integrated circuit (EIC), an interposer, and a photonic integrated circuit (PIC).
A cross-sectional view of Lightelligence’s stacked architecture. Image courtesy of Lightelligence
The EIC implements the system's digital domain, including the ALUs, memory, and analog interface. The interposer connects the EIC and PIC and delivers power to both domains. The PIC hosts the oNoC, which uses optical networking to interconnect the processing cores in an all-to-all broadcast scheme. This scheme is said to allow all cores to access data simultaneously.
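The all-to-all broadcast idea can be modeled in a few lines. This is an illustrative toy (an assumption for exposition, not Lightelligence's actual protocol): each core transmits once, and every other core receives that value in the same step, with no hop-by-hop forwarding as in an electrical mesh.

```python
def broadcast_step(cores):
    """Return, for each core, the values every other core transmitted this step."""
    return [[v for j, v in enumerate(cores) if j != i]
            for i in range(len(cores))]

# Four cores each broadcast one value; all cores see all other values at once.
received = broadcast_step([10, 20, 30, 40])
print(received)  # [[20, 30, 40], [10, 30, 40], [10, 20, 40], [10, 20, 30]]
```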
Lightelligence’s oNoC connects EICs with optical networking. Image courtesy of Lightelligence
On a lower level, the interposer contains photonic routing waveguides that act as data communication highways between EICs. Each EIC is stacked on top of a PIC, connected via micro-bumps, to form a 2D array. Light from a laser source routes through the waveguides; electrical data is encoded onto it by modulating the light's intensity. To do this, the analog interface on each EIC couples with the photonic interposer and alters the refractive index of the silicon waveguide, physically modulating the light's intensity. To convert the signal back into a bitstream, the EIC hosts photodiodes that translate the light pulses into electrical current for use in the digital domain.
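The encode/decode path described above amounts to on-off intensity keying plus threshold detection. The following is a toy model (the levels and threshold are illustrative assumptions, not Lightelligence's signaling scheme): bits map to high or low light intensity, and a photodiode-style detector recovers them by thresholding.

```python
HIGH, LOW, THRESHOLD = 1.0, 0.1, 0.5  # assumed intensity levels (arbitrary units)

def modulate(bits):
    """Encode a bitstream as light-intensity levels (on-off keying)."""
    return [HIGH if b else LOW for b in bits]

def detect(intensities):
    """Recover bits by thresholding the received intensity, as a photodiode front end would."""
    return [1 if i > THRESHOLD else 0 for i in intensities]

bits = [1, 0, 1, 1, 0]
assert detect(modulate(bits)) == bits  # lossless round trip in this ideal channel
```

A real link would contend with noise, loss, and much denser modulation formats; the point here is only the intensity-to-bitstream round trip.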
The major benefit of optical interconnects is that they operate at significantly higher speeds and lower power consumption than what's possible with electrical NoCs. With near-zero latency, the oNoC enables new NoC topologies, such as a torus, that are not otherwise practical.
Hummingbird oNoC Processor
Recently, Lightelligence announced its new Hummingbird processor—the first product to feature its oNoC architecture.
Hummingbird is an AI/ML accelerator consisting of 64 cores, all connected to one another via the oNoC. With 64 transmitters and 512 receivers, the Hummingbird is a single-instruction, multiple-data (SIMD) solution with its own proprietary ISA.
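For readers unfamiliar with the term, the SIMD model means one instruction is applied to every lane of data at once. This is a minimal sketch of the idea (not Hummingbird's actual ISA, which is proprietary); on the hardware, the lanes would be spread across its cores.

```python
def simd_add(lanes_a, lanes_b):
    """Apply a single 'add' instruction across all data lanes simultaneously."""
    return [a + b for a, b in zip(lanes_a, lanes_b)]

# One instruction, four data lanes.
print(simd_add([1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```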
The Hummingbird processor stack-up. Image courtesy of Lightelligence
While performance numbers aren’t available, the company claims that the solution offers lower latency and power consumption than anything else available. Specifically, the solution’s oNoC is said to achieve an energy efficiency below 1 pJ/bit.
As it stands, the Hummingbird will be implemented in a PCIe form factor for standard servers. The accelerator will be programmable via Lightelligence’s own SDK, which offers support for TensorFlow. First demonstrations of the chip will occur at this year’s Hot Chips conference at the end of August.