In an Industry First, Broadcom Puts Neural Network Onto a Switch

December 06, 2023 by Arjun Nijhawan

Broadcom has released StrataXGS Trident 5 BCM78800, a family of Ethernet switch chips with an on-chip neural-network inference engine called NetGNT.

As AI booms, Broadcom has sought to capitalize on the exploding demands of data centers by releasing a top-of-rack (TOR), 800-G-capable family of Ethernet switches called the Trident 5-X12. According to the press release, Trident 5 is the “world's first” switch to include an on-chip neural-network inference engine.



The Trident 5-X12. Image (modified) used courtesy of Broadcom

Key Features of Broadcom's Trident 5 TOR Switch

The Trident 5-X12, a continuation of Broadcom’s Trident series of Ethernet switches, differs in significant ways from its predecessor, Trident 4. For starters, the Trident 4 was implemented using a 7 nm manufacturing process, while the new Trident 5 uses a 5 nm manufacturing process. Smaller process nodes generally reduce the capacitance that must be charged and discharged each time a transistor switches, so the chip needs less energy to do the same work. In fact, Broadcom claims the new Trident 5 uses 25% less power than Trident 4.

The Trident 5-X12 supports 160 instances of a 100-G PAM4 SERDES core, a significant upgrade from Trident 4’s 50-G PAM4 core. This 100-G core has a reach of up to 4 meters. Since 160 x 100 G = 16,000 Gbps, the total bandwidth of the Trident 5-X12 is 16 Terabits/second. The Trident 5-X12 also enables 800-G Ethernet. For example, a customer could connect 24 x 400 G + 8 x 800 G for a total bandwidth of 16 Terabits/second.
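The bandwidth arithmetic above can be checked with a short Python sketch. The lane count, lane rate, and port mix come from the article; the helper names are hypothetical, and the assumption that each port is built from 100-G lanes (four per 400-G port, eight per 800-G port) is ours rather than something the press release spells out.

```python
# Back-of-the-envelope check of the bandwidth figures quoted in the article.
SERDES_LANES = 160      # number of SERDES instances (from the article)
LANE_RATE_GBPS = 100    # per-lane rate in Gbps (from the article)


def total_bandwidth_gbps(lanes, lane_rate_gbps):
    """Aggregate bandwidth if every lane runs at full rate."""
    return lanes * lane_rate_gbps


def port_mix_bandwidth_gbps(port_mix):
    """port_mix maps port speed in Gbps to the number of such ports."""
    return sum(speed * count for speed, count in port_mix.items())


def lanes_used(port_mix, lane_rate_gbps):
    """Lanes consumed, ASSUMING each port is built only from 100-G lanes."""
    return sum((speed // lane_rate_gbps) * count
               for speed, count in port_mix.items())


chip_gbps = total_bandwidth_gbps(SERDES_LANES, LANE_RATE_GBPS)
example_mix = {400: 24, 800: 8}   # the 24 x 400 G + 8 x 800 G example
mix_gbps = port_mix_bandwidth_gbps(example_mix)

print(chip_gbps / 1000, "Tbps from the SERDES lanes")
print(mix_gbps / 1000, "Tbps from the example port mix")
print(lanes_used(example_mix, LANE_RATE_GBPS), "lanes used by the example mix")
```

Under that lane assumption, the example port mix consumes exactly the 160 available lanes, which is why both calculations land on the same 16 Terabits/second.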



Trident 5 block diagram. Image used courtesy of Broadcom

In contrast to non-return-to-zero (NRZ), Pulse Amplitude Modulation 4-Level (PAM4) is an encoding scheme used in SERDES to transmit double the information using the same bandwidth. In NRZ, high and low voltage levels are used to indicate a “1” or “0”, so each symbol carries one bit. In PAM4, four voltage levels are used, so each symbol carries two bits of information: 00, 01, 10, or 11.
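To make the difference concrete, here is a minimal Python sketch that encodes the same bit stream both ways. The normalized levels (-1/+1 for NRZ, -3/-1/+1/+3 for PAM4) and the Gray-coded bit-pair mapping are illustrative assumptions; the article does not say which mapping Broadcom's SERDES uses.

```python
# Illustrative symbol mappings; levels are normalized, not real voltages.
NRZ_LEVELS = {0: -1, 1: +1}

# ASSUMPTION: Gray-coded mapping, so adjacent levels differ by one bit.
PAM4_LEVELS = {
    (0, 0): -3,
    (0, 1): -1,
    (1, 1): +1,
    (1, 0): +3,
}


def nrz_encode(bits):
    """One symbol per bit."""
    return [NRZ_LEVELS[b] for b in bits]


def pam4_encode(bits):
    """One symbol per pair of bits."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]


bits = [0, 0, 0, 1, 1, 1, 1, 0]
nrz_symbols = nrz_encode(bits)
pam4_symbols = pam4_encode(bits)

print(len(nrz_symbols), "NRZ symbols:", nrz_symbols)
print(len(pam4_symbols), "PAM4 symbols:", pam4_symbols)
```

The same eight bits take eight NRZ symbols but only four PAM4 symbols, which is where the doubling of throughput at a given symbol rate comes from.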



NRZ vs. PAM4 voltage levels. Image used courtesy of Samtec

As shown in the above figure, however, PAM4 has smaller eye openings because the same voltage swing is divided among four levels instead of two. Therefore, it is more susceptible to noise and reflections on the transmission medium itself, making PAM4 designs more complex and costly than their NRZ counterparts.
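A rough sense of how much smaller the eye openings are can be had from an idealized calculation. The sketch below assumes equally spaced levels and the same peak-to-peak swing for both schemes, and it ignores jitter, inter-symbol interference, and forward error correction, so it is a first-order estimate rather than a link budget. The function names are hypothetical.

```python
import math


def eye_height(swing, levels):
    """Vertical eye opening for equally spaced levels over a given swing."""
    return swing / (levels - 1)


def snr_penalty_db(levels):
    """Amplitude penalty relative to a two-level (NRZ) signal, in dB."""
    return 20 * math.log10(levels - 1)


swing = 1.0  # normalized peak-to-peak swing (hypothetical value)
print("NRZ eye height: ", eye_height(swing, 2))
print("PAM4 eye height:", round(eye_height(swing, 4), 3))
print("PAM4 penalty:   ", round(snr_penalty_db(4), 2), "dB")
```

In this idealized model each PAM4 eye is one third the height of the NRZ eye, a penalty of roughly 9.5 dB.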


The Benefits of a Top-of-Rack Switch

Data centers employ two key design architectures: top-of-rack (TOR) and end-of-row (EOR). In a TOR approach, each server rack can be considered a separate entity, with only the TOR switches connected to a centralized aggregation switch. In an EOR approach, the servers connect to a shared end-of-row switch, which in turn connects to the aggregation switch. Instead of each rack being considered a separate entity, an EOR architecture treats the entire row of racks as a single entity. This requires more cabling but fewer devices overall, since the per-rack TOR switch is eliminated.
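The cabling-versus-devices trade-off can be sketched with a toy model. Everything numeric here is hypothetical (10 racks, 40 servers per rack, 2 uplinks per TOR switch), and the model counts only access switches and the cables that leave a rack; real deployments involve redundancy, patch panels, and other details this ignores.

```python
from dataclasses import dataclass


@dataclass
class Row:
    racks: int
    servers_per_rack: int
    uplinks_per_tor: int = 2   # hypothetical uplink count per TOR switch


def tor_counts(row):
    """TOR: one switch per rack; only the uplinks leave the rack."""
    return {
        "access_switches": row.racks,
        "cables_leaving_rack": row.racks * row.uplinks_per_tor,
    }


def eor_counts(row):
    """EOR: one shared switch per row; every server cable leaves its rack."""
    return {
        "access_switches": 1,
        "cables_leaving_rack": row.racks * row.servers_per_rack,
    }


row = Row(racks=10, servers_per_rack=40)   # hypothetical row
tor = tor_counts(row)
eor = eor_counts(row)
print("TOR:", tor)
print("EOR:", eor)
```

With these made-up numbers, TOR needs ten access switches but only 20 cables between racks, while EOR needs a single access switch but 400 cables running to the end of the row.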



TOR architecture. Image used courtesy of FS Community


StrataXGS Trident 5-X12 is a TOR switch, allowing existing data centers to upgrade easily from its predecessors. Because a TOR design doesn't require as much cabling and each rack is a separate entity, upgrading existing switches tends to be easier than in an EOR architecture. One downside is that TOR architectures tend to be more power-intensive than EOR architectures, since each rack requires its own TOR switch to communicate with the centralized aggregation switch.


“World's First” On-Chip Neural Inference Engine

In addition to the 100-G PAM4 SERDES, one of the most significant additions to the Trident 5-X12 is the on-chip NetGNT neural-network inference engine. The ingress pipeline parses the incoming data and forwards it. NetGNT works in parallel with the ingress pipeline and invokes congestion-control protocols if it detects certain traffic patterns. By monitoring incoming traffic and pre-emptively tackling network congestion, NetGNT is designed to improve network efficiency and performance.

Broadcom joins a growing number of semiconductor companies developing Ethernet solutions targeted at AI workloads for data centers, and future product releases may target even higher data rates to keep up with growing demand.