Intel Rolls First FPGA with PCIe 5.0 and Compute Express Link

May 25, 2023 by Chantelle Dubois

By embedding PCI Express 5.0 and CXL support, Intel’s new FPGAs seek to tackle high-bandwidth computing workloads.

Even with today’s lightning-fast processors, high-speed computing depends on moving data around a system just as quickly. That’s why high-speed serial interconnects like PCI Express (PCIe) and Compute Express Link (CXL) are so critical.

Along those lines, Intel’s Programmable Solutions Group announced this week that the Intel Agilex 7 I-Series FPGA is now in production. The company claims it is the first FPGA with PCIe 5.0 and CXL capabilities, which are provided by its R-Tile chiplet through a mix of hard and soft intellectual property (IP).
 

Intel’s Agilex I-Series FPGAs are PCIe 5.0 and CXL capable. Image used courtesy of Intel

 

The FPGA has a heterogeneous multi-die architecture within a single device, with a 10 nm SuperFin FPGA fabric at the center. The fabric is connected to the R-Tile via Intel’s proprietary Embedded Multi-Die Interconnect Bridge (EMIB).

With the R-Tile, the FPGA can connect to other processors over a high-bandwidth interface, and it has been optimized for use with Intel’s 4th Gen Xeon processors. Other features include:

  • 1.9 M to 4 M logic elements
  • Up to 116 Gbps transceiver rates
  • PCIe and CXL support
  • DDR4 interfaces
  • Optional quad-core Arm Cortex-A53-based SoC

Additionally, Intel claims that its 10 nm SuperFin FPGAs deliver approximately 2X the fabric performance per watt of competing 7 nm FPGAs. The company also suggests that the devices have an advantage in the supply chain, which may be due to the added complexity of manufacturing 7 nm devices.

Intel’s Agilex 7 I-Series is part of the Agilex 7 product line, which also includes F-Series and M-Series configurations. The I-Series provides PCIe and CXL support, the F-Series provides digital signal processing (DSP) and crypto blocks, and the M-Series provides high-bandwidth memory (HBM), DDR5 SDRAM, and network-on-chip (NoC) capabilities.

Intel envisions these devices being used in data centers, the financial services industry, or in telecommunications.

 

What is R-Tile?

Intel describes R-Tile as a “companion tile” that can support PCIe 3.0, 4.0, and 5.0 configurations as well as CXL 1.1, 2.0 (with a device coherent agent), and 3.0. Additionally, the R-Tile can provide Root Port (RP), Endpoint (EP), and Transaction Layer Packet (TLP) bypass modes.

 

R-Tile block diagram. Image used courtesy of Intel

 

Intel claims that the Intel Agilex 7 FPGA with R-Tile is the only PCI-SIG compliant device with full PCIe 5.0 x16 data rates. Some technical features:

  • Up to 32 GT/s per lane
  • 1x16 Endpoint/Root Port modes
  • 2x8 Endpoint/Root Port modes
  • 4x4 Root Port modes
  • Virtualization
  • Precision Time Measurement options
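To put the 32 GT/s figure in context, the per-direction bandwidth of each link configuration can be estimated with some quick arithmetic. The sketch below is a rough illustration rather than a figure from Intel: it assumes PCIe 5.0's 128b/130b line encoding and ignores packet and protocol overhead, and the function and variable names are made up for this example.

```python
# Back-of-the-envelope PCIe 5.0 link bandwidth, per direction.
# Assumption: 128b/130b line encoding; packet/protocol overhead is ignored.

GT_PER_S = 32          # raw transfer rate per lane, from the feature list
ENCODING = 128 / 130   # 128b/130b encoding efficiency (assumed)


def link_bandwidth_gbytes(lanes: int, gt_per_s: float = GT_PER_S) -> float:
    """Approximate usable bandwidth of one link in GB/s, per direction."""
    return gt_per_s * lanes * ENCODING / 8  # 8 bits per byte


# The three link configurations listed above: (number of links, lanes per link)
MODES = {"1x16": (1, 16), "2x8": (2, 8), "4x4": (4, 4)}

for name, (links, lanes) in MODES.items():
    per_link = link_bandwidth_gbytes(lanes)
    total = links * per_link
    print(f"{name}: {per_link:.1f} GB/s per link, {total:.1f} GB/s in total")
```

Under these assumptions a x16 link works out to roughly 63 GB/s in each direction, and splitting the 16 lanes into 2x8 or 4x4 divides that same total across more links.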

R-Tile CXL features include the CXL.io, CXL.cache, and CXL.mem protocols. Additionally, Intel’s 4th Gen Xeon processors have been CXL certified. Some technical features:

  • Up to 32 GT/s per lane
  • 1x16 Endpoint modes

 

CXL and Transparent Page Placement

The addition of a CXL interface enables the Agilex 7 to leverage Transparent Page Placement (TPP). In a paper from the University of Michigan and Meta, the authors describe the benefit of TPP in a tiered-memory system, in which memory pages are placed in the appropriate memory tier depending on how hot or cold they are (that is, how often their data is accessed).

The authors claim that TPP improves Linux performance by up to 18% and outperforms other state-of-the-art solutions by 10-17%.
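The idea of sorting pages into tiers by temperature can be illustrated with a toy model. The sketch below is not the TPP algorithm from the paper: it simply counts accesses in a trace and keeps the most frequently used pages in a small fast tier, leaving the rest in a slower tier such as CXL-attached memory. The function name, the threshold, and the capacity are all hypothetical values chosen for the example.

```python
# Toy model of hot/cold page placement across two memory tiers.
# NOT the TPP algorithm: names, threshold, and capacity are hypothetical.
from collections import Counter


def place_pages(accesses, fast_capacity, hot_threshold=2):
    """Return (fast_tier, slow_tier) as sets of page IDs.

    accesses: iterable of page IDs, one entry per memory access.
    fast_capacity: number of pages that fit in the fast (local DRAM) tier.
    hot_threshold: minimum access count for a page to count as hot.
    """
    counts = Counter(accesses)
    # Hottest pages first; ties are broken by page ID for determinism.
    ranked = sorted(counts, key=lambda page: (-counts[page], page))
    hot = [page for page in ranked if counts[page] >= hot_threshold]
    fast = set(hot[:fast_capacity])
    slow = set(counts) - fast  # cold pages, plus hot pages that did not fit
    return fast, slow


trace = [1, 2, 1, 3, 1, 2, 4, 5, 2, 1]
fast, slow = place_pages(trace, fast_capacity=2)
print("fast tier:", sorted(fast))
print("slow tier (e.g. CXL-attached):", sorted(slow))
```

In this trace, pages 1 and 2 are accessed most often and land in the fast tier, while pages 3, 4, and 5 stay in the slow tier. A real kernel mechanism works with sampled access information and migrates pages over time instead of ranking a complete trace up front.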

Additionally, Israeli smart memory node company UnifabriX has been leveraging CXL in its smart memory nodes, claiming a 28% increase in High-Performance Conjugate Gradient (HPCG) benchmark scores for high-performance computing workloads.