Xilinx is most famous for its FPGAs, and with good reason: the company has an undisputed claim to having invented them.
FPGAs are associated with applications that place high demands on memory. It comes as little surprise, then, that they've received more attention in recent years for their potential to displace traditional CPUs, GPUs, and ASICs.
With an eye on data centers, AI, machine learning, and the multitude of tangential industries, Xilinx has released a new accelerator card using FPGA logic, the Alveo U50.
Alveo U50 Accelerator Card
Introduced last week at the annual Flash Memory Summit, the U50 draws 75 W and is the first member of the Alveo family to be packaged in a half-height, half-length form factor, fitting into a standard PCIe server slot.
The Alveo U50 data center accelerator card. Image from Xilinx
The Alveo U50 is the industry's first adaptable accelerator card with PCIe 4.0 support. The unit is a reconfigurable platform that can “supercharge” a wide range of critical compute, network and storage workloads.
By moving computation closer to the data, this programmable accelerator platform can improve throughput by an order of magnitude or more. It also helps developers identify and then eliminate latency and data-movement bottlenecks.
The Alveo U50 features 8 GB of HBM2 (high-bandwidth memory) and can deliver over 400 Gbps data transfer speeds, while QSFP ports can provide up to 100 Gbps network connectivity. This high-speed networking I/O makes it possible for the unit to support advanced applications such as NVMe-oF (NVM Express over Fabrics).
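To put those headline figures in perspective, here is a back-of-envelope sketch (author's arithmetic on the quoted specs, not a vendor benchmark) of what 400 Gbps of transfer and 100 Gbps of network I/O mean in bytes per second:

```python
# Figures quoted in the article (assumptions for this sketch, not measurements):
HBM2_BYTES = 8 * 10**9      # 8 GB of on-card HBM2
TRANSFER_GBPS = 400         # quoted aggregate data transfer rate, in gigabits/s
QSFP_GBPS = 100             # per-port network connectivity, in gigabits/s

# Convert bits to bytes and see how long one full pass over HBM2 would take.
transfer_bytes_per_s = TRANSFER_GBPS * 10**9 / 8
seconds_to_stream_hbm = HBM2_BYTES / transfer_bytes_per_s

print(f"{transfer_bytes_per_s / 1e9:.0f} GB/s aggregate transfer")
print(f"~{seconds_to_stream_hbm:.2f} s to stream the full 8 GB once")
print(f"One QSFP port moves up to {QSFP_GBPS / 8:.1f} GB/s at line rate")
```

In other words, the quoted transfer rate works out to roughly 50 GB/s, enough to stream the entire HBM2 contents in well under a second.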
The Alveo U50’s hardware and software programmability lets customers keep pace as algorithms and workloads evolve. The device’s flexibility, low latency, and high throughput allow it to meet tough challenges.
Block diagram for the Alveo U50. Image from Xilinx
This flexibility can be useful in several major applications:
- Machine learning inference
- Financial risk modeling
- Data analytics
- Computational storage
- Electronic trading
- Video transcoding
These are just a few of the applications where FPGAs are sometimes noted as having a strategic advantage over CPUs and even some ASICs.
The U50 makes possible some impressive improvements across a range of current applications. The conditions under which they apply are fully described in the product sheet.
- Speech Translation. Greatly improved latency and throughput for deep learning inference acceleration, with better power efficiency per node than GPU-only implementations.
- Database Inquiry. Data analytics acceleration, as measured by the TPC-H Query benchmark, is greatly improved.
- Computational Storage Acceleration (i.e., compression). Compression/decompression throughput is improved by a factor of 20.
- Network Acceleration. For electronic trading, the card offers 20x lower latency and sub-500 ns trading times.
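The 20x compression figure is measured against a software baseline under Xilinx's own test conditions. As a hedged illustration of what such a CPU-side baseline measurement might look like (author's sketch using Python's standard zlib module, not the product sheet's methodology), one could time software compression like this:

```python
import time
import zlib

# Hypothetical, highly compressible workload (~1.6 MB of repetitive text).
# The product sheet's 20x claim uses Xilinx's benchmark data, not this payload.
payload = (b"data center acceleration " * 4096) * 16

start = time.perf_counter()
compressed = zlib.compress(payload, level=6)
elapsed = time.perf_counter() - start

throughput_mb_s = len(payload) / elapsed / 1e6
ratio = len(payload) / len(compressed)
print(f"CPU compressed {len(payload)} bytes at "
      f"{throughput_mb_s:.0f} MB/s, ratio {ratio:.1f}:1")
```

Swapping the `zlib.compress` call for an offload to an accelerator card is exactly the scenario the computational-storage claim targets.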
"Ever-growing demands on the data center are pushing existing infrastructure to its limit, driving the need for adaptable solutions that can optimize performance across a broad range of workloads and extend the lifecycle of existing infrastructure, ultimately reducing TCO," said Salil Raje, executive vice president and general manager, Data Center Group, at Xilinx. "The new Alveo U50 brings an optimized form factor and unprecedented performance and adaptability to data center workloads, and we continue to build out solution stacks with a growing ecosystem of application partners to deliver previously unthinkable capabilities to a range of industries."
The U50 has elicited a lot of interest from some major industry participants, including AMD, IBM, and Western Digital. The most frequently cited reasons are the unit's support for PCIe 4.0 and its NVMe-oF capabilities.
Do you work with applications that could benefit from the Alveo U50's advancements? Tell us about them in the comments below.