Generative AI Meets Its Computing Match in Nvidia’s New GPU

The company’s new GPU builds on Nvidia RTX technology for improvements in AI, graphics, and compute.

News February 13, 2024 by Jake Hertz

Generative AI requires computing resources that can handle heavy workloads powerfully, efficiently, and cost-effectively. While accelerators and dedicated hardware have their place in such tasks, the GPU still reigns supreme as the go-to resource for all things AI.

The Nvidia RTX 2000 Ada Generation GPU.

This week, Nvidia announced the new Nvidia RTX 2000 Ada Generation GPU, designed specifically for generative AI workloads. In this piece, we’ll examine the new GPU’s architecture and performance to see how Nvidia is addressing the growing challenges of generative AI computing.

The Specs Behind Nvidia RTX 2000 Ada Generation

The Nvidia RTX 2000 Ada Generation GPU is designed for professional users who need a compact system on a budget. Part of Nvidia's expansion into the professional market, it is a small form factor (SFF), dual-slot card that measures 6.6 inches long and uses a blower-type cooler. It ships with both a standard ATX bracket and a low-profile bracket, so it fits standard and SFF systems alike.

An Ada streaming multiprocessor.

At a lower level, the device incorporates 2,816 CUDA cores and is derived from the AD107 silicon. The full AD107 die contains 24 streaming multiprocessors (SMs), equivalent to 3,072 CUDA cores, but only 22 SMs are enabled in this model. In CUDA core count, this configuration places the RTX 2000 Ada between the GeForce RTX 4050 Mobile, with 2,560 CUDA cores, and the GeForce RTX 4060, with 3,072 CUDA cores.
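The core counts above follow from simple per-SM arithmetic. The Python sketch below reproduces it using only the figures quoted in this article; the function names are hypothetical, and the SM counts it derives for the two GeForce parts assume they use the same number of CUDA cores per SM.

```python
# Figures quoted in the article for the full AD107 die.
FULL_DIE_SMS = 24
FULL_DIE_CUDA_CORES = 3072

# Derived value: CUDA cores per SM on the full die.
CORES_PER_SM = FULL_DIE_CUDA_CORES // FULL_DIE_SMS


def cuda_cores_for(enabled_sms: int) -> int:
    """Return the CUDA core count for a given number of enabled SMs."""
    if not 0 <= enabled_sms <= FULL_DIE_SMS:
        raise ValueError("enabled_sms must be between 0 and the full die's SM count")
    return enabled_sms * CORES_PER_SM


def sms_for(cuda_cores: int) -> int:
    """Return the SM count implied by a core count (assumes the same cores per SM)."""
    sms, remainder = divmod(cuda_cores, CORES_PER_SM)
    if remainder:
        raise ValueError("core count is not a whole number of SMs")
    return sms


if __name__ == "__main__":
    print(f"CUDA cores per SM: {CORES_PER_SM}")
    print(f"RTX 2000 Ada (22 SMs): {cuda_cores_for(22)} CUDA cores")
    print(f"GeForce RTX 4060 (3,072 cores): {sms_for(3072)} SMs")
    print(f"GeForce RTX 4050 Mobile (2,560 cores): {sms_for(2560)} SMs")
```

With 128 CUDA cores per SM, 22 enabled SMs give the 2,816 cores quoted for the RTX 2000 Ada.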

It features 88 fourth-generation Tensor cores (four per enabled SM) and 22 third-generation RT cores. Single-precision performance is rated at 12.0 TFLOPS, RT core performance at 27.7 TFLOPS, and Tensor performance at 191.9 TFLOPS. The GPU is equipped with 16 GB of GDDR6 memory with ECC on a 128-bit memory interface that offers a bandwidth of 224 GB/s. Maximum power consumption is a conservative 70 W, low enough to be supplied by the PCIe slot alone, so no external power connectors are needed. Display connectivity is provided by four Mini DisplayPort 1.4a outputs.
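Two of these figures can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is illustrative only: it assumes peak bandwidth equals bus width times per-pin data rate, and that peak FP32 throughput counts one fused multiply-add (two floating-point operations) per CUDA core per clock. The function names are hypothetical, and because the quoted ratings are rounded, the derived values are approximate.

```python
# Figures quoted in the article.
BUS_WIDTH_BITS = 128
BANDWIDTH_GB_S = 224.0
FP32_TFLOPS = 12.0
CUDA_CORES = 2816

# Assumption: one fused multiply-add per core per clock, counted as 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2


def per_pin_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by peak bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


def implied_clock_ghz(tflops: float, cores: int, flops_per_clock: int) -> float:
    """Clock speed (GHz) implied by a peak TFLOPS rating."""
    return tflops * 1e12 / (cores * flops_per_clock) / 1e9


if __name__ == "__main__":
    rate = per_pin_data_rate_gbps(BANDWIDTH_GB_S, BUS_WIDTH_BITS)
    clock = implied_clock_ghz(FP32_TFLOPS, CUDA_CORES, FLOPS_PER_CORE_PER_CLOCK)
    print(f"Implied memory data rate: {rate:.1f} Gbit/s per pin")
    print(f"Implied clock for the FP32 rating: {clock:.2f} GHz")
```

Under these assumptions, the memory works out to 14 Gbit/s per pin, and the 12.0 TFLOPS rating corresponds to a clock of roughly 2.13 GHz.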

Performance Improvements Over Previous Generations

The RTX 2000 Ada Generation (datasheet linked) is built on Nvidia's Ada Lovelace architecture, enabling it to outperform the RTX A2000 12GB by 50% in single-precision tasks. Moreover, the introduction of third-generation RT cores and fourth-generation Tensor cores has notably improved RT and Tensor performance, delivering more than a threefold increase on paper.
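To put those ratios in absolute terms, the sketch below works backward from the new card's ratings to the figures they imply for the RTX A2000 12GB. It uses only numbers quoted in this article; the helper name is hypothetical, and because "more than threefold" is a lower bound on the uplift, the RT and Tensor results are upper bounds on the older card's ratings rather than exact values.

```python
# RTX 2000 Ada ratings quoted in the article (TFLOPS).
NEW_FP32 = 12.0
NEW_RT = 27.7
NEW_TENSOR = 191.9


def implied_baseline(new_value: float, uplift: float) -> float:
    """Return the older figure implied by a new figure and an uplift factor."""
    if uplift <= 0:
        raise ValueError("uplift must be positive")
    return new_value / uplift


if __name__ == "__main__":
    # A 50% margin means the new card is 1.5x the old one.
    print(f"Implied A2000 FP32: {implied_baseline(NEW_FP32, 1.5):.1f} TFLOPS")
    # "More than threefold" means the old figures are below new / 3.
    print(f"Implied A2000 RT: under {implied_baseline(NEW_RT, 3.0):.1f} TFLOPS")
    print(f"Implied A2000 Tensor: under {implied_baseline(NEW_TENSOR, 3.0):.1f} TFLOPS")
```

The single-precision figure works out to 8.0 TFLOPS for the older card.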

Nvidia RTX 2000 Ada Generation performance uplifts over the Nvidia RTX A2000 12GB.

Nvidia's own figures show performance uplifts of 1.3X to 1.6X over the RTX A2000 12GB, with the largest gains in generative AI workloads. In comparative benchmarks, the new GPU delivers twice the performance of the older Quadro P2200 in SolidWorks SPECviewperf 2020 and up to four times the performance in SolidWorks Visualize benchmarks.

Democratizing Generative AI Compute

While the Nvidia RTX 2000 Ada Generation GPU is not the company’s most performant GPU, it offers a competitive price point and a slew of performance improvements over previous offerings. Its real value, however, lies in its potential to democratize access to high-quality, efficient computing for a broader range of users and applications. From small-scale startups to large research institutions, the RTX 2000 Ada lets more organizations leverage the power of generative AI without prohibitive costs or logistical constraints.

All images used courtesy of Nvidia.