Over a Year Into HBM3, Micron Claims New Win for the Memory Standard

July 31, 2023 by Aaron Carman

Micron aims to strike a balance of high bandwidth, power efficiency, and speed with its latest generation of high-bandwidth memory (HBM3) technology.

Micron Technology recently announced that it is sampling what it calls the “industry’s first” high-bandwidth memory 3 (HBM3) Gen2 memory die. As generative AI models become more prevalent, designers must contend with the cost, performance bottlenecks, and power draw of the memory that feeds them. HBM3 Gen2 may ease some of these growing pains thanks to its advantages over legacy memory solutions.

 

Micron’s HBM3 Gen2 memory die may improve density and bandwidth while lowering the power requirements and increasing the efficiency of the compute module. Image used courtesy of Micron Technology
 

The latest HBM3 memory is built on Micron’s 1β (1-beta) DRAM process node, allowing 24 Gb memory dies to be assembled into 8-high or 12-high 3D stacks. This vertical stacking improves not only memory density but also bandwidth and power performance. This article looks at these advantages and at Micron's latest memory offering to give readers a sense of how the HBM3 standard may enable new and powerful memory advancements.
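
The stack capacities implied by those figures follow from simple arithmetic. Here is a minimal sketch, assuming a stack's capacity is simply die density times die count (the function name is illustrative and not from Micron):

```python
def stack_capacity_gbytes(die_density_gbit: float, dies_per_stack: int) -> float:
    """Capacity of one HBM stack in gigabytes (8 bits per byte)."""
    return die_density_gbit * dies_per_stack / 8


# 24 Gb dies in the 8-high and 12-high configurations named in the article.
for height in (8, 12):
    print(f"{height}-high: {stack_capacity_gbytes(24, height):.0f} GB")
```

This gives 24 GB for the 8-high stack and 36 GB for the 12-high stack.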

 

HBM3 Standard Rises to the Call for Fast, Efficient Memory

Since its introduction, the von Neumann architecture has imposed a fundamental limit on designers: because the processor and memory are separate, the link between them caps system performance. This limit is especially acute for AI models that require large amounts of memory, where memory access bottlenecks the overall system.

In response to this limitation, HBM uses 3D integration to couple compute and memory dies more tightly, providing more bandwidth and better power efficiency. HBM3 Gen1, the latest standard approved by JEDEC, builds upon the HBM2 and HBM2E standards to support more bandwidth and better power efficiency in future stacked memory dies.

 

HBM3 memory uses vertically stacked dies to improve memory density and bandwidth performance. Image used courtesy of Synopsys
 

Although memory standards focus heavily on speed and bandwidth, power consumption is a major source of inefficiency in large deployments. As such, the HBM3 standard not only increases the per-pin data rate from 3.6 to 6.4 Gbps and the maximum capacity from 16 to 64 GB but also considerably reduces the core voltage and power consumption of the memory devices.
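
Reading the 3.6 and 6.4 Gbps figures as per-pin data rates, whole-stack bandwidth follows from multiplying by the interface width. The sketch below assumes the 1024-bit-wide stack interface that HBM generations have used; that width is not stated in this article, so treat it as an assumption.

```python
HBM_INTERFACE_WIDTH_BITS = 1024  # assumption: not stated in the article


def stack_bandwidth_gbytes_per_s(pin_rate_gbps: float,
                                 width_bits: int = HBM_INTERFACE_WIDTH_BITS) -> float:
    """Peak per-stack bandwidth in GB/s from a per-pin data rate in Gbps."""
    return pin_rate_gbps * width_bits / 8


for label, rate in (("prior generation", 3.6), ("HBM3", 6.4)):
    print(f"{label}: {stack_bandwidth_gbytes_per_s(rate):.1f} GB/s per stack")
```

Under that assumption, the step from 3.6 to 6.4 Gbps per pin raises peak stack bandwidth from about 461 GB/s to about 819 GB/s.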

 

Micron's HBM3 Gen2: Improved Performance Per Watt

Although JEDEC has not published an HBM3 Gen2 standard, companies such as Micron are developing the next generation of HBM devices to meet the memory needs of emerging applications. The HBM3 Gen2 die builds on Micron's HBM2E portfolio, and its reported specs make it an attractive option for bandwidth-heavy applications.

 

HBM3 memory allows memory dies to be placed very close to the compute module, lowering the latency between processor and memory and improving performance in memory-heavy applications. Image used courtesy of Synopsys
 

In terms of pure bandwidth, the HBM3 Gen2 die supports over 1.2 TB/s of memory bandwidth to feed AI compute cores, along with 50% more capacity per 8-high stack. According to Micron, the higher performance and lower power consumption of HBM3 Gen2 yield a 2.5x improvement in performance per watt, which lowers the total operating cost of the system.

Micron will begin sampling its 36 GB 12-high stack in Q1 2024. This trend toward higher-bandwidth memory using advanced packaging and integration techniques bodes well for the future of complex AI models.
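
As a rough sanity check, the sketch below works backward from the quoted figures. It assumes a 1024-bit stack interface and decimal units (1 TB/s = 1000 GB/s), neither of which is stated in the article, and it treats the 2.5x figure as applying to a fixed workload.

```python
def implied_pin_rate_gbps(bandwidth_gbytes_per_s: float,
                          width_bits: int = 1024) -> float:
    """Per-pin data rate (Gbps) needed to reach a given stack bandwidth.

    width_bits is an assumed interface width, not a figure from the article.
    """
    return bandwidth_gbytes_per_s * 8 / width_bits


def relative_energy(perf_per_watt_gain: float) -> float:
    """Energy for the same work, as a fraction of the baseline."""
    return 1 / perf_per_watt_gain


print(f"Implied pin rate for 1.2 TB/s: {implied_pin_rate_gbps(1200):.3f} Gbps")
print(f"Energy for the same work at 2.5x perf/W: {relative_energy(2.5):.0%}")
```

Under these assumptions, 1.2 TB/s corresponds to roughly 9.4 Gbps per pin, and a 2.5x gain in performance per watt means the same work takes about 40% of the energy.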

 

Memory With More Dimensions

While not all designers will work directly with stacked memory dies, engineers in all fields will feel the effects of improved memory bandwidth in HPC and AI applications. With faster and more power-efficient memory, AI models and HPC clusters can deliver more at a lower total cost.

As the official HBM3 standard evolves, chipmakers like Micron may respond to limitations discovered in the standard. An upper limit on stack size, for example, could cap the overall capacity and bandwidth of a 3D memory die built with today’s techniques.