How Does Non-Volatile Memory Express Reach Such High Data Transfer Rates?
August 11, 2020 by Steve Arar
Using PCIe protocols, NVMe addresses the issue of data rate bottlenecks—proving to be even faster than SAS and SATA protocols purpose-built for hard disk drives.
Non-volatile memory express (NVMe) is a scalable interface protocol developed to take full advantage of the parallelism offered by NVM technologies like Flash. With today’s fast solid-state drives (SSDs), the performance bottleneck is the data rate of the host interface rather than the bandwidth of the storage device.
NVMe was developed to address this issue. It runs over the PCI Express (PCIe) bus and is designed to be faster than the SAS and SATA protocols, which were originally designed with hard disk drive (HDD) characteristics in mind.
In this article, we’ll briefly discuss some of the basic concepts that allow NVMe to achieve a high data transfer rate. Then, we’ll briefly look at Microchip’s Flashtec NVMe 3108—a recently released 8-channel PCIe Gen 4 enterprise NVMe SSD controller—as an example.
A Fundamental Difference Between SSDs and HDDs
With an HDD, reading or writing a particular sector of the disk requires moving the head assembly to the right track on the spinning platter (a delay often referred to as seek time) and then waiting for the sector to rotate under the head. Therefore, HDDs service requests one after another and relatively slowly.
The SAS protocol, which is a mainstream choice for many storage buyers, was originally designed with HDD characteristics in mind. In contrast, Flash storage systems employ different levels of parallelism to achieve higher read/write throughputs.
A typical SSD architecture is shown below.
Example of a typical SSD architecture. Image used courtesy of Tyler Thierolf and Justin Uriarte
In this case, several different NAND memory devices are organized as a channel. Data pipelining is usually employed to operate different memory devices of a channel in parallel and achieve higher throughput.
For example, while a write operation is being performed in the first memory device of a channel, the address information of an upcoming write operation can be transferred to a second memory device. To achieve this parallel operation, the Flash controller should distribute the information it receives from an external interface between different channels.
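The benefit of pipelining within a channel can be illustrated with a rough timing model. The sketch below assumes that the channel bus carries one transfer at a time, while each memory device programs its page independently once the transfer is done. The function name and the 50 µs and 200 µs timings are illustrative assumptions, not figures from any datasheet.

```python
def channel_write_time(num_writes, num_dies, t_transfer_us, t_program_us):
    """Time (in microseconds) to finish num_writes page writes on one channel.

    Toy model: the shared channel bus is used serially for transfers,
    and each die programs its own page in parallel with the others.
    """
    bus_free_at = 0.0                  # when the shared bus is next idle
    die_free_at = [0.0] * num_dies     # when each die finishes programming
    finish = 0.0
    for i in range(num_writes):
        die = i % num_dies             # round-robin across the dies
        start = max(bus_free_at, die_free_at[die])
        bus_free_at = start + t_transfer_us
        die_free_at[die] = bus_free_at + t_program_us
        finish = max(finish, die_free_at[die])
    return finish


# Assumed timings: 50 us to transfer a page, 200 us to program it.
print(channel_write_time(4, 1, 50, 200))  # one die: no overlap possible
print(channel_write_time(4, 4, 50, 200))  # four dies: programming overlaps
```

Under these assumed timings, four writes take 1000 µs on a single device but only 400 µs when spread over four devices, because the program operations overlap while the bus moves on to the next transfer.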
The above structure enables an SSD to access memory cells in parallel and with very low latency, since no mechanical movement is involved. For today’s SSDs, with their inherent massive parallelism and minimal latency, we need a new optimized interface protocol.
This new protocol, which is designed from the ground up for SSD architecture, is NVMe. NVMe utilizes the PCIe bus along with command queuing to achieve unparalleled scalable performance. Below, we’ll take a look at these two features.
PCIe is a general-purpose bus interface used in both client and enterprise compute applications. With a PCIe-based SSD, we can connect the storage device directly to the backplane of a server.
However, with SATA and SAS protocols, a storage controller block is required between the SSD and the PCIe port of the CPU. As a result, a PCIe-based solution brings the SSD closer to the CPU and eliminates the latency associated with the storage controller.
A PCIe-based solution. Image (modified) used courtesy of SATA-IO
In addition, PCIe bandwidth is scalable and can serve the needs of multi-die Flash storage devices in the future. For example, the third generation of PCIe supports links as wide as 16 lanes, with each lane providing a maximum data throughput of roughly 1 GB/s in each direction (8 GT/s with 128b/130b encoding).
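The per-lane and per-link figures for PCIe Gen 3 can be checked with a short calculation. The sketch below is back-of-the-envelope arithmetic only: it assumes the Gen 3 raw rate of 8 GT/s per lane and 128b/130b line encoding, and it ignores packet and protocol overhead, so real drives see somewhat less. The function name `pcie_bandwidth_gbytes` is made up for this illustration.

```python
def pcie_bandwidth_gbytes(lanes, gigatransfers_per_s=8.0,
                          payload_bits=128, encoded_bits=130):
    """Approximate one-direction link bandwidth in GB/s.

    Defaults describe PCIe Gen 3: 8 GT/s per lane, 128b/130b encoding.
    Packet and protocol overhead are ignored.
    """
    # Usable bits per second on one lane after line encoding.
    bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / encoded_bits
    # Convert bits to bytes, scale by lane count, express in GB/s.
    return lanes * bits_per_s / 8 / 1e9


print(f"x1:  {pcie_bandwidth_gbytes(1):.2f} GB/s")
print(f"x16: {pcie_bandwidth_gbytes(16):.2f} GB/s")
```

With these assumptions, a single Gen 3 lane comes out at about 0.98 GB/s and a 16-lane link at about 15.75 GB/s in each direction.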
NVMe incorporates command queuing that allows the processor to queue up the issued commands and perform them in an efficient way. This enables the processor to fully utilize the parallelism offered by the SSD architecture.
When a command attempts to access a busy NAND die, the host can simply queue this command and continue with the next command in the queue. The queuing technique is employed in SATA and SAS protocols as well; however, these protocols support a single queue of 32 and 256 commands, respectively.
NVMe can have up to about 64,000 queues (65,535 I/O queues in the specification), with each queue supporting as many as about 64,000 commands (65,536).
Diagram of NVMe queues. Image used courtesy of NVM Express Org and Seagate
This, along with the high data transfer rate of PCIe, enables NVMe to take full advantage of the parallelism offered by an SSD.
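The effect of queue depth on parallelism can be illustrated with a toy model. The sketch below is not how an NVMe controller actually schedules work. It assumes that every command takes one time step, that each die can run one command per step, and that only the first `queue_depth` pending commands are visible for dispatch. The function name `steps_to_drain` and the command pattern are hypothetical.

```python
def steps_to_drain(target_dies, queue_depth):
    """Count time steps needed to complete all commands.

    target_dies: list of die numbers, one per command, in arrival order.
    queue_depth: how many pending commands can be considered per step.
    """
    pending = list(target_dies)
    steps = 0
    while pending:
        busy = set()       # dies already given a command in this step
        deferred = []      # visible commands whose die was busy
        for die in pending[:queue_depth]:
            if die in busy:
                deferred.append(die)   # die busy: keep the command queued
            else:
                busy.add(die)          # die idle: start the command
        # Deferred commands stay ahead of commands not yet visible.
        pending = deferred + pending[queue_depth:]
        steps += 1
    return steps


commands = [0, 0, 1, 1, 2, 2, 3, 3]    # assumed pattern: two commands per die
for depth in (1, 2, 8):
    print(depth, steps_to_drain(commands, depth))
```

In this model, a depth of 1 needs eight steps because every command waits for the one ahead of it, while a depth of 8 finishes in two steps because all four dies stay busy. A deeper queue gives the scheduler more commands to choose from when a die is occupied.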
Microchip’s Flashtec NVMe 3108
The Flashtec NVMe 3108 controller exemplifies many of the principles discussed above. The device is designed to provide cloud-scale infrastructure with the storage bandwidth and density required by artificial intelligence (AI) and machine learning (ML) workloads.
Diagram of Microchip's Flashtec NVMe 3108 controller. Image used courtesy of Microchip
It is an 8-channel PCIe Gen 4 enterprise NVMe SSD controller. The new device offers more than one million I/O operations per second (IOPS) for random workloads and more than 6 gigabytes per second (GB/s) of sequential bandwidth.
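To relate the IOPS figure to a data rate, it helps to multiply by a transfer size. The sketch below assumes 4 KiB (4096-byte) random transfers, a common benchmark size that is not stated in the announcement, so the result is only indicative. The function name is hypothetical.

```python
def iops_to_gbytes_per_s(iops, block_bytes=4096):
    """Convert an IOPS figure to GB/s for a given transfer size.

    block_bytes defaults to 4096 (4 KiB), an assumed benchmark size.
    """
    return iops * block_bytes / 1e9


# One million random I/Os per second at the assumed 4 KiB transfer size.
print(f"{iops_to_gbytes_per_s(1_000_000):.3f} GB/s")
```

Under that assumption, one million IOPS corresponds to roughly 4.1 GB/s of random traffic, which is the same order of magnitude as the quoted sequential bandwidth.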
Microchip says the Flashtec NVMe 3108 controller incorporates end-to-end enterprise-class data integrity and critical security features, such as a hardware root of trust, to increase system reliability.
Featured image (modified) used courtesy of Microchip