Startup Claims Its AI Chips Can Outperform Google and Intel at Edge Computing

In this article, we'll assess how Kneron's NPU sizes up to Google and Intel's comparable chips.

News September 08, 2020 by Jake Hertz

With edge computing gaining popularity, many big companies have worked to develop AI workload-specific chips. Notably, big names such as Google and Intel claim solid footing in the market with Coral’s Edge TPU and the Movidius Myriad X VPU, respectively.

Edge computing visualized. Image used courtesy of VectorMine

Now, a new player has entered the field and is claiming to rival some of the big names in the industry. Kneron, a California-based startup, has recently released its KL720 AI SoC, which it says outperforms anything else on the market in terms of speed, power efficiency, and cost. To assess the validity of this claim, we set out to compare the new chip with Google and Intel's comparable devices.

Google Coral's Edge TPU

Coral is Google’s lesser-known initiative for the development of edge computing platforms. In the company's own words, “Coral is a hardware and software platform for building intelligent devices with fast neural network inferencing.”

At the heart of Coral’s devices is the Edge TPU (Tensor Processing Unit) coprocessor. This ASIC was designed explicitly to run state-of-the-art neural networks at high speed and low power, and the specs seem to back this up.

Edge TPU chips measure 5 mm x 5 mm. Image used courtesy of Coral

The Edge TPU delivers up to 4 TOPS while drawing 2 W, for a power efficiency of 2 TOPS per watt. In terms of functionality, the Edge TPU is capable of executing deep feed-forward neural networks (DFF) such as convolutional neural networks (CNN), making it useful for a variety of on-device vision-based machine learning applications.

Where this chip falters is accessibility. Google does not sell the Edge TPU to designers as a bare chip; instead, it must be integrated via Coral’s Accelerator Module, a surface-mounted module (10 mm x 15 mm) that includes the Edge TPU and all required power management, with PCIe Gen 2 and USB 2.0 interfaces. So while the module eases integration, it denies designers the ability to use the Edge TPU as a standalone device in their own designs.

Intel’s Movidius Myriad X VPU

From Intel’s camp, we’ll look at the Movidius Myriad X Vision Processing Unit (VPU).

According to Intel, the VPU works by coupling highly parallel programmable compute with workload-specific hardware acceleration in a unique architecture that minimizes data movement. In this way, Intel says, the chip achieves a balance of power efficiency and compute performance, enabling devices with deep neural network and computer vision-based applications.

Movidius Myriad X. Image used courtesy of Intel

Intel says this chip can reach up to 4 TOPS overall and offers 1 TOPS when running deep neural network inference. With a TDP of 1.5 W, that works out to roughly 2.67 TOPS per watt overall and 0.67 TOPS per watt for DNN inference.
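The efficiency figures follow from dividing throughput by power draw. A minimal sketch of that arithmetic, using only the numbers quoted in this article (the function name is ours, and treating TDP as the power draw is a simplifying assumption):

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Return throughput per unit of power, in TOPS/W."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tops / watts


# Figures quoted in the article for the Myriad X; TDP stands in for power draw.
MYRIAD_X_TDP_W = 1.5

general = tops_per_watt(4.0, MYRIAD_X_TDP_W)  # overall peak throughput
dnn = tops_per_watt(1.0, MYRIAD_X_TDP_W)      # DNN inference throughput

print(f"general: {general:.2f} TOPS/W")
print(f"DNN inference: {dnn:.2f} TOPS/W")
```

The same function applied to the Edge TPU's quoted 4 TOPS at 2 W gives 2 TOPS per watt.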

Kneron’s KL720 NPU

Finally, the new competitor is Kneron’s KL720 AI SoC.

At the heart of this chip is Kneron's neural processing unit (NPU), which the company says was designed specifically for edge devices to provide high computing performance at low power consumption in a small area.

Kneron’s KL720. Image used courtesy of Kneron

The KL720 comes in at 0.9 TOPS per watt and can reach up to 1.5 TOPS max performance. The chip can also process 4K still images and 1080p video, and provides 3D sensing for facial recognition.

It also offers new audio recognition tools for natural language processing applications.

VPU vs. TPU vs. NPU

Comparing these three AI chips, the differences show up in raw speed, power efficiency, and availability.

Besides going by different names (TPU, VPU, and NPU), these processors also differ in performance. While Intel’s chip can reach 4 TOPS, it drops to 1 TOPS when running DNN inference. This makes Google's Edge TPU the fastest of the three, with four times the inference TOPS of Intel's chip. Kneron’s chip also edges out Intel’s during inference, offering 1.5 TOPS, a 50% increase.

In terms of power efficiency, Google wins as well. The Edge TPU offers 2 TOPS per watt, compared to 0.9 for the KL720 and 0.67 for Intel’s chip during DNN inference.
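To make the comparison concrete, here is a small sketch that ranks the three chips by the figures quoted in this article. The `Chip` class and `rank` helper are illustrative names of ours, and we assume, as the article does, that the KL720's 1.5 TOPS peak applies to inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    inference_tops: float  # throughput during DNN inference
    tops_per_watt: float   # efficiency during DNN inference


# All numbers come from the vendor figures quoted in the article.
CHIPS = [
    Chip("Google Edge TPU", 4.0, 4.0 / 2.0),
    Chip("Intel Movidius Myriad X", 1.0, 1.0 / 1.5),
    Chip("Kneron KL720", 1.5, 0.9),
]


def rank(chips, key):
    """Return the chips sorted from best to worst by the given metric."""
    return sorted(chips, key=key, reverse=True)


by_speed = rank(CHIPS, lambda c: c.inference_tops)
by_efficiency = rank(CHIPS, lambda c: c.tops_per_watt)

# Express each chip's inference speed relative to Intel's.
baseline = next(c for c in CHIPS if "Intel" in c.name)
for chip in by_speed:
    ratio = chip.inference_tops / baseline.inference_tops
    print(f"{chip.name}: {chip.inference_tops} TOPS "
          f"({ratio:.1f}x Intel), {chip.tops_per_watt:.2f} TOPS/W")
```

On these numbers, both rankings come out the same: Edge TPU first, KL720 second, Myriad X third.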

Where Google fails, however, is accessibility. Since the Edge TPU cannot be purchased as a standalone device, a designer cannot incorporate the TPU alone into their own design. In this sense, Kneron’s claim of being a top competitor in the market seems valid, offering better power efficiency and inference speed than other standalone chips, namely Intel’s Movidius Myriad X.

All things considered, all three are very impressive devices that will help usher in the future of edge computing.