Hailo Goes Head to Head With Intel and Google on AI Acceleration Modules
October 09, 2020 by Antonio Anzaldua Jr.
Hailo, an Israel-based startup, has benchmarked its AI acceleration modules against their Intel and Google counterparts. What's the verdict?
Hailo, an Israel-based AI chipmaker, recently challenged its big-name competitors Intel and Google by claiming its new AI acceleration modules for edge devices can analyze and process deep neural networks faster and more efficiently than Intel and Google's comparable chips.
The new M.2 and Mini PCIe AI acceleration modules integrate the Hailo-8 processor and can run inference in real time on any edge device with a matching socket.
Hailo’s M.2 AI acceleration module is built around the Hailo-8 processor, which Hailo rates at 26 TOPS with a power efficiency of 3 TOPS per watt. Image used courtesy of Hailo
What does AI acceleration mean to data scientists and hardware engineers? And are Hailo’s AI accelerator modules really the best option, even compared to Intel's Movidius modules and Google's Edge TPU modules?
The Purpose of AI Acceleration
AI acceleration refers to hardware that speeds up machine learning workloads. Analyzing complex unstructured data such as voice, acoustics, and images is costly and time-consuming for developers working with deep neural networks.
While deep neural networks are essential for analyzing unstructured data, these networks can have anywhere from 50 to 150 layers, with each layer requiring billions of calculations. This is where AI accelerators come into play. These modules allow data to be processed and analyzed at faster rates.
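To give a feel for the scale involved, here is a rough back-of-the-envelope sketch in Python. The layer count and per-layer cost are illustrative assumptions picked from the ranges mentioned above, and the 26 TOPS figure is Hailo's stated peak; the function name is hypothetical. Real inference times will be longer, since no accelerator sustains its peak rate on every layer.

```python
# Back-of-the-envelope estimate only.
# ASSUMPTIONS: 100 layers and 2 billion operations per layer are illustrative
# values, not measurements of any specific network.

def ideal_inference_seconds(num_layers: int, ops_per_layer: float, peak_tops: float) -> float:
    """Lower bound on the time for one forward pass at 100% utilization."""
    total_ops = num_layers * ops_per_layer
    ops_per_second = peak_tops * 1e12  # 1 TOPS = 10^12 operations per second
    return total_ops / ops_per_second


if __name__ == "__main__":
    seconds = ideal_inference_seconds(num_layers=100, ops_per_layer=2e9, peak_tops=26.0)
    print(f"Ideal time per inference: {seconds * 1e3:.2f} ms")
```

The same function can be called with a lower peak_tops value to see how the ideal time per inference grows as available throughput shrinks.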
Hailo-8 Processor Integrated in AI Acceleration Module
Hailo's AI acceleration modules have an edge over competitors because they integrate the Hailo-8 processor, a device that Hailo says delivers 26 tera-operations per second (TOPS) of processing capability at a power efficiency of 3 TOPS per watt.
The module can be plugged into any existing edge device with an appropriate M.2 or Mini PCIe socket to execute real-time deep neural network inferencing. Design engineers have many ways to use Hailo’s AI acceleration modules, since PCIe-based sockets are found on most standard PC motherboards, where they host GPUs, RAID cards, Wi-Fi cards, and SSDs.
Orr Danon, CEO of Hailo explains, "Our new Hailo-8 M.2 and Mini PCIe modules will empower companies worldwide to create new powerful, cost-efficient, innovative AI-based products with a short time-to-market—while staying within the systems' thermal constraints. The high efficiency and top performance of Hailo's modules are a true gamechanger for the edge market."
Aside from incorporating an AI processor that handles 26 trillion operations per second, Hailo’s M.2 module is said to outperform Intel's and Google's modules on several neural network benchmarks.
Benchmark performance graph for Hailo in comparison to Google and Intel’s AI acceleration solutions. Image used courtesy of Hailo
These benchmarks cover a variety of machine learning applications such as image classification, speech recognition, and object detection.
Hailo vs. Intel's Movidius Myriad Modules
Intel’s Movidius Myriad modules feature the first Neural Compute Engine, a dedicated hardware accelerator for deep neural network inference. The Movidius module is supported by a toolkit, OpenVINO, that includes the development tools, frameworks, and APIs needed to implement custom vision, imaging, and deep neural network workloads on the chip.
Based on the architecture of Intel’s Movidius, the Neural Compute Engine can achieve a maximum of 916 billion neural network inference operations per second.
Intel’s Movidius benchmark results in frames per second. Image used courtesy of Intel
Hailo hasn’t released a full list of benchmarks that would allow a line-by-line comparison with Intel. However, based on the available data, Hailo’s modules process the benchmark models at higher frame rates. For example, Intel reports 594 FPS on MobileNet-V2, while Hailo reports more than 2500 FPS on the same model.
Intel’s Myriad modules offer 4 TOPS of total processing capability, of which roughly 1 TOPS (the 916 billion operations per second noted above) comes from the Neural Compute Engine dedicated to deep neural network computing. The Neural Compute Engine interfaces directly with Intel’s memory fabric to avoid memory bottlenecks when transferring data.
Hailo vs. Google’s Edge TPU Modules
The Edge TPU is a small ASIC designed by Google that provides high-performance ML inferencing for low-power devices. Google focused on providing an end-to-end AI solution that could be interfaced with edge devices and Google Cloud.
Google’s Edge TPU benchmark for the mobile vision model MobileNet-V2 is nearly 400 FPS, which is lower than Intel’s figure and significantly lower than Hailo’s.
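The three MobileNet-V2 figures quoted in this article can be put side by side with a short Python sketch. The Hailo and Google values are approximations of "surpassing 2500" and "nearly 400", the helper names are hypothetical, and the per-frame time is simply the reciprocal of throughput, so it ignores batching and pipeline latency.

```python
# MobileNet-V2 throughput in frames per second, as quoted in this article.
# ASSUMPTION: 2500 and 400 are rounded stand-ins for "surpassing 2500" and
# "nearly 400"; only the Intel figure (594) is stated exactly.
FPS = {"Hailo-8": 2500.0, "Intel Myriad": 594.0, "Google Edge TPU": 400.0}


def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds (reciprocal of throughput)."""
    return 1000.0 / fps


def speedup(fps: float, baseline_fps: float) -> float:
    """How many times faster `fps` is than `baseline_fps`."""
    return fps / baseline_fps


def summarize(fps_by_device: dict, baseline: str) -> dict:
    """Map each device to (ms per frame, speedup relative to the baseline device)."""
    base = fps_by_device[baseline]
    return {
        name: (frame_time_ms(fps), speedup(fps, base))
        for name, fps in fps_by_device.items()
    }


if __name__ == "__main__":
    for name, (ms, ratio) in summarize(FPS, baseline="Google Edge TPU").items():
        print(f"{name:16s} {ms:6.2f} ms/frame  {ratio:5.2f}x vs. Edge TPU")
```

Because the vendors published these numbers separately, the ratios are only indicative; input resolution, precision, and batch size may differ between the tests.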
Google’s Edge TPU has an integrated power control module while still maintaining a 10mm x 15mm footprint. Image used courtesy of Google Coral
An individual Edge TPU can perform 4 TOPS at an efficiency of 2 TOPS per watt, which works out to about 2 watts at peak. That gives it a low absolute power draw, although Hailo's stated 3 TOPS per watt is the higher efficiency figure. The Edge TPU chip is available only as part of an accelerator multi-chip module, which is compatible with various boards and exposes PCIe Gen 2 and USB 2.0 interfaces.
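The relationship between throughput, efficiency, and power can be made explicit with a small Python sketch. It uses only the vendor-stated peak figures quoted in this article, and the function and variable names are hypothetical. The result is the power implied at peak throughput, not a measured or typical value, which may be considerably lower under real workloads.

```python
# Peak throughput (TOPS) and efficiency (TOPS per watt) as quoted in this article.
DEVICES = {
    "Hailo-8": (26.0, 3.0),
    "Google Edge TPU": (4.0, 2.0),
}


def implied_power_watts(peak_tops: float, tops_per_watt: float) -> float:
    """Power implied at peak throughput: TOPS divided by TOPS per watt."""
    return peak_tops / tops_per_watt


if __name__ == "__main__":
    for name, (tops, efficiency) in DEVICES.items():
        watts = implied_power_watts(tops, efficiency)
        print(f"{name:16s} {tops:5.1f} TOPS at {efficiency:.1f} TOPS/W -> {watts:.2f} W at peak")
```

A higher TOPS-per-watt figure means more work per unit of energy, while a lower wattage means a smaller power and thermal budget; which of the two matters more depends on the design.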
Choosing a Fitting AI Accelerator
AI accelerators speed up computation on large data sets, preventing bottlenecks and saving time when working with deep neural networks.
Hailo's, Intel's, and Google’s respective AI acceleration modules can all integrate with standard frameworks such as TensorFlow and PyTorch, both of which Hailo supports with its dataflow compiler.
If the machine learning application requires a module with low power consumption, Google’s Edge TPU is the most fitting solution. Intel’s Movidius modules may be the best option for ML applications such as enhanced vision and image processing, where many models are evaluated to arrive at a smaller, well-defined model.
If data scientists and hardware engineers need an AI accelerator with a built-in processor, they might gravitate toward Hailo’s M.2 module for heavy mobile vision applications such as object detection and image classification.