32-bit MCUs Handle Heavy ML Workloads on Battery-Operated IoT Devices

Alif claims its new Ensemble line of microcontrollers efficiently processes heavy machine learning workloads on battery-powered devices.

News November 13, 2023 by Arjun Nijhawan

Alif Semiconductor has introduced its Ensemble family of fusion microcontrollers. These devices "fuse" different computing technologies: a real-time MCU core, a machine learning accelerator, and, in some devices, application MPU cores. Alif places a "microNPU" directly next to each CPU core, which the company says delivers performance two orders of magnitude higher than traditional 32-bit MCUs running comparable AI workloads.

Alif claims the Ensemble family is the only 32-bit MCU option that can manage heavy ML workloads for battery-operated IoT devices.

Many edge ML applications require 50 to 250 giga (billion) operations per second (GOPS), a level that 32-bit MCUs cannot reach on their own, so developers often turn to GPU-based accelerators in the 1,000 GOPS class. Adding such accelerators, however, increases power consumption, size, complexity, and cost.

"Only Alif fills this gap in the middle," said Alif's VP of marketing Mark Rootz, in a press release. "This is the sweet spot for battery-powered products on the edge."
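To put the "gap in the middle" in rough numbers, the short Python sketch below compares the article's quoted figures. The constants are the rounded values cited above and the helper name `overprovision_factor` is hypothetical. It is a back-of-the-envelope illustration, not a benchmark.

```python
# Figures quoted in the article; treated here as rough, illustrative bounds.
EDGE_ML_GOPS_RANGE = (50, 250)   # typical edge ML requirement, in GOPS
GPU_ACCELERATOR_GOPS = 1000      # GPU-based accelerator class, in GOPS


def overprovision_factor(required_gops: float, available_gops: float) -> float:
    """Return how many times the available throughput exceeds the requirement."""
    if required_gops <= 0:
        raise ValueError("required_gops must be positive")
    return available_gops / required_gops


low, high = EDGE_ML_GOPS_RANGE
factors = (
    overprovision_factor(high, GPU_ACCELERATOR_GOPS),  # most demanding workload
    overprovision_factor(low, GPU_ACCELERATOR_GOPS),   # least demanding workload
)
print(f"A 1,000 GOPS accelerator offers {factors[0]:.0f}x to {factors[1]:.0f}x "
      "the throughput these workloads need")
```

Under these assumptions, a GPU-class accelerator supplies roughly 4 to 20 times more throughput than the workload needs, which is the overhead in power and cost that a mid-range part aims to avoid.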

Alif Announces New 32-bit MCUs and Fusion Processors

Alif has rolled out four members of the Ensemble lineup: E1, E3, E5, and E7. With a single core, the E1 is the highest-efficiency device in the lineup. The quad-core E7 is the highest-performance device offered. According to a product brochure (downloads as PDF), Alif designed these devices for high scalability, so designers can adapt the MCUs' performance based on the needs of the workload.


Alif's new family of Ensemble microcontrollers and microprocessors.

One of the key ways the Ensemble lineup achieves this scalability is through Alif's autonomous intelligent power management (aiPM) technology. aiPM divides the chip into several independent power domains, each of which can shut off when that part of the chip is not in use. This applies even to the most powerful device, the E7 fusion processor, allowing it to serve low-power applications when needed.

To address the data privacy needs of edge AI-enabled IoT devices, each chip includes an isolated portion dedicated to security throughout its lifecycle. Readout prevention protects application code and ML models from IP theft and malware attacks. Extensive cryptography guards data on the device and over communication channels.

aiPM Technology: The Heart of the Ensemble Lineup

With aiPM, the Ensemble chip is divided into several regions, referred to in the datasheet as the high-performance region, the high-efficiency region, and the always-on region.

The always-on region comprises basic functions that remain on as long as a power source is connected to the device. These functions include a backup SRAM, low-power general-purpose I/O (LPGPIO), and a timer. The high-performance region of the E7 contains compute resources for performance-intensive applications: two levels of cache; an Arm Ethos-U55 NPU for AI/ML; two Cortex-A32 low-power processors; an AI-capable Cortex-M55 (HP) processor; a GPU; and support for MIPI CSI-2, Ethernet, MIPI DSI, and more.


High-level block diagram of E7 quad-core fusion processor.

The high-efficiency region includes a single Cortex-M55 (HE) processor, instruction and data caches, an Ethos-U55 NPU (HE), and support for audio via low-power I2S (LPI2S) and PDM (LPPDM) interfaces. By blending high-efficiency and high-performance functionality on a single device, the Ensemble family achieves workload-based scalability for various applications.
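The region split described above can be pictured with a small Python model. This is only a conceptual sketch: the block lists follow the article's description of the E7 rather than the full datasheet, and the workload names and the `powered_regions` helper are hypothetical. Alif's actual aiPM hardware and firmware interfaces are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One aiPM power region and the blocks the article places in it."""
    name: str
    blocks: tuple
    always_powered: bool = False


REGIONS = {
    "always_on": Region(
        "always_on", ("backup SRAM", "LPGPIO", "timer"), always_powered=True
    ),
    "high_efficiency": Region(
        "high_efficiency",
        ("Cortex-M55 (HE)", "Ethos-U55 (HE)", "LPI2S", "LPPDM"),
    ),
    "high_performance": Region(
        "high_performance",
        ("Cortex-A32", "Cortex-M55 (HP)", "Ethos-U55", "GPU",
         "MIPI CSI-2", "MIPI DSI", "Ethernet"),
    ),
}

# Hypothetical mapping from workload to the regions it needs (illustration only).
WORKLOAD_NEEDS = {
    "sleep": set(),
    "keyword_spotting": {"high_efficiency"},
    "camera_inference": {"high_efficiency", "high_performance"},
}


def powered_regions(workload: str) -> list:
    """Return the regions that stay powered for a workload; all others are gated off."""
    needed = WORKLOAD_NEEDS[workload]
    return sorted(
        name for name, region in REGIONS.items()
        if region.always_powered or name in needed
    )


for workload in WORKLOAD_NEEDS:
    print(f"{workload}: {', '.join(powered_regions(workload))}")
```

In this toy model, an audio wake-word task keeps only the always-on and high-efficiency regions powered, while a camera pipeline also brings up the high-performance region.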

As system designers seek to reduce dependence on the cloud for AI/ML processing, scalable edge AI solutions like the Ensemble family will likely see wider adoption in the years ahead.

All images used courtesy of Alif Semiconductor.