Raspberry Pi Introduces Its First HAT Product for Generative AI
Raspberry Pi's new AI HAT+ 2 adds higher-performance neural acceleration and memory support to extend generative workloads on Raspberry Pi 5.
Raspberry Pi recently introduced the AI HAT+ 2, its first hardware-attached-on-top (HAT) product designed to support generative AI workloads on the Raspberry Pi 5 platform.

The Raspberry Pi AI HAT+ 2.
The new product builds on the earlier AI HAT and AI HAT+ designs with a higher-performance neural network accelerator and expanded system integration to address use cases beyond classical computer vision inference.
AI HAT+ 2 Architecture and Capabilities
The Raspberry Pi AI HAT+ 2 integrates a Hailo-10H neural network accelerator on a full-size HAT+ board compatible with Raspberry Pi 5. The Hailo-10H delivers up to 40 TOPS of INT4 inference performance, a higher headline figure than the Hailo-8-based AI HAT+ designs, which are rated up to 26 TOPS. This additional compute capacity makes the AI HAT+ 2 better suited to transformer-based and diffusion-style models, which require higher sustained throughput than earlier edge AI workloads.
The Hailo-10H accelerator connects to the Raspberry Pi 5 via a PCIe Gen 2 x1 interface exposed through the board’s HAT+ connector. The direct PCIe attachment avoids the bandwidth limitations of USB-based accelerators and enables lower-latency data transfer between host memory and the neural processing unit. The board includes onboard power regulation to support the accelerator’s peak load while conforming to the Raspberry Pi 5 power envelope.
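As a rough back-of-the-envelope check on the link described above, the following sketch estimates the theoretical bandwidth of a PCIe Gen 2 x1 connection. The 5 GT/s line rate and 8b/10b encoding are the standard PCIe Gen 2 figures; the function name is illustrative, and real-world throughput will be lower once protocol overhead is taken into account.

```python
def pcie_bandwidth_mb_s(gigatransfers_per_s: float, lanes: int,
                        encoding_efficiency: float) -> float:
    """Theoretical one-direction bandwidth in MB/s, before protocol overhead."""
    bits_per_s = gigatransfers_per_s * 1e9 * encoding_efficiency * lanes
    return bits_per_s / 8 / 1e6


# PCIe Gen 2: 5 GT/s per lane with 8b/10b encoding (8 data bits per 10 line bits).
gen2_x1 = pcie_bandwidth_mb_s(5.0, lanes=1, encoding_efficiency=8 / 10)
print(f"PCIe Gen 2 x1: about {gen2_x1:.0f} MB/s per direction")
```

This is an upper bound on the raw link, not a measured figure for the AI HAT+ 2.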

The Raspberry Pi AI HAT+ 2 includes 8 GB of dedicated onboard RAM, so it can handle much larger models than earlier boards.
Compared with earlier accelerators, the Hailo-10H offers larger on-chip memory resources and more flexible dataflow scheduling. This allows portions of larger models to remain resident on the accelerator during inference. The HAT itself adds 8 GB of onboard memory, which Raspberry Pi says enables LLMs and VLMs with up to six billion parameters.
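To see how the 8 GB of memory relates to the six-billion-parameter figure, here is a rough estimate of weight storage at different numeric precisions. It is only a sketch: it counts weights alone, ignores activations, caches, and runtime overhead, and the precisions listed are illustrative assumptions rather than formats the source confirms for this board.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params = 6e9  # six billion parameters, the upper bound quoted above
for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    gb = weight_footprint_gb(params, bits)
    fits = "fits" if gb <= 8 else "does not fit"
    print(f"{name}: {gb:.1f} GB of weights, {fits} in 8 GB")
```

Under these assumptions, a six-billion-parameter model fits within 8 GB only once its weights are stored at 8 bits or fewer.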
Like previous models, the board is mechanically compatible with standard Raspberry Pi enclosures and thermal solutions. The AI HAT+ 2 comes with an additional heatsink to be installed on top of the HAT. Raspberry Pi recommends using an active cooler for sustained high-load operation.
From a software perspective, the AI HAT+ 2 is supported by Raspberry Pi OS and integrates with the Hailo software stack, including model compilation tools and runtime libraries.
Edge Acceleration for Generative AI
Generative AI models differ from traditional inference workloads in both structure and resource demand. Many are built on transformer architectures that rely on repeated matrix multiplications, attention mechanisms, and large parameter sets. In contrast to fixed-function vision pipelines, generative models often execute iteratively over the same data structures, so memory locality and scheduling efficiency become important.
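The attention step mentioned above can be illustrated with a minimal scaled dot-product attention for a single query, written in plain Python. This is a teaching sketch with made-up toy vectors, not a description of how any accelerator or the Hailo toolchain implements it. During generation, a computation of this kind repeats for each new token over the same stored keys and values, which is why memory locality matters.

```python
import math


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors.
    dim_v = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim_v)]


# Toy data: two stored tokens with 2-dimensional keys and values.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = attention([1.0, 0.0], keys, values)
print(out)
```

Because the query matches the first key more closely, the output leans toward the first value vector.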
On resource-constrained systems, general-purpose CPUs struggle to deliver acceptable performance for generative operations within reasonable power limits. Dedicated neural accelerators address this by implementing parallel compute arrays optimized for low-precision arithmetic, typically INT8 or mixed precision. They also rely on tightly coupled memory and deterministic dataflow to reduce external memory access, which costs both energy and latency.
For edge deployments, flexibility is another concern. Models must be quantized and compiled to fit the accelerator’s execution model without degrading output quality beyond acceptable limits. This places a premium on compiler maturity and toolchain support, especially for rapidly evolving generative workloads.
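As an illustration of the quantization trade-off described above, the sketch below applies simple symmetric per-tensor INT8 quantization to a handful of made-up weights and measures the round-trip error. This is a generic textbook scheme chosen for illustration; the source does not say which quantization method the Hailo compiler uses.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.82, -0.31, 0.05, -1.27, 0.44]  # hypothetical example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"scale={scale:.4f}", f"max error={max_error:.4f}")
```

The worst-case error per weight is about half the scale factor. Whether that is acceptable depends on the model, which is why such checks are normally run against real outputs rather than individual weights.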
New Classes of Raspberry Pi Applications
With the launch of the AI HAT+ 2, Raspberry Pi now supports applications that previously exceeded its platform’s practical limits. Local text generation, speech-to-text preprocessing, image captioning, and multimodal assistants become feasible when models can execute on-device with acceptable latency. For developers, new possibilities include privacy-preserving and offline-capable systems that do not depend on continuous cloud access.
The AI HAT+ 2 is available now through Raspberry Pi-approved resellers.
All images used courtesy of Raspberry Pi.