Roundup: Flood of New AI Hardware Comes to Bolster Data Centers
In this roundup, we highlight six recent announcements that demonstrate how semiconductor leaders are redefining the hardware stack for large-scale AI inference.
Data centers are under mounting pressure from a new class of AI workloads that demand massive compute power, high-speed data movement, and energy efficiency all at once. Training and inference models continue to grow in size and complexity, exposing bandwidth bottlenecks, rising power costs, and thermal limits.

Recent announcements from Qualcomm, Intel, Cisco, MIPS, Vsora, and Axelera AI show how semiconductor companies are attacking these constraints from different angles: rethinking compute architecture, interconnect speed, and memory access to make AI acceleration more efficient and scalable across racks, clusters, and facilities.
1. Qualcomm's Accelerator Cards
Qualcomm designed its new AI200 and AI250 accelerator cards to deliver rack-scale performance for AI inference at a lower total cost of ownership. The AI200 card includes 768 GB of LPDDR memory and uses direct liquid cooling to manage heat. The newer AI250 adds a near-memory setup that boosts bandwidth by about ten times while cutting power use.

Qualcomm’s AI200 and AI250 rack-scale inference systems deliver high memory bandwidth and low power consumption for large-scale generative AI workloads. Image used courtesy of Qualcomm
Each rack draws around 160 kW and can run large language and multimodal inference jobs without slowing down. Qualcomm’s next set of inference chips works smoothly with common AI frameworks like Hugging Face and offers simple, one-click model deployment. The company plans yearly updates that keep boosting efficiency and scalability for large data center operations.
2. Intel’s GPU for Data Centers
Intel’s new Crescent Island GPU for data centers is built on the Xe3P design and tuned for AI inference jobs that need a lot of memory and bandwidth but can’t exceed normal cooling limits. It carries 160 GB of LPDDR5X memory and is made for air-cooled enterprise servers that need strong performance without extra power draw. Intel positions Crescent Island as part of a broader, heterogeneous AI ecosystem combining Xeon 6 processors and GPUs under an open software stack. Sampling is expected in late 2026, giving developers early access for inference workflows and “tokens-as-a-service” operations.
3. Cisco's SoC and Routing System
AI training and inference models are quickly exceeding the capacity of single data centers, forcing providers to distribute workloads across regions. Cisco’s new Silicon One P200 chip and 8223 routing system are built to manage that surge, offering a massive 51.2 Tbps of routing capacity and over 3 Exabits per second of interconnect bandwidth.
![]()
Cisco’s 8223 routing systems and Silicon One P200 chip power distributed AI data centers with 51.2 Tbps bandwidth and deep-buffer routing for secure, scalable connectivity. Image used courtesy of Cisco
The deep-buffer architecture allows traffic bursts from AI training without packet loss, while the 3RU system delivers switch-like power efficiency. With 64 ports of 800GE built in, post-quantum encryption and full programmability, the P200 architecture helps data centers “scale across” securely, linking clusters separated by hundreds of miles while minimizing latency and energy use.
4. MIPS' RISC-V-Based Processor
The MIPS I8500 processor uses an open RISC-V design and adds predictable, real-time control for data handling. It runs four threads per core and up to 24 threads in a cluster, built for uses like industrial automation, cars, and telecom systems, where AI has to react on the spot.
It enables low-latency, rule-based packet management, making it useful for SmartNICs, DPUs, and predictive maintenance systems. Designed for Linux and real-time OS support, the I8500 delivers secure, scalable performance for edge and data center use, helping bridge the gap between cloud AI and “physical AI” in connected environments.
5. Vsora's AI Inference Chip
France-based Vsora has completed tape-out of its Jotunn8 inference chip, claiming to have solved the long-standing memory wall bottleneck in AI hardware. Built on TSMC’s 5-nm process and featuring CoWoS chiplet packaging, Jotunn8 offers 3,200 TFLOPS of compute power and 288 GB of HBM3e memory while using 50% less power than leading competitors.

Vsora claims Jotunn8 has a "breakthrough chip architecture" that may solve the memory wall bottleneck. Image used courtesy of Vsora
The chip’s design removes the usual slowdowns between compute and memory, which gives it a big boost in speed and efficiency for large-scale inference work. This step puts Vsora among the top AI chipmakers worldwide, strengthening Europe’s place in the next wave of semiconductor development.
6. Axelera's AI Processor Unit
The Netherlands-based Axelera AI has launched Europa, an AI processor unit (AIPU) delivering 629 TOPS of performance at INT8 precision and up to 3–5× higher performance per watt than competing chips. Its design integrates eight second-generation AI cores using Digital In-Memory Compute (D-IMC) technology and 16 RISC-V vector processors, with 128 MB of on-chip SRAM and 200 GB/s bandwidth. Europa comes in several PCIe card options, from a single-chip, 16-GB setup to a four-chip, 256-GB version. It runs local inference with little delay and low power use. Shipments start in early 2026, aimed at multi-modal AI, computer vision, and robotics work both at the edge and in data centers.
A Theme in AI Data Centers
Across all six announcements, one theme stands out: AI data centers are evolving from isolated compute clusters into highly connected, memory-optimized systems that emphasize bandwidth, scalability, and energy efficiency.