Arm Unveils TCS23 Platform With Visual Computing Cores and More
Arm's latest Total Compute Solutions includes an updated suite of CPU, GPU, system, and software IP for mobile computing.
Last week, Arm released Arm Total Compute Solutions (TCS23), a suite of Arm intellectual property (IP) built on top of the latest ARMv9 architecture. This package provides designers with a mix-and-match approach for mobile computing applications via a portfolio of CPU, GPU, system, and software IP.
The TCS23 ecosystem. Image used courtesy of Arm
One of the standout features of last year’s TCS22 was the introduction of hardware raytracing on the Immortalis Arm GPU. This year’s announcement revs up both its CPU and GPU offerings, primarily through improvements in power and efficiency with similar or improved performance.
CPU Clustering for Mobile Computing
Included in TCS23 are three new Arm Cortex CPUs aimed at mobile devices such as smartphones, tablets, and laptops. In particular, the Cortex processors are meant to enable compute-intensive workloads such as gaming, augmented/virtual reality, machine learning, and on-device security.
The Cortex-X4 is the fourth-generation, high-performance CPU core in Arm’s X-series line. Built on the latest Armv9.2 architecture, the core features a 3.4 GHz clock speed, a 64 KB L1 cache, a 0.5–2 MB L2 cache, and a 32 MB L3 cache. Additionally, the instruction bandwidth has been increased to 10 instructions per cycle. These updates reportedly result in a 15% performance gain over the previous-generation Cortex-X3.
A configuration pairing ten Cortex-X4 cores with four Cortex-A720 cores presents a powerful competitor in the Arm-based high-performance laptop landscape. Image used courtesy of Arm
The new 64-bit Cortex-A720 focuses on power efficiency, particularly in applications that need to balance performance and battery life in a big.LITTLE heterogeneous computing configuration. It features L1 I-cache and D-cache (32KB or 64KB), L2 cache (128KB/256KB/512KB), and optional L3 cache (256KB to 32MB). With this device, Arm reports a 20% improvement in power efficiency and a 4.5% improvement in performance over the previous generation.
Finally, the 64-bit Cortex-A520 is another updated CPU, focusing on efficiency for lightweight and background task loads in a big.LITTLE configuration. The CPU features L1 I-cache and D-cache (32KB or 64KB), optional L2 cache (128KB/192KB/256KB/384KB/512KB), and optional L3 cache (256KB to 32MB). Compared to the last generation, Arm states a 22% improvement in power efficiency and an 8% improvement in performance.
The DynamIQ Shared Unit 120 (DSU-120) supports up to 14 A-class architecture CPU cores in a cluster. It provides a shared L3 memory system, security features, control logic, and SoC interfacing. Arm presented several core configurations that can support applications ranging from wearable devices to high-performance laptops, supporting the “mix-and-match” approach.
Diagram of the DynamIQ Shared Unit 120. Image used courtesy of Arm
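To make the mix-and-match idea concrete, here is a minimal Python sketch that checks whether a proposed core mix fits within the 14-core limit the article gives for a DSU-120 cluster. The function name `validate_cluster` and the example mixes are hypothetical illustrations, not Arm tooling or configurations confirmed by Arm, and real designs are subject to additional constraints not modeled here.

```python
# Sketch only: models the single constraint stated in the article
# (a DSU-120 cluster holds at most 14 CPU cores). Names are hypothetical.

MAX_CORES_PER_CLUSTER = 14  # DSU-120 limit quoted in the article
KNOWN_CORES = {"Cortex-X4", "Cortex-A720", "Cortex-A520"}


def validate_cluster(config: dict) -> int:
    """Return the total core count, or raise ValueError for an invalid mix."""
    for name, count in config.items():
        if name not in KNOWN_CORES:
            raise ValueError(f"unknown core type: {name}")
        if count < 0:
            raise ValueError(f"negative core count for {name}")
    total = sum(config.values())
    if total == 0:
        raise ValueError("cluster must contain at least one core")
    if total > MAX_CORES_PER_CLUSTER:
        raise ValueError(
            f"{total} cores exceeds the {MAX_CORES_PER_CLUSTER}-core limit"
        )
    return total


# Illustrative mixes (assumptions, chosen only to exercise the check)
phone_mix = {"Cortex-X4": 1, "Cortex-A720": 3, "Cortex-A520": 4}
laptop_mix = {"Cortex-X4": 10, "Cortex-A720": 4}

print(validate_cluster(phone_mix))
print(validate_cluster(laptop_mix))
```

The check is intentionally small: it captures only the cluster-size ceiling, which is what allows the same shared unit to span designs from wearables to laptops.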
New GPUs for Varied Graphical and Power Demands
TCS23 also includes three GPUs built on Arm's fifth-generation GPU architecture. Their base technical configurations are nearly identical; the key differences are the number of cores each supports and whether hardware ray tracing is included, optional, or absent. These are mobile GPUs meant to support AAA mobile games such as Genshin Impact or Fortnite.
First is the Immortalis-G720, which targets flagship smartphones for gaming and machine learning. It features Deferred Vertex Shading, 2x/4x/8x/16x multi-sample anti-aliasing, and API support for OpenGL ES 1.1/2.0/3.2, Vulkan 1.3, and OpenCL 1.2/2.1/3.0 Full Profile. The GPU scales to 10 or more cores and includes hardware ray tracing, making it the most performant of the three GPU offerings. Arm reports that the Immortalis-G720 delivers a 15% gain in performance while using 40% less memory bandwidth than the previous generation of Arm GPUs.
The Mali-G720 shares the same technical specs as the Immortalis-G720 but scales only from six to nine cores and makes hardware ray tracing optional. Last but not least is the Mali-G620, which again shares the same base configuration but scales only from one to five cores, making it the most lightweight GPU option.
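The core-count tiers described above can be summarized as a simple lookup. The following Python sketch is only an illustration of the ranges quoted in this article; the function name `gpu_for_core_count` is hypothetical, and the mapping ignores ray-tracing options and any other product distinctions.

```python
# Sketch only: maps a GPU core count to the product tier named in the
# article. The function name is hypothetical, not part of any Arm tool.

def gpu_for_core_count(cores: int) -> str:
    """Return the TCS23 GPU whose stated core range contains `cores`."""
    if cores < 1:
        raise ValueError("a GPU needs at least one core")
    if cores <= 5:
        return "Mali-G620"        # one to five cores
    if cores <= 9:
        return "Mali-G720"        # six to nine cores
    return "Immortalis-G720"      # ten or more cores


for n in (2, 7, 12):
    print(n, gpu_for_core_count(n))
```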
TCS23's Machine Learning and Security Benefits
Arm claims that TCS23 improves machine learning capabilities through a combination of hardware and software.
For CPU-based machine learning applications such as INT8 inference for object detection/classification, real-time recognition, and body pose tracking, the company reported a 12% improvement for the Cortex-X4, a 9% improvement for the Cortex-A720, and a 13% improvement for the Cortex-A520 compared to their predecessors. On the GPU side, optimizations in Arm NN and the Arm Compute Library provided a 4x improvement in machine learning tasks.
TCS23 also provides some security improvements. The platform supports the Android Virtualization Framework for Arm64-based devices, as well as Pointer Authentication (PAC) and Branch Target Identification (BTI), which mitigate Return-Oriented Programming (ROP) and Jump-Oriented Programming (JOP) attacks.