IBM’s First CPU With On-chip AI Acceleration Detects Fraud at Lightning Speeds
Containing 22 billion transistors, IBM's new processor, Telum, features on-chip acceleration for AI inferencing. The goal: to detect fraud before a transaction is complete.
This week, IBM announced its first AI-accelerated processor, Telum, designed for inference workloads that can detect fraud in real time. According to the Federal Trade Commission's 2020 report, the cost of fraud to consumers jumped nearly 100 percent from 2019 to 2020.
The AI-accelerated Telum, IBM’s next-gen processor for IBM Z and LinuxOne systems. Image used courtesy of IBM
Telum is said to be useful for financial service workloads such as:
- Credit card authorizations
- Fraud detection
- Loan processing
- Clearing and settling trades
- Anti-money laundering
Telum’s Hardware Architecture
IBM partnered with Samsung to build the Telum processor on Samsung's 7 nm EUV process node. The device contains an AI accelerator directly on-die along with eight processor cores that IBM claims operate at speeds over 5 GHz.
The eight cores (seen as the outer eight subsystems below) are each coupled with a private 32 MB L2 cache (the central eight subsystems). The cache architecture is tightly coupled, allowing for a virtual L3 cache of 256 MB per chip along with a total virtual L4 cache of 2 GB.
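The virtual cache capacities follow directly from the per-core L2 figure. Here is a quick back-of-the-envelope check; note that the chips-per-drawer count used for the L4 total is our assumption based on the 2 GB figure, not something stated in the article:

```python
# Back-of-the-envelope check of Telum's virtual cache capacities.
L2_PER_CORE_MB = 32   # private L2 cache per core (from IBM's announcement)
CORES_PER_CHIP = 8

# Each core can spill into its neighbors' L2 caches, forming a virtual L3.
virtual_l3_mb = CORES_PER_CHIP * L2_PER_CORE_MB
print(f"Virtual L3 per chip: {virtual_l3_mb} MB")

# Aggregated across multiple chips, the caches form a virtual L4.
# ASSUMPTION: eight chips pooling their caches, chosen to match the 2 GB total.
CHIPS_POOLED = 8
virtual_l4_gb = CHIPS_POOLED * virtual_l3_mb / 1024
print(f"Virtual L4: {virtual_l4_gb} GB")
```

The arithmetic reproduces the 256 MB and 2 GB figures quoted above.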
Rendered image of the Telum processor. Screenshot [1:23] used courtesy of IBM
The Telum module contains 22 billion transistors and 19 miles of wire on 17 metal layers.
The on-die AI accelerator is capable of more than 6 TFLOPS in terms of total compute capacity. The cache and AI accelerator sit on a centralized ring interconnect architecture, which gives the AI accelerator direct access to all caches and each core access to the AI accelerator. It's this design that affords Telum extremely low latency.
The Telum architecture builds on the reliability of the IBM z15 processors by reengineering an eight-channel memory interface to tolerate complete channel failure in a DIMM slot.
IBM's First Chip From the AI Hardware Center
Since beginning its push into AI development in 2015, IBM has set a goal of improving processor performance 1,000-fold by 2029. The majority of these advancements are coming out of its AI Hardware Center.
In a bid to move away from massive datasets and ML models, IBM is teaching AI to reason and interpret as a human would; it calls this line of research neuro-symbolic AI.
IBM says Telum is its first commercial processor with on-chip AI acceleration. Image used courtesy of IBM
This class of AI is reported to require less training data than existing AI technologies. To accomplish this, IBM's new AI technology leverages neural networks that interpret symbolic representations along with extracted statistical structures.
Rectifying Misguided Security Measures
To understand what Telum might look like in action, let's first consider a situation in which a less-accurate AI processor wastes both time and money for businesses and customers. Imagine a person frequenting several stationery stores over the course of an hour. In the third store, his credit card is blocked, stymying further purchases. In this instance, the data processing system tied to his account inaccurately applied security measures.
Embedded AI inference on transaction data might have prevented this situation by determining that these small, similar purchases within a given geographic area were likely legitimate. Telum's data processing system is said to avoid these inconvenient situations within a hybrid cloud environment, including "Hyper Protected Virtual Servers" and a trusted execution environment.
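As a purely illustrative sketch (this reflects neither IBM's model nor any real fraud-detection API), an in-transaction check might score each purchase against the cardholder's recent activity and block only when the score crosses a threshold. Small purchases near recent activity, like the stationery-store example above, would score low:

```python
from dataclasses import dataclass
from math import dist

@dataclass
class Txn:
    amount: float      # purchase amount in dollars
    location: tuple    # (x, y) coordinates, simplified from lat/lon

def fraud_score(txn, recent):
    """Toy heuristic: purchases that are small relative to recent history
    and close to recent locations score near 0; large, distant ones near 1."""
    if not recent:
        return 0.5  # no history: moderate suspicion
    avg_amount = sum(t.amount for t in recent) / len(recent)
    nearest = min(dist(txn.location, t.location) for t in recent)
    amount_factor = min(txn.amount / (avg_amount + 1e-9), 10) / 10
    distance_factor = min(nearest / 100.0, 1.0)  # 100 units = "far" cutoff
    return 0.5 * amount_factor + 0.5 * distance_factor

# Three small purchases in the same neighborhood...
history = [Txn(24.0, (0, 0)), Txn(18.5, (1, 1)), Txn(31.0, (0, 2))]
# ...followed by a fourth of similar size nearby: should be approved.
third_store = Txn(22.0, (1, 2))
score = fraud_score(third_store, history)
print(f"score={score:.2f}", "BLOCK" if score > 0.8 else "APPROVE")
```

The point of running such a model *during* the transaction, rather than after it settles, is exactly the low-latency access between cores and accelerator described earlier.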
Edge AI Technology Around The Industry
AAC recently covered Xilinx's Versal AI-series SoC, which appears to occupy a similar functional domain to the IBM Telum processor. The Versal AI series possesses a dedicated fabric interconnect between its processor cores and its AI engines, providing multi-terabit throughput.
Block diagram for the Versal AI Edge platform. Image used courtesy of Xilinx
The financial industry requires sub-millisecond latency for transactions. While IBM's clients on the Z platform already use analytics to determine credit card authorizations, they aim to reach sub-millisecond response times in the future. Both IBM and Xilinx claim their devices are approaching this threshold.
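To put the sub-millisecond target in perspective, here is a small sketch showing how such a latency budget can be checked against an inference call. The scoring function is a trivial stand-in of our own invention; a real deployment would dispatch to on-chip acceleration rather than compute in Python:

```python
import time

LATENCY_BUDGET_S = 0.001  # the sub-millisecond target for transactions

def score_transaction(features):
    """Stand-in for an inference call (illustrative only):
    a fixed-weight linear score over three transaction features."""
    weights = (0.2, 0.5, 0.3)
    return sum(f * w for f, w in zip(features, weights))

start = time.perf_counter()
score = score_transaction((0.1, 0.7, 0.3))
elapsed = time.perf_counter() - start

within_budget = elapsed < LATENCY_BUDGET_S
print(f"latency: {elapsed * 1e6:.1f} us, within budget: {within_budget}")
```

Meeting this budget consistently under production load, not just for a toy function, is what motivates placing the accelerator on the same die as the cores.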
IBM’s Telum and Xilinx's Versal AI Edge series, which are both slated to be available sometime in 2022, may add serious compute power to edge devices that require AI inference—securing financial applications in the process.