Doug Burger, a computer chip researcher at Microsoft, first came up with the idea of equipping Microsoft servers with FPGAs as accelerators. He believed that giant internet companies would eventually need to design both their software and their hardware to meet the demands of increasingly complex algorithms, and that the world’s hardware manufacturers were not going to build what Microsoft needed.
ASIC, FPGA, or Software on a General-Purpose Chip?
To achieve optimum speed, a function must be implemented on a special-purpose chip.
Utilizing ASICs in servers faces two main limitations. First, to reduce management issues and maintain a consistent platform, datacenter operators prefer homogeneous hardware. Second, datacenter services evolve extremely rapidly; servers must cope, for example, with changes in AI algorithms that can occur dozens of times a year. Designing a special-purpose chip for every new problem would be prohibitively expensive. Moreover, since it takes several months to build an ASIC, the new chip may be obsolete by the time it arrives. Hence, despite their speed advantage, ASICs are undesirable for these applications.
FPGA implementations are not as fast as ASICs, but they still offer substantial speedups over software running on a general-purpose processor. FPGAs can therefore deliver the flexible acceleration that datacenters require.
Burger proposed a project called Catapult, which uses the speed and flexibility of FPGAs to improve the efficiency of Microsoft’s servers. Although Microsoft had been designing software for over 40 years, it did not have the engineers and tools to design its own servers. The project was not warmly welcomed at first; however, Burger and Qi Lu, who ran Bing, were eventually able to convince their bosses to start working on Catapult.
Project Catapult aims to implement AI algorithms rapidly and efficiently. Image courtesy of Microsoft.
After experiments verified that FPGA-equipped servers could speed up a specific machine learning algorithm by a factor of 40, Microsoft announced in 2014 that it would equip Bing with the technology.
Bing currently uses the programmable chips to run part of its ranking algorithm in hardware rather than software. In the near future, the hardware will implement deep neural network algorithms to dramatically speed up Bing’s search process. Bing will thus be faster while returning more relevant results, since it can take a wider array of sources into account.
While Catapult adds less than 30% to a server’s hardware cost and less than 10% to its overall power consumption, it doubles the speed of data processing.
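A back-of-the-envelope calculation makes these economics concrete. The figures below take the article’s bounds at their worst case (a full 30% cost increase and 10% power increase); the exact factors are assumptions for illustration:

```python
# Rough efficiency gains for an accelerated server, using the ratios above.
# Worst-case assumptions: +30% hardware cost, +10% power, 2x throughput.
speedup = 2.0
cost_factor = 1.30   # assumed upper bound: +30% hardware cost
power_factor = 1.10  # assumed upper bound: +10% power draw

perf_per_dollar = speedup / cost_factor  # throughput per unit of capital
perf_per_watt = speedup / power_factor   # throughput per unit of energy

print(f"performance per dollar: {perf_per_dollar:.2f}x")  # ~1.54x
print(f"performance per watt:   {perf_per_watt:.2f}x")    # ~1.82x
```

Even at the upper bounds, the accelerated server delivers roughly 50% more work per dollar and 80% more work per watt than a stock server.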
All the internet giants—Google, Amazon, Baidu, and others—are moving in a similar direction: adding extra chips to their servers so that they can adapt their systems to rapidly changing algorithms.
For example, Google has resorted to the exorbitantly expensive solution of ASICs to obtain even higher speed. Google’s chips, called tensor processing units (TPUs), sacrifice flexibility to minimize the execution time of neural network algorithms. However, the day Google comes up with a new neural network model, the company will need to manufacture a new chip.
The New Tech Mitigates the End of Moore’s Law
As predicted by Moore’s law, we enjoyed increasingly faster and more affordable processors for a long time. Over the last decade, however, the growth in CPU performance has slowed, frustrating many, from computer manufacturers to datacenter managers.
To circumvent this trend, Microsoft’s researchers were not interested in incremental improvements; they were looking for a radical change. They decided to implement parts of their algorithms on FPGAs and gain speed over a general-purpose CPU. In other words, they offloaded some of the computational burden from the slow general-purpose CPU to fast special-purpose FPGAs.
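The offload pattern can be sketched in a few lines. Everything here is illustrative: `fpga_popcount` stands in for a kernel that would actually be synthesized to hardware and called through a driver, and the dispatcher simply falls back to the CPU when no accelerator is attached.

```python
# Minimal sketch of the CPU-to-accelerator offload pattern. The "FPGA"
# is simulated by a plain Python function for illustration only.

def cpu_popcount(words):
    """Reference implementation on the general-purpose CPU."""
    return sum(bin(w).count("1") for w in words)

def fpga_popcount(words):
    """Stand-in for the same kernel running on the accelerator."""
    return sum(bin(w).count("1") for w in words)  # hardware would pipeline this

def popcount(words, accelerator=None):
    """Dispatch the hot kernel to the accelerator when one is attached."""
    if accelerator is not None:
        return accelerator(words)
    return cpu_popcount(words)

data = [0b1011, 0b0110, 0b1111]
assert popcount(data) == popcount(data, accelerator=fpga_popcount) == 9
```

The key property is that the software contract is unchanged: callers get identical results whether the kernel runs in software or in hardware, which is what lets a datacenter mix accelerated and stock servers.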
Since FPGAs are not new, people generally underestimate their potential, which is why, before Project Catapult, nobody had seriously considered using FPGAs at large scale for cloud computing. Lu believes that the FPGA-based technology will allow Microsoft to keep expanding its processing power until around 2030, after which, he notes, ultrafast quantum-enabled computers will likely be available.
FPGAs Could Benefit Other Microsoft Services
Following Bing’s success in 2014, Microsoft decided to employ a similar technology in Azure, the company’s cloud computing service, and in Office 365. Each of these services has different bottlenecks and priorities. Bing relies on the technology to accelerate new AI algorithms. Azure’s acute problem is network traffic, so a modified version of the technology was deployed to route data. Office 365 uses the new scheme for encryption, decryption, and machine learning algorithms.
Derek Chiou, a partner hardware engineering manager at Microsoft, explains that using FPGAs at the front door can lead to a faster, more secure network. He offers an analogy: when we go to the bank to withdraw money, we see the teller, not the manager. Similarly, the FPGA takes routine work off the main processor.
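Chiou’s teller-versus-manager analogy maps onto a simple triage scheme. The request types and handlers below are hypothetical, chosen only to illustrate the pattern, not taken from Microsoft’s design:

```python
# Illustrative "front door" triage: cheap, common requests are handled
# immediately (the teller), while heavyweight ones are forwarded to the
# main processor (the manager). All request types here are made up.

def front_door(request, forward_to_cpu):
    kind, payload = request
    if kind == "checksum":            # cheap and frequent: handle at the door
        return sum(payload) % 256
    if kind == "echo":                # likewise trivial
        return payload
    return forward_to_cpu(request)    # everything else goes to the CPU

def cpu_handler(request):
    kind, payload = request
    return f"cpu handled {kind}"

assert front_door(("checksum", [1, 2, 3]), cpu_handler) == 6
assert front_door(("rank", "query"), cpu_handler) == "cpu handled rank"
```

Because the common cases never reach the main processor, it is free to spend its cycles on the requests that genuinely need it.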
It is believed that, in the future, every new Microsoft server will be equipped with the technology.
Project Catapult team members, from left, Adrian Caulfield, Doug Burger, Andrew Putnam, Eric Chung, and Sitaram Lanka. Image courtesy of Microsoft.
Microsoft’s technology has had a significant impact on the FPGA market. Intel acquired Altera for $16.7 billion in 2015, and the company is planning to use Altera FPGAs in many applications, such as cars, robots, and drones.
However, Diane Bryant, Intel’s executive vice president, notes that the shifts in the FPGA market persuaded Intel to acquire Altera. She believes that a third of all servers in the world will soon use FPGAs as accelerators.
For a detailed explanation of how to use FPGAs as accelerators in cloud computing, please refer to the paper published by Microsoft’s research team.