When Google decided to design its own chip, the TPU, the ASIC-based architecture generated a lot of buzz with its promise of faster and smarter computation. Google claimed the move would help intelligent apps take over, and industry experts always believed a reply from Microsoft was coming (remember Bing?). Well, Microsoft has now announced its entry into the game with its own real-time AI chip, called Brainwave.
As the two tech giants compete in chip design, developers will have more options for handling the complex computational workloads of modern systems.
What is Brainwave?
Until recently, Nvidia was the dominant player in this segment, with its GPUs (graphics processing units) used for faster processing and computation. After Google disrupted the trend with its TPU (Tensor Processing Unit), the surprise package in the market has come from Microsoft, all the more so because its "real-time data processing" Brainwave chip is claimed to be faster than Google’s chip (the TPU 2.0, or Cloud TPU).
What the Google and Microsoft chips have in common is that both are claimed to train and run deep neural networks much faster than existing chips. Microsoft’s claim that Brainwave supports real-time AI systems with minimal lag raises an interesting question: are we looking at a new revolution in the microchip industry?
The answer perhaps lies in the inherent methodology and architecture of both these chips (TPU and Brainwave) and the way they function. What are the practical challenges of implementing them in real-world applications?
The Brainwave Architecture: Move over GPU, DPU is here
In case you are wondering what the hype around Microsoft’s Brainwave is about, the answer lies in its architecture and design. Today’s demanding computational standards were set by high-end games, for which GPUs were originally designed. Brainwave differs completely from the GPU architecture: its core components are Field Programmable Gate Arrays, or FPGAs. Microsoft has deployed a large number of FPGAs, on top of which DNN (Deep Neural Network) layers are synthesized. The setup can be compared to hardware microservices, where software assigns each task to different FPGA and DNN modules. These software-controlled modules are called DNN Processing Units, or DPUs. This arrangement avoids the latency of going through the CPU and the need to transfer data to and from the backend.
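The hardware-microservices idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Microsoft’s implementation: the names SoftDPU, assign_layers and infer are hypothetical, each "layer" is just a Python function, and the FPGAs are plain objects. It only shows the shape of the idea, in which software decides which layers live on which DPU and activations then flow from DPU to DPU without returning to a host CPU between stages.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Layer = Callable[[float], float]


@dataclass
class SoftDPU:
    """Hypothetical stand-in for one FPGA hosting a slice of a DNN."""
    name: str
    layers: List[Layer] = field(default_factory=list)

    def run(self, x: float) -> float:
        # Apply every layer placed on this DPU, in order.
        for layer in self.layers:
            x = layer(x)
        return x


def assign_layers(layers: List[Layer], dpus: List[SoftDPU]) -> None:
    """Software-controlled placement: give each DPU a contiguous chunk of layers."""
    per_dpu = -(-len(layers) // len(dpus))  # ceiling division
    for i, dpu in enumerate(dpus):
        dpu.layers = layers[i * per_dpu:(i + 1) * per_dpu]


def infer(x: float, dpus: List[SoftDPU]) -> float:
    """Pass the activation from DPU to DPU with no host round trip in between."""
    for dpu in dpus:
        x = dpu.run(x)
    return x


# Four toy "layers" spread over two hypothetical FPGAs.
layers = [lambda v: v * 2, lambda v: v + 1, lambda v: max(v, 0.0), lambda v: v * 0.5]
dpus = [SoftDPU("fpga-0"), SoftDPU("fpga-1")]
assign_layers(layers, dpus)
print(infer(3.0, dpus))  # 3.5
```

Because placement is done in software, changing the number of DPUs or the split between them does not require changing the layers themselves.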
There are two approaches to building a DPU, and they differ in architecture and application: the hard DPU and the soft DPU. Microsoft has used the soft DPU approach, in which the allocation of memory modules is determined by software and by the volume of data at the time of processing. A hard DPU has a predefined memory allocation, which does not allow the flexibility that is so vital in real-time processing. According to Microsoft, this software-controlled design sets Brainwave apart from other AI processing chips, and the company has also developed its own narrow-precision data types that are faster to process. Together, these enable Brainwave to perform near real-time AI computations. In this respect, Microsoft’s Brainwave holds an edge over the Google TPU when it comes to real-time decision making and computation.
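To illustrate why a simpler data type can be cheaper to process, here is a minimal Python sketch of a reduced-precision float. It is an assumption-laden illustration, not Microsoft’s actual format: the function quantize_narrow_float and the default of 3 mantissa bits are hypothetical, and the exponent range is left unrestricted for simplicity. The point is only that keeping fewer mantissa bits shrinks each number, at the cost of a bounded rounding error.

```python
import math


def quantize_narrow_float(x: float, mantissa_bits: int = 3) -> float:
    """Round x to the nearest value with `mantissa_bits` bits of mantissa.

    Illustrative only: the exponent is not limited, unlike a real hardware format.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    # x == mantissa * 2**exponent, with 0.5 <= |mantissa| < 1
    mantissa, exponent = math.frexp(x)
    scale = 1 << mantissa_bits
    rounded = round(mantissa * scale) / scale
    return math.ldexp(rounded, exponent)


weights = [0.3, -1.7, 0.05, 2.4]
inputs = [1.0, 0.5, -2.0, 0.25]

exact = sum(w * a for w, a in zip(weights, inputs))
approx = sum(quantize_narrow_float(w) * quantize_narrow_float(a)
             for w, a in zip(weights, inputs))

print(quantize_narrow_float(0.3))  # 0.3125
print(exact, approx)               # close, but not identical
```

With 3 mantissa bits, each value is off by at most one part in eight, which neural network inference can often tolerate.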
Brainwave’s edge over TPU 2 – Is it real time?
Google ventured into designing its own chips because growing user queries would otherwise have forced it to increase the number of its data centers. It realized that, rather than adding data centers to handle those queries, it would be far more practical to perform the computation on hardware built for the job. For that it needed more computational capability than market leaders such as Intel’s x86 Xeon processors and Nvidia’s Tesla K80 GPUs offered. Google opted for Application Specific Integrated Circuits (ASICs) instead of FPGAs because an ASIC could be designed entirely around its own requirements. The TPU was not specific to one particular neural network but could run many different networks. The trade-off for this ability to run multiple neural networks was real-time computation, which Brainwave achieves through its DPU architecture. Initial data released by Microsoft puts Brainwave’s data transfer bandwidth at 20 TB/sec, which is said to be 20 times that of the latest Nvidia GPU. Brainwave’s energy efficiency is also claimed to be 4.5 times better than that of current chips. Whether Google will up the ante and improve the existing TPU architecture to make it suitable for real-time computation, only time will tell.
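As a rough back-of-the-envelope check on what the bandwidth claim means, the short Python sketch below compares how long it would take to stream a block of model weights at each rate. The 20 TB/sec figure and the factor of 20 are the ones quoted above; the 100 MB weight size and the helper stream_time_us are hypothetical, and real latency depends on far more than raw bandwidth.

```python
def stream_time_us(num_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time in microseconds to move num_bytes at the given bandwidth."""
    return num_bytes / bandwidth_bytes_per_s * 1e6


BRAINWAVE_BW = 20e12          # 20 TB/sec, the figure quoted by Microsoft
GPU_BW = BRAINWAVE_BW / 20    # implied by the "20 times" claim: about 1 TB/sec
WEIGHTS = 100e6               # hypothetical 100 MB of model weights

print(stream_time_us(WEIGHTS, BRAINWAVE_BW))  # roughly 5 microseconds
print(stream_time_us(WEIGHTS, GPU_BW))        # roughly 100 microseconds
```

The ratio between the two results is simply the claimed factor of 20, whatever weight size is assumed.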
Future outlook and challenges
Microsoft has yet to publish benchmark results for the Brainwave chip. Microsoft Azure customers are nonetheless looking forward to its availability for faster and better computation. Even more promising, Microsoft says Brainwave works with Google’s TensorFlow as well as Microsoft’s own CNTK framework. Tech startups like Rigetti, Mythic and Waves are trying to create mainstream applications that employ AI and quantum computation techniques. Such applications would bring AI to the masses in the form of practical, AI-driven products for everyday consumers, and these companies have shown a keen interest in both the Microsoft and the Google AI chips. Brainwave should be well suited to companies like these, which want to use AI for everyday tasks; such applications are still few in number because of the limited computational capabilities of current chips. The challenges for all AI chips, including Brainwave, will still revolve around data handling, reliability of performance, and the memory capabilities of current hardware systems.