Intel announced its Gaudi2 hardware in May at its VISION conference. Gaudi2 is the company’s second-generation accelerator for training and inference, and it has now received updates to its open-source Linux kernel driver and userspace software stack. Intel claims that Gaudi2’s AI training performance is twice that of NVIDIA’s competing A100. The manufacturing process has also moved from the original 16nm to 7nm.
Gaudi2 also brings a 3x increase in Tensor Processor Cores (TPCs), upgrades the 32GB of HBM2 high-bandwidth memory to 96GB of HBM2e, doubles the SRAM cache to 48MB, and adds 24×100 GbE connections. The thermal design power (TDP), however, rises from 350W to 600W. For now, Gaudi2 support is built on top of the existing kernel driver that already handles the Gaudi and Goya accelerators.
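To put the figures above side by side, the short Python sketch below works out the generational ratios implied by the quoted numbers. The dictionary keys and variable names are purely illustrative, and the aggregate Ethernet figure is a simple port-count multiplication, not a measured throughput.

```python
# Figures as quoted in the article; key names are illustrative only.
GAUDI = {"hbm_gb": 32, "tdp_w": 350}
GAUDI2 = {"hbm_gb": 96, "tdp_w": 600, "sram_mb": 48,
          "ethernet_ports": 24, "port_gbps": 100}

# Memory capacity grows by a larger factor than the power budget.
hbm_ratio = GAUDI2["hbm_gb"] / GAUDI["hbm_gb"]
tdp_ratio = GAUDI2["tdp_w"] / GAUDI["tdp_w"]

# "Doubled to 48MB" implies the first generation had half that much SRAM.
gaudi_sram_mb = GAUDI2["sram_mb"] / 2

# Nominal aggregate Ethernet bandwidth: ports times per-port rate.
aggregate_gbps = GAUDI2["ethernet_ports"] * GAUDI2["port_gbps"]

print(f"HBM capacity: {hbm_ratio:.1f}x")
print(f"TDP: {tdp_ratio:.2f}x")
print(f"First-generation SRAM (implied): {gaudi_sram_mb:.0f} MB")
print(f"Nominal Ethernet bandwidth: {aggregate_gbps} Gb/s")
```

In other words, memory capacity triples while the power budget grows by roughly 1.7x, and the 24 ports add up to a nominal 2,400 Gb/s.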
Gaudi2 is very close to the first generation, and the overall structure is similar. Even so, about 158,000 lines of code must be added to the kernel to enable Gaudi2, most of them in header files. Additionally, Intel today released TPC_LLVM 1.1, a new version of the open-source compiler for Habana Labs accelerators.
Intel is also updating the open-source SynapseAI Core components for Gaudi2, alongside a new patch series carrying the Gaudi2 kernel driver code. Because the new hardware evolves from the same Gaudi architecture, the follow-up code review is expected to go fairly smoothly, and Gaudi2 support may land within the v5.20 kernel cycle.
Finally, at the VISION 2022 conference Intel also announced the Greco accelerator card, an upgrade from Goya that is optimized for efficiency in deep learning applications.