Nvidia GTC Announcements Confirm it’s a Connected, Multi-Chip World
Watching the evolution of the computing industry over the last few years has been a fascinating exercise. After decades of focusing almost exclusively on one type of chip—the CPU (Central Processing Unit)—and measuring enhancements through refinements to its internal architecture, there has been a dramatic shift to multiple chip types, particularly GPUs (Graphics Processing Units), with performance improvements being enabled by high-speed connectivity between components.
Never has this been made clearer than at Nvidia’s latest GTC, or GPU Technology Conference. During the event’s keynote, company CEO Jensen Huang unveiled a host of new advancements, including the latest GPU architecture (named Hopper after computing pioneer Grace Hopper) and numerous forms of high-speed chip-to-chip and device-to-device connectivity options. Collectively, the company used these key technology advancements to introduce everything from the enormous Eos Supercomputer down to the H100 CNX Converged Accelerator, a PCIe card designed for existing servers, with lots of other options in between.
Nvidia’s focus is being driven by the industry’s relentless pursuit of advancements in AI and Machine Learning. In fact, most of the company’s many chip, hardware, and software announcements from the show have a tie to these critical trends, whether it be HPC (High Performance Computing)-type supercomputing applications, autonomous driving systems, or embedded robotics applications.
Nvidia also used the Spring 2022 GTC to reinforce that it is much more than a chip company, offering important software updates for its existing tools and platforms, particularly the Omniverse 3D collaboration and simulation suite. To encourage wider use of the tool, Nvidia announced Omniverse Cloud, which lets anyone try Omniverse with nothing more than a browser.
For hyperscalers and large enterprises looking to deploy advanced AI applications, the company also debuted new or updated versions of several cloud-native application services. These include Merlin 1.0 for recommender systems, along with version 2.0 of both its Riva speech recognition and text-to-speech service and AI Enterprise, which targets a variety of data science and analytics applications. New to AI Enterprise 2.0 is support for virtualization and the ability to use containers across several platforms, including VMware and Red Hat. Taken as a whole, these offerings reflect the company’s continuing evolution as a software provider. It’s moving from a tools-focused approach to one that offers SaaS-style applications that can be deployed across all the major public clouds, as well as on on-premises server hardware from the likes of Dell Technologies, Hewlett Packard Enterprise, and Lenovo.
True to its roots, however, Nvidia made the new Hopper GPU architecture and the datacenter-focused H100 GPU the stars of its latest GTC. Boasting a whopping 80 billion transistors, the 4 nm process-based Nvidia H100 includes several important architectural advancements. First, to speed the performance of new Transformer-based AI models (such as the one driving the GPT-3 natural language engine), the H100 includes a Transformer engine that the company claims offers a 6x improvement versus the previous Ampere architecture. It also includes a new set of instructions called DPX that are designed to accelerate dynamic programming, a technique used in fields such as genomics and proteomics for workloads that previously ran on CPUs or FPGAs.
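For readers unfamiliar with dynamic programming, a minimal sketch may help show the kind of computation involved. The plain Python below scores a local sequence alignment in the style of the Smith-Waterman algorithm, a well-known genomics example of the technique. It runs on a CPU and does not use DPX or any Nvidia API. The function name and the scoring values (match, mismatch, gap) are illustrative assumptions, not figures from Nvidia.

```python
def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Smith-Waterman-style recurrence with a linear gap penalty.
    The scoring values are hypothetical defaults chosen for illustration.
    """
    # Only the previous row of the score table is needed at any time.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Each cell depends on three neighbours that are already solved.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            # A local alignment may restart anywhere, so scores floor at 0.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("GATTACA", "TTAC"))
```

Every cell is a small max-of-sums over neighbouring cells, repeated across the whole table. That inner step is the sort of operation that dynamic programming workloads perform in bulk, which is why hardware support for it can matter.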
For privacy-focused applications, the H100 is also the first GPU or accelerator to support confidential computing (previous implementations only worked with CPUs), allowing models and data to be encrypted and protected via a virtualized trusted execution environment. The architecture also allows for federated learning while in a confidential computing mode, meaning that multiple companies with private data sets can all train the same model by essentially passing it around among different secure environments. In addition, thanks to a second-generation implementation of multi-instance GPU, or MIG, a single physical GPU can be split into as many as seven isolated instances, each running its own workload, improving the efficiency of the chip in shared environments.
Finally, Hopper also supports the fourth-generation version of Nvidia’s NVLink, a major leap that offers a huge 9x increase in bandwidth versus previous technologies, supports connections to up to 256 GPUs, and enables use of NVLink Switch. The latter provides the ability to maintain high-speed connections not only within a single system, but to external systems as well. This, in turn, enabled a new range of DGX Pods and DGX SuperPods, Nvidia’s own branded supercomputer hardware, as well as the aforementioned Eos Supercomputer.
Speaking of NVLink and physical connectivity, the company also announced a new interconnect called Nvidia NVLink-C2C, which is designed for chip-to-chip and die-to-die connections at speeds of up to 900 GB/s between Nvidia components. On top of that, the company opened up the previously proprietary NVLink standard to work with other chip vendors, and notably announced it would also be supporting the newly unveiled UCIe standard (see “The Future of Semiconductors is UCIe” for more). This gives the company much more flexibility in how it can work with others to create heterogeneous parts, as others in the semiconductor industry have started to do as well.
Nvidia chose to leverage its own NVLink-C2C for a new Grace Superchip, which combines two of the company’s previously announced Arm-based CPUs, and revealed that the Grace Hopper Superchip, previewed last year, uses the same interconnect technology to provide a high-speed connection between its single Grace CPU and Hopper GPU. Both “superchips” are targeted at datacenter applications, but their architectures and underlying technologies provide a good sense of where we can likely expect to see PC and other mainstream applications headed. The NVLink-C2C standard, which supports industry connectivity standards such as Arm’s AMBA CHI protocol and CXL, can also be used to interconnect DPUs (data processing units) to help speed up critical data transfers within and across systems.
In addition to all these datacenter-focused announcements, Nvidia also launched updates and more real-world customers for its Drive Orin platform for assisted and autonomous driving, as well as its Jetson and Isaac Orin platforms for robotics.
All told, it was an impressive launch of numerous technologies, chips, systems, and platforms. What was clear is that the future of demanding AI applications, along with other difficult computing challenges, is going to require multiple different elements working in concert to complete a given task. As a result, increasing the diversity of chip types and the mechanisms for allowing them to communicate with one another is going to be as important as, if not more important than, advancements within individual categories. To put it more succinctly, we’re clearly headed into a connected, multi-chip world.
Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.