NVIDIA has just announced its new Blackwell platform at GPU Technology Conference (GTC) 2024. With major strides in GPU architecture, the new Blackwell GPU and Blackwell platform deliver a breakthrough in performance, enabling organizations to run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than the previous generation.
“For three decades we’ve pursued accelerated computing, with the goal of enabling transformative breakthroughs like deep learning and AI,” said Jensen Huang, founder and CEO of NVIDIA. “Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution. Working with the most dynamic companies in the world, we will realize the promise of AI for every industry.”
The new Blackwell architecture succeeds the NVIDIA Hopper architecture, launched two years ago.
6 Revolutionary Technologies in Blackwell
1. World’s Most Powerful Chip – The Blackwell GPU packs 208 billion transistors and is manufactured using a custom-built TSMC 4NP process. It features two reticle-limit GPU dies connected by a 10TB/s chip-to-chip link into a single, unified GPU.
2. 2nd Generation Transformer Engine – Blackwell supports double the compute and model sizes with new 4-bit floating point (FP4) AI inference capabilities. This is enabled by new micro-tensor scaling support and NVIDIA’s advanced dynamic-range management algorithms, integrated into the NVIDIA TensorRT-LLM and NeMo Megatron frameworks.
3. New 5th Generation NVLink – The latest iteration of NVLink delivers 1.8TB/s of bidirectional throughput per GPU, ensuring seamless high-speed connectivity among up to 576 GPUs.
4. Dedicated RAS Engine – A dedicated Reliability, Availability and Serviceability engine is now included on Blackwell-powered GPUs. The new architecture also enables AI-based preventive maintenance at the chip level to run diagnostics and forecast reliability issues. This improvement can maximize system uptime and improve resiliency for large-scale AI deployments.
5. Secure AI – With support for new native interface encryption protocols, the new GPUs can protect AI models and customer data without compromising performance. This is critical for privacy-sensitive industries like healthcare and financial services.
6. Decompression Engine – Blackwell GPUs now come with a dedicated decompression engine to accelerate database queries, delivering leading performance in data analytics and data science.
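To make the second feature above more concrete, the sketch below illustrates the general idea behind micro-tensor scaling: quantizing a tensor to 4-bit values with one scale factor per small block, so quantization error stays local to each block. This is an illustrative assumption, not NVIDIA’s actual Transformer Engine implementation; the block size, symmetric integer grid, and rounding scheme here are chosen for clarity only.

```python
# Illustrative sketch only: per-block ("micro-tensor") scaled 4-bit quantization.
# NOT NVIDIA's Transformer Engine algorithm; block size, the symmetric -7..7
# grid, and round-to-nearest are illustrative assumptions.
import numpy as np

def quantize_4bit_blockwise(x, block_size=32):
    """Quantize a 1-D float array to 4-bit integers with one scale per block."""
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    # One scale per small block keeps quantization error local to that block,
    # instead of one scale stretched across the whole tensor.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 7.0  # 4-bit range: -7..7
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(blocks / scales), -7, 7).astype(np.int8)
    return q, scales, len(x)

def dequantize(q, scales, n):
    # Rescale each block and drop the padding added during quantization.
    return (q * scales).reshape(-1)[:n]

x = np.random.default_rng(0).normal(size=100).astype(np.float32)
q, s, n = quantize_4bit_blockwise(x)
x_hat = dequantize(q, s, n)
print("max reconstruction error:", np.max(np.abs(x - x_hat)))
```

The per-block scales are why a 4-bit format can remain usable for inference: an outlier in one block inflates only that block’s scale, leaving the precision of every other block untouched.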
The NVIDIA GB200 Grace Blackwell Superchip
With the new Blackwell GPU architecture, NVIDIA has also released the new GB200 Grace Blackwell Superchip. The GB200 connects two NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900GB/s ultra-low-power NVLink chip-to-chip interconnect.
The GB200 Superchip will become the key component to many scalable Blackwell server solutions. For example, the NVIDIA GB200 NVL72 is a multi-node, liquid-cooled, rack-scale system for the most compute-intensive workloads. It combines 36 Grace Blackwell Superchips, which include 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVLink. Additionally, GB200 NVL72 includes NVIDIA BlueField®-3 data processing units to enable cloud network acceleration, composable storage, zero-trust security and GPU compute elasticity in hyperscale AI clouds. The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads, and reduces cost and energy consumption by up to 25x.
The platform acts as a single GPU with 1.4 exaflops of AI performance and 30TB of fast memory, and is a building block for the newest DGX SuperPOD.
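The rack-scale figures quoted above can be sanity-checked directly from the per-superchip numbers in this article. The short sketch below uses only figures stated here (two B200 GPUs and one Grace CPU per GB200, 36 superchips per NVL72 rack, 1.8TB/s of bidirectional NVLink throughput per GPU); any other quantity is deliberately left out rather than assumed.

```python
# Sanity-check the GB200 NVL72 configuration using only numbers from the article.
superchips = 36                 # GB200 Grace Blackwell Superchips per NVL72 rack
gpus_per_superchip = 2          # each GB200 pairs two B200 GPUs with one Grace CPU
cpus_per_superchip = 1

gpus = superchips * gpus_per_superchip
cpus = superchips * cpus_per_superchip
nvlink_bw_per_gpu_tbs = 1.8     # fifth-generation NVLink, bidirectional, per GPU

print(f"{gpus} GPUs, {cpus} CPUs")  # the 72 GPUs behind the "NVL72" name
```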
Support from a Global Network of Partners
AWS, Google Cloud, Microsoft Azure and Oracle Cloud will be among the first cloud service providers to offer Blackwell-powered instances to organizations.
GB200 will also be available on NVIDIA DGX™ Cloud, an AI platform co-engineered with leading cloud service providers that gives enterprise developers dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models.
Cisco, Dell, Hewlett Packard Enterprise, Lenovo and Supermicro are expected to deliver a wide range of servers based on Blackwell products, as are Aivres, ASRock Rack, ASUS, Eviden, Foxconn, GIGABYTE, Inventec, Pegatron, QCT, Wistron, Wiwynn and ZT Systems.