The Nvidia H200 is a high-performance data center GPU designed for AI, high-performance computing (HPC), and data analytics. Based on Nvidia's Hopper architecture, it stands out for its enhanced memory and performance capabilities compared to previous generations, making it particularly well-suited for large language models (LLMs), scientific workloads, and other memory-intensive tasks. It is available in two variants catering to different deployment needs: the SXM module, which mounts on Nvidia's HGX server boards, and the H200 NVL, a PCIe card that can be linked to other GPUs via NVLink bridges. The SXM version offers higher performance, while the NVL variant is designed for lower power consumption and air-cooled data centers.
Key features of the Nvidia H200 include 141 GB of HBM3e memory with 4.8 TB/s of bandwidth, a substantial step up from the 80 GB of HBM3 and 3.35 TB/s of its predecessor, the H100 (SXM). This allows for faster data processing, reduced latency, and improved energy efficiency. The H200 also delivers up to 1.6x better inference performance and significantly higher throughput than the H100 across various LLM configurations.

The price of an H200 GPU ranges from $30,000 to $40,000 to purchase outright. For cloud rentals, prices range from $3.72 to $10.60 per GPU hour. The Nvidia H200 became available to global system integrators and cloud service providers in the second quarter of 2024, and it is currently offered through major cloud providers and Nvidia solution partners.
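The memory and pricing figures above lend themselves to a quick back-of-envelope check. The sketch below uses only the numbers from the text (141 GB of HBM3e, a $30,000-$40,000 purchase price, $3.72-$10.60 per GPU-hour rental rates); the 20% of memory reserved for activations and KV cache, and the bytes-per-parameter values, are illustrative assumptions, not Nvidia figures.

```python
# Back-of-envelope H200 sizing, using figures cited in the text.
# The 20% activation/KV-cache overhead is an illustrative assumption.

HBM_GB = 141  # H200 HBM3e capacity, per the text

def max_params_billions(bytes_per_param: float, overhead: float = 0.20) -> float:
    """Largest model (billions of parameters) whose weights fit in HBM,
    reserving `overhead` of memory for activations and KV cache."""
    usable_bytes = HBM_GB * 1e9 * (1 - overhead)
    return usable_bytes / bytes_per_param / 1e9

def break_even_hours(purchase_price: float, hourly_rate: float) -> float:
    """Rental hours at which cumulative cloud cost matches buying the
    GPU outright (ignoring power, hosting, and depreciation)."""
    return purchase_price / hourly_rate

print(f"FP16 fit: ~{max_params_billions(2):.0f}B params")  # 2 bytes/param
print(f"FP8  fit: ~{max_params_billions(1):.0f}B params")  # 1 byte/param
print(f"Break-even: {break_even_hours(30000, 3.72):,.0f} h at $3.72/h")
print(f"Break-even: {break_even_hours(40000, 10.60):,.0f} h at $10.60/h")
```

At FP16 this suggests a single H200 can hold roughly a 56B-parameter model's weights with room for inference overhead, which is why the extra capacity over the H100 matters for LLM serving; the break-even figures omit real-world costs such as power, hosting, and depreciation.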