Trainium chips are a family of purpose-built AI accelerators designed by Amazon Web Services (AWS) for machine learning (ML) training and inference across generative AI workloads. Amazon began designing its own data-center chips after acquiring Annapurna Labs, an Israeli chip startup, in 2015. The goal is to offer scalable performance and cost efficiency while reducing reliance on Nvidia's GPUs; Trainium competes directly with Nvidia's and Google's offerings in the AI hardware space.
The AWS Trainium family spans multiple generations: Trainium1, Trainium2, and Trainium3. The latest generation, Trainium3, is built on TSMC's 3nm process and combines high compute density, high-bandwidth memory, and strong power efficiency. Key features of Trainium3 include 2.52 PFLOPs of FP8 compute per chip, 144 GB of HBM3e memory with 4.9 TB/s of memory bandwidth, and support for a range of precision formats. Trn3 UltraServers can include up to 144 Trainium3 chips, delivering 362 FP8 PFLOPs in total. Amazon is also developing Trainium4, which will support Nvidia's NVLink Fusion technology.
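The UltraServer total follows directly from the per-chip numbers above. A quick sanity check, using only the figures quoted in this article (the constant names below are illustrative, not AWS terminology):

```python
# Back-of-the-envelope aggregates for a 144-chip Trn3 UltraServer,
# computed from the per-chip specs quoted above.
CHIPS_PER_ULTRASERVER = 144
FP8_PFLOPS_PER_CHIP = 2.52    # peak FP8 compute per Trainium3 chip
HBM_GB_PER_CHIP = 144         # HBM3e capacity per chip
HBM_BW_TBS_PER_CHIP = 4.9     # memory bandwidth per chip, TB/s

total_pflops = CHIPS_PER_ULTRASERVER * FP8_PFLOPS_PER_CHIP
total_hbm_gb = CHIPS_PER_ULTRASERVER * HBM_GB_PER_CHIP
total_bw_tbs = CHIPS_PER_ULTRASERVER * HBM_BW_TBS_PER_CHIP

print(f"Aggregate FP8 compute:  {total_pflops:.2f} PFLOPs")  # ~362.88, matching the ~362 PFLOPs figure
print(f"Aggregate HBM3e:        {total_hbm_gb} GB")
print(f"Aggregate bandwidth:    {total_bw_tbs:.1f} TB/s")
```

At 144 chips, the per-chip peak of 2.52 PFLOPs multiplies out to roughly 362.9 PFLOPs, consistent with the 362 FP8 PFLOPs total cited for the UltraServer.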
Amazon has customers waiting for every chip it builds. Customers including Anthropic, Karakuri, Metagenomi, NetoAI, Ricoh, and Splash Music have cut training costs by as much as 50% compared to using GPUs. Decart, an AI lab, is leveraging Trainium3 for real-time generative video, achieving 4x faster frame generation at half the cost of GPUs. Amazon's managed service for foundation models, Amazon Bedrock, is already serving production workloads on Trainium3.
Trainium3 UltraServers are now generally available. While exact pricing varies, AWS has offered computing power comparable to Nvidia's H100 chips at roughly a quarter of the price. This pricing strategy aims to make AI more accessible to a broader range of customers.