Meet Ironwood: Google’s Most Scalable and Powerful TPU Yet

  • calendar_today August 17, 2025
  • Technology

Google has introduced its next-generation Tensor Processing Unit (TPU), code-named Ironwood, a substantial advance in its specialized AI hardware. Ironwood was developed to meet the growing demands of Google’s high-performance Gemini models for simulated reasoning, which Google calls “thinking,” and is positioned as a critical enabler of “agentic AI” during what the company describes as the “age of inference.”

The company stresses that its Gemini models depend fundamentally on this infrastructure, since custom AI hardware enables faster inference and larger context windows. Ironwood is Google’s most scalable and powerful TPU to date, designed to let AI act on a user’s behalf by independently gathering information and producing results, which is Google’s vision for agentic AI.

Ironwood offers a significant boost in throughput over previous generations. Google plans to deploy the chips in enormous liquid-cooled clusters of up to 9,216 units, with an improved Inter-Chip Interconnect (ICI) that lets the chips exchange data directly at high speed across the entire system.

Google’s powerful design extends beyond its internal applications: developers running demanding AI workloads in the cloud will be able to access Ironwood in two distinct configurations, a 256-chip server and a full 9,216-chip cluster.

At full scale, an Ironwood pod delivers 42.5 exaflops of inference compute. Google states that each Ironwood chip has a peak throughput of 4,614 TFLOPS, a substantial advance over earlier models. Memory also grows considerably: each chip carries 192GB, six times the capacity of the previous-generation Trillium TPU, while memory bandwidth rises to 7.2 TB/s, a 4.5x improvement.
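As a back-of-the-envelope sanity check on these figures (a sketch that simply multiplies the quoted per-chip peak by the chip count of each cloud configuration; real sustained throughput would be lower):

```python
# Quoted per-chip peak throughput (TFLOPS at FP8, per Google's figures).
PEAK_TFLOPS_PER_CHIP = 4_614

# The two cloud configurations mentioned: 256-chip server, 9,216-chip cluster.
for chips in (256, 9_216):
    exaflops = chips * PEAK_TFLOPS_PER_CHIP / 1_000_000  # 1 exaflop = 1e6 TFLOPS
    print(f"{chips:>5} chips -> {exaflops:.2f} exaflops peak")
```

The 9,216-chip case works out to about 42.5 exaflops, matching the headline pod figure, so the quoted numbers are at least internally consistent.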

Google quotes Ironwood’s performance at FP8 precision, which makes comparisons with other AI hardware difficult given differing benchmarking methods. Its claim that Ironwood pods outperform comparable supercomputer segments by 24 times warrants skepticism, since not all of those systems support FP8 operations natively. Google also omitted its TPU v6 (Trillium) hardware from its direct comparisons.

According to Google, Ironwood delivers double the performance per watt of the v6 generation. A company spokesperson clarified that Ironwood is the successor to the TPU v5p, while Trillium succeeded the less powerful TPU v5e. Trillium reached 918 TFLOPS at FP8 precision.
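For context, the quoted per-chip numbers imply roughly a fivefold raw-throughput jump from Trillium to Ironwood at FP8 (a rough ratio from the figures above; it says nothing about power draw, so it is distinct from the 2x performance-per-watt claim):

```python
# Per-chip peak FP8 throughput figures quoted in the article (TFLOPS).
IRONWOOD_FP8_TFLOPS = 4_614
TRILLIUM_FP8_TFLOPS = 918

ratio = IRONWOOD_FP8_TFLOPS / TRILLIUM_FP8_TFLOPS
print(f"{ratio:.1f}x per-chip FP8 throughput")  # ≈ 5.0x
```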

Despite the benchmarking caveats, Ironwood marks a major step in Google’s AI development landscape. It delivers significantly better speed and efficiency than earlier TPUs while building on the infrastructure that has already enabled rapid progress in large language models and simulated reasoning.

Google’s current market-leading Gemini 2.5 model still runs on earlier-generation TPUs. Ironwood’s improved inference speed and efficiency is expected to enable major advances in AI capabilities over the next year, ushering in the “age of inference” and the start of advanced agentic AI development.