Two new TPUs to power the next wave of AI training and inference at Google

Google has unveiled two new Tensor Processing Units (TPUs), TPU 8t and TPU 8i, designed for AI training and inference. The new chips aim to improve performance and reduce costs for large-scale AI workloads.
Google introduced two new custom silicon chips for artificial intelligence at Google Cloud Next 2026. The eighth-generation TPU 8t and TPU 8i are designed for training and inference, respectively.

TPU 8t is optimized for massive pretraining and embedding-heavy workloads, using a 3D torus network topology and SparseCore accelerators. It can network 9,600 chips in a single pod and doubles throughput while maintaining accuracy.

TPU 8i is designed for inference, employing high-bandwidth memory and a specialized network topology. It can host a larger key-value cache at inference time and accelerates the reduction and synchronization steps required during autoregressive decoding.

Both chips deliver roughly double the performance per watt of the previous generation, with TPU 8t offering up to a 2.7x performance-per-dollar improvement and TPU 8i targeting an 80% performance-per-dollar improvement.
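To illustrate why a larger key-value (KV) cache matters for inference hardware: during autoregressive decoding, each generated token appends one key and one value vector to a cache that every subsequent attention step reads. The sketch below is a minimal, hypothetical illustration (not Google code, and all names and sizes are assumptions) showing how the cache grows linearly with the number of decoded tokens.

```python
# Minimal sketch of a KV cache during autoregressive decoding.
# Illustrative only: real models project tokens through learned
# W_k / W_v matrices and use many heads and layers.
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention for a single query vector q
    # against all cached keys K and values V.
    scores = K @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

def decode(steps, d_model=8, seed=0):
    rng = np.random.default_rng(seed)
    K_cache, V_cache = [], []            # grows by one entry per token
    for _ in range(steps):
        x = rng.normal(size=d_model)     # stand-in for the token activation
        K_cache.append(x)                # real model: x @ W_k
        V_cache.append(x)                # real model: x @ W_v
        _ = attend(x, np.array(K_cache), np.array(V_cache))
    return len(K_cache)                  # cache length == tokens decoded

print(decode(16))  # cache holds 16 entries after 16 decode steps
```

Because the cache grows with sequence length (and with batch size in practice), its memory footprint quickly dominates inference, which is why inference-oriented chips pair large high-bandwidth memory with their compute units.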
This content was automatically generated and/or translated by AI. It may contain inaccuracies. Please refer to the original sources for verification.