Google Cloud Managed Lustre Hits 10 TB/s at Next '26, With New Dynamic Tier and KV-Cache Inference Boost

Overview

At Google Cloud Next 2026, Google and its storage partner DDN announced a sweeping set of upgrades to Google Cloud Managed Lustre, pushing single-instance throughput to 10 TB/s — a 10-fold increase over the prior year — while introducing a new low-cost Dynamic tier and demonstrating significant inference gains when the system is used as a shared KV-cache layer. The announcements, made on April 22, 2026, position cloud storage as a first-class performance variable in AI training and inference pipelines, not merely a passive repository.

What We Know

Throughput and Competitive Position

According to the Google Cloud Blog, Google Cloud Managed Lustre now delivers up to 10 TB/s of throughput for a single instance, powered by C4NX VMs and Hyperdisk Exapools. That figure is 4–20x higher than managed Lustre offerings from other hyperscalers for a single instance, and represents a 10x increase since last year’s baseline.

The service, built on DDN’s EXAScaler technology, also delivers checkpoint writes and restores 2.6x faster compared to other Google Cloud storage solutions, according to the Google Cloud Blog. For AI training teams where checkpoint frequency directly determines how much work is lost in a hardware fault, that improvement has material cost implications at scale.

Alex Bouzari, Co-founder and CEO of DataDirect Networks, framed the milestone as a competitive moment for the industry at large. “This is not just a product milestone – it’s a market-shaping moment,” Bouzari said, as quoted by StorageNewsletter. “We are delivering one of the fastest-growing, highest-performance managed Lustre services in the industry, purpose-built for the realities of modern AI at scale.”

Kirill Tropin, Group Product Manager at Google Cloud, attributed the gains to the depth of the DDN partnership. “This is what happens when deep infrastructure expertise meets cloud-scale innovation,” Tropin told StorageNewsletter. “Our partnership with DDN enables customers to run their most demanding AI workloads with the performance, scale, and simplicity they need – today and into the future.”

New Dynamic Tier

Alongside the throughput upgrade, Google introduced a new Dynamic tier priced at $0.06/GB-month, per the Google Cloud Blog. Unlike caching-based approaches, the Dynamic tier serves data from persistent disk, providing the low-latency performance required for intense AI workloads such as training runs and checkpointing. The offering is available as a single SKU with simple, predictable billing — a deliberate contrast to tiered pricing schemes that can generate unexpected costs under variable workloads.

One of the more technically novel announcements was the use of Managed Lustre as a shared KV-cache layer for AI inference. According to StorageNewsletter, this approach delivered a 75% improvement in total inference throughput and a greater than 40% reduction in mean time to first token versus host memory alone. SiliconANGLE reported the same figures, noting that joint customers have achieved 95% or more TPU utilization using this configuration.

The KV-cache use case moves Managed Lustre beyond its traditional role as a training-time filesystem and into the inference serving stack, where latency to first token is a primary quality metric for interactive AI applications.

Customer Results

Two customer examples were shared at the event. Motoi Kataoka, Senior Manager of AI & Data Analytics Platform at Sony Honda Mobility Inc., said the company scaled AI model training for its AFEELA Intelligent Drive system by 3x compared to other Google Cloud solutions after adopting Managed Lustre, as reported by StorageNewsletter.

Harmonic Inc. reported a 6x increase in GPU/TPU saturation after switching to Managed Lustre, according to SiliconANGLE.

Salesforce also described production use of the service. Lavnaya Karanam, Software Engineering PMTS at Salesforce, stated: “By integrating Managed Lustre we eliminated the typical onboarding bottlenecks, allowing us to hit the ground running with the inferencing workload. This high-throughput, low-latency storage keeps our B200 GPUs fully saturated, driving a substantial performance gain in LLM inference over the H200. For our customers, this translates directly into faster, more responsive AI agents that can handle complex reasoning at a fraction of the previous latency,” as quoted in the Google Cloud Blog.

Broader Storage Announcements at Next ‘26

Managed Lustre was the most prominent storage announcement, but not the only one. Google simultaneously brought its Cloud Storage Rapid Bucket to general availability, claiming 15 TB/s of bandwidth, 20 million requests per second, and sub-millisecond latency in a single zonal bucket — as well as 50% reduced GPU blocked time and 5x faster checkpoint restores versus traditional object storage, per the Google Cloud Blog.

The Rapid Cache feature, formerly known as Anywhere Cache, now delivers aggregate read throughput of 2.5 TB/s for existing buckets and a 2.2x faster checkpoint restore via a new ingest-on-write capability, again per the Google Cloud Blog.

Google also noted that 70% of its largest cloud customers now use Storage Intelligence, each managing over 50 billion objects, with new zero-configuration dashboards integrating security posture management.

Context: Storage as AI Infrastructure

Asad Khan, Senior Director of Google Storage, told SiliconANGLE that Google Cloud’s data footprint grew 4x in three years — a scale that makes storage throughput a meaningful constraint on how efficiently expensive accelerators can be used. The framing from both Google and DDN at Next ‘26 was consistent: storage has become a co-equal part of AI infrastructure investment, alongside GPUs and interconnects.

When Google Cloud Managed Lustre first reached general availability in July 2025, it offered up to 1 TB/s of read throughput and storage scaling from 18 TiB to 8 PiB+, with a 99.9% SLA and native integration with Compute Engine, GKE, Vertex AI, and Terraform, according to the DDN press release from that launch. The Next ‘26 announcements represent a tenfold increase in maximum single-instance throughput in under nine months.

What We Don’t Know

Google has not disclosed pricing for the higher-throughput Managed Lustre configurations beyond the $0.06/GB-month Dynamic tier figure. The announcement did not specify when the 10 TB/s performance tier will be generally available, or whether it requires dedicated capacity reservations. The KV-cache inference benchmark figures were disclosed as internal results rather than third-party verified tests, and the methodology for the TTFT and throughput comparisons was not published in detail.

It is also unclear whether the 4–20x competitive advantage claim accounts for comparable configurations at AWS, Azure, and other hyperscalers, or reflects a specific benchmark scenario optimized for Managed Lustre’s architecture.