AWS Rebuilds Amazon OpenSearch Serverless From the Ground Up for Agentic AI, Reaching GA With 20x Faster Scaling and Up to 60% Lower Cost

Overview

Amazon Web Services announced general availability of a fully rebuilt Amazon OpenSearch Serverless on May 28, 2026, redesigning the service’s core architecture to handle the bursty, unpredictable traffic that AI agents generate. The revamped platform separates compute from storage, so collections can scale from zero to thousands of requests per second and back. AWS claims autoscaling up to 20 times faster than the previous generation and cost savings of up to 60 percent compared with clusters provisioned for peak capacity.

What Changed

The original OpenSearch Serverless launched with an architecture that assumed relatively predictable search traffic. According to AWS’s launch blog, the new service is a “fully managed search and vector engine designed for customers building AI agents,” built around a decoupled compute-storage model that the previous generation lacked.

The key structural change is that OpenSearch Compute Units are now stateless, reading from and writing to a distributed shared storage layer rather than relying on tightly coupled local storage. This allows indexing and search to scale independently. When no requests arrive within the idle timeout window of 10 minutes, the service releases compute resources entirely. Cold starts from zero take approximately 10 seconds.
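The idle-timeout behavior described above can be illustrated with a toy model. This is a minimal sketch, not AWS's implementation: the class `ToyCollection` and its methods are hypothetical names, and the only figures taken from the article are the 10-minute idle window and the roughly 10-second cold start. It assumes compute is released exactly when the idle window elapses and that the cold start is paid in full by the first request afterward.

```python
from dataclasses import dataclass
from typing import Optional

IDLE_TIMEOUT_S = 10 * 60   # idle window cited in the article (10 minutes)
COLD_START_S = 10.0        # approximate cold start cited in the article


@dataclass
class ToyCollection:
    """Hypothetical model of a collection whose compute is released when idle."""

    last_request_at: Optional[float] = None  # None: compute has never been started

    def is_warm(self, now: float) -> bool:
        """Compute is still attached if a request arrived within the idle window."""
        if self.last_request_at is None:
            return False
        return (now - self.last_request_at) < IDLE_TIMEOUT_S

    def handle_request(self, now: float) -> float:
        """Record a request at time `now` (seconds) and return its cold-start penalty."""
        penalty = 0.0 if self.is_warm(now) else COLD_START_S
        self.last_request_at = now
        return penalty


if __name__ == "__main__":
    collection = ToyCollection()
    for t in (0, 300, 1000):  # request arrival times in seconds
        print(f"t={t}s cold-start penalty={collection.handle_request(t)}s")
```

In this sketch the requests at 0 and 1,000 seconds pay the cold-start penalty (the second follows a 700-second gap), while the request at 300 seconds finds compute still warm.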

Tia White, Director of OpenSearch at AWS, described the driver behind the rebuild to The Register: “Historically, search has not had to decouple [storage and compute], because the traffic was pretty predictable. Now with agentic workloads, even the most sophisticated technical teams need to use a serverless offering.”

Scale-to-Zero and Cold Start

The standout capability is true scale-to-zero billing. White told The Register: “Collections can shrink all the way to zero when nothing’s happening. We have mitigated the cold start problem, so they spin back up in seconds when traffic is needed as agents restart. It auto-scales 20 times faster than before.”

The AWS Big Data Blog explains that “fast provisioning” means new OCUs start serving requests in seconds. AWS’s What’s New page states that the service delivers “scale-to-zero and pay-per-usage pricing,” which drives the claimed cost reduction. Charges are based on the OpenSearch Compute Units consumed for indexing, search, and GPU-accelerated vector operations, with storage billed separately per GB-month.
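To make the billing structure concrete, the arithmetic can be sketched as OCU-hours times an hourly rate plus stored gigabytes times a monthly rate. Everything numeric below is a placeholder chosen for illustration: the article gives no rates, and the function names, the 730-hour month, and the example workload are assumptions. The result is not meant to reproduce AWS's "up to 60 percent" figure.

```python
HOURS_PER_MONTH = 730  # common approximation for one month; an assumption here


def monthly_cost(ocu_hours: float, storage_gb: float,
                 ocu_hour_rate: float, gb_month_rate: float) -> float:
    """Compute charge (OCU-hours x rate) plus storage charge (GB x rate)."""
    return ocu_hours * ocu_hour_rate + storage_gb * gb_month_rate


def savings_fraction(baseline_cost: float, actual_cost: float) -> float:
    """Fraction saved relative to the baseline, e.g. 0.25 means 25% cheaper."""
    return 1.0 - actual_cost / baseline_cost


if __name__ == "__main__":
    # Placeholder rates, NOT published AWS prices.
    OCU_HOUR_RATE = 0.25
    GB_MONTH_RATE = 0.02
    STORAGE_GB = 50

    # Baseline: 4 OCUs held at peak capacity for the whole month.
    always_on = monthly_cost(4 * HOURS_PER_MONTH, STORAGE_GB,
                             OCU_HOUR_RATE, GB_MONTH_RATE)
    # Bursty agent traffic: the same 4 OCUs, but active only 300 hours.
    bursty = monthly_cost(4 * 300, STORAGE_GB, OCU_HOUR_RATE, GB_MONTH_RATE)

    print(f"always-on: {always_on:.2f}  scale-to-zero: {bursty:.2f}")
    print(f"savings: {savings_fraction(always_on, bursty):.0%}")
```

The point of the sketch is that savings depend almost entirely on how many hours compute sits idle: storage is billed in both scenarios, so a workload that is busy most of the month would see little benefit.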

The platform also ships with GPU acceleration for vector index construction, which AWS enables automatically in the new architecture for HNSW vector indexing workloads.
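For readers unfamiliar with HNSW indexes, the sketch below builds the kind of index body the OpenSearch k-NN plugin accepts for a vector field. The helper `build_knn_index_body`, the field name `embedding`, the 768-dimension example, and the parameter values are illustrative assumptions. Whether the rebuilt serverless service accepts every one of these settings unchanged is also an assumption; the article says GPU acceleration is enabled automatically, so nothing here requests it.

```python
import json


def build_knn_index_body(dimension: int, field: str = "embedding") -> dict:
    """Return an index-creation body with one HNSW-backed knn_vector field.

    Hypothetical helper: names and parameter values are for illustration only.
    """
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN search on the index
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",        # graph-based approximate k-NN
                        "engine": "faiss",
                        "space_type": "l2",    # Euclidean distance
                        "parameters": {"m": 16, "ef_construction": 128},
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_knn_index_body(768), indent=2))
```

The body is plain JSON, so it could be sent with any HTTP client or OpenSearch SDK when creating an index in a vector collection.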

Search Modes and Integrations

The service supports two collection types at general availability, SEARCH and VECTORSEARCH, and unifies vector, lexical, hybrid, and agentic search within a single managed offering. Collections are organized into Collection Groups, which are required in the new architecture. A group lets multiple collections share compute while each collection keeps its own encryption isolation through a separate AWS KMS key.

At launch, the service ships with native integrations in the Vercel marketplace and in Kiro, AWS’s agentic coding IDE, enabling developers to provision production-ready search and vector backends in minutes without manual infrastructure management. According to AWS’s What’s New announcement, an OpenSearch Agent Skills repository also launches alongside the service, with integrations for platforms including Claude Code, Cursor, and Codex.

AWS PrivateLink is supported through standard VPC endpoints with automatic private DNS configuration for customers requiring private network access.

What We Don’t Know

The Register noted that the underlying storage layer powering the new serverless architecture is proprietary and not open source, which may be relevant for customers with open-source or portability requirements. AWS has not disclosed what fraction of existing OpenSearch Serverless customers will migrate to the new architecture automatically or whether existing classic collections can be migrated in place.

Elastic, the competitor AWS positioned this launch against, introduced its own serverless search with decoupled storage and compute in 2024 and improved performance further in January 2026. In the DB-Engines rankings, Elasticsearch ranks 11th and OpenSearch 31st, though AWS can draw on broader ecosystem integration. White told The Register that “agentic, production-allied workloads are only going to continue to proliferate and grow,” framing the GA launch as the beginning of a longer infrastructure buildout rather than a fixed endpoint.

The next-generation service is available now in all AWS commercial regions where Amazon OpenSearch Serverless was previously offered.