Infrastructure
22 articles RSS
AWS Rebuilds Amazon OpenSearch Serverless From the Ground Up for Agentic AI, Reaching GA With 20x Faster Scaling and Up to 60% Lower Cost
AWS launched the next-generation OpenSearch Serverless on May 28, decoupling compute from storage to hit true scale-to-zero and 20x faster autoscaling for bursty agentic workloads.
Google Cloud Managed Lustre Hits 10 TB/s at Next '26, With New Dynamic Tier and KV-Cache Inference Boost
Google and DDN unveiled a 10x throughput jump for Managed Lustre at Cloud Next 2026, adding a $0.06/GB-month Dynamic tier and showing 75% inference gains via KV-cache sharing.
Cloudflare Expands Workers AI to Host Trillion-Parameter Models With New Infire Engine and Unweight Compression
Cloudflare detailed the custom infrastructure behind its Workers AI expansion to host models like Kimi K2.5, including a Rust-based inference engine and a lossless weight-compression system.
Red Hat Summit 2026: Desktop Hits GA and AI 3.4 Adds Speculative Decoding, MCP, and Agent Management to OpenShift
Red Hat used its May 12 Summit keynote to take Red Hat Desktop with Podman Desktop and isolated AI agent sandboxing to general availability, and to preview Red Hat AI 3.4 with vLLM speculative decoding, Model-as-a-Service, MCP tooling, and an agent evaluation hub.
EdgeCore Secures $1.5 Billion to Build Two Fully Leased Hyperscale AI Data Centers in Northern Virginia
EdgeCore Digital Infrastructure closed $1.5 billion in senior secured term loans for AS01 and AS02 in Sterling, Virginia, two single-tenant facilities totaling 114 MW and 685,000 square feet, with AS01 going live in November 2026 and AS02 in July 2027.
Nvidia Takes a $2.1 Billion Warrant in IREN and Signs a $3.4 Billion Five-Year Cloud Contract, Anchoring a Joint 5 GW AI Buildout
Nvidia receives a five-year option on 30 million IREN shares at $70 each while paying IREN $3.4 billion over five years for Blackwell cloud capacity at Childress, Texas.
Anthropic Leases the Full Capacity of SpaceX's Colossus 1 Data Center, Adding 300 MW and Over 220,000 Nvidia GPUs to the Claude Backend
The May 6 deal hands Anthropic all of xAI's Memphis supercomputer and unlocks immediate rate-limit increases for Claude Code, Claude Pro, Max, and the Opus API.
Thinking Machines Lab Signs Multibillion-Dollar Google Cloud Deal for GB300 Compute, Adding a Second Hyperscaler to Its Stack
Mira Murati's startup expands its Google Cloud relationship with a non-exclusive multibillion-dollar deal centered on GB300 systems and reinforcement-learning workloads behind its Tinker fine-tuning API.
Meta Breaks Ground on $1 Billion AI Data Center in Tulsa as City Moratorium Exempts 'Project Anthem'
Meta's first Oklahoma data center — a 2-million-square-foot AI facility in East Tulsa — was grandfathered past a city moratorium that paused similar projects, drawing local protests.
MLPerf Inference v6.0 Delivers Its Most Ambitious Benchmark Suite Yet as NVIDIA Triples DeepSeek-R1 Throughput in Six Months
MLCommons' April 2026 inference benchmark round adds text-to-video and vision-language tests while NVIDIA posts a 2.7x jump on DeepSeek-R1 through software alone.
Equinix Ships Fabric Intelligence, an AI-Native Network Layer That Aims to Cut Enterprise AI Deployment From Weeks to Minutes
Equinix has launched Fabric Intelligence, an AI-native operational layer for its 4,400-customer interconnection platform that exposes network provisioning through natural language agents and MCP servers.
NVIDIA Launches Omniverse DSX Blueprint to Build Physically Accurate Digital Twins of Gigawatt-Scale AI Factories
NVIDIA released the Omniverse DSX Blueprint at GTC 2026, a framework for building physically accurate digital twins of AI factories that unifies power, cooling, and network simulation.