Cloudflare Retires 15 Years of NGINX, Ships Gen 13 Servers Built Around a 192-Core Turin and a Rust Proxy
Cloudflare's Gen 13 platform pairs AMD EPYC 9965 chips with FL2, a Rust rewrite of its NGINX-based proxy, doubling edge throughput while trading L3 cache for cores.
Overview
Cloudflare has begun deploying its thirteenth-generation edge servers to more than 330 cities, pairing AMD’s 192-core EPYC 9965 “Turin” processor with FL2, a complete Rust rewrite of the request-handling layer that has run on NGINX and LuaJIT for fifteen years. According to Cloudflare’s Gen 13 launch announcement on March 23, 2026, the new hardware doubles per-server throughput compared to Gen 12 while improving performance-per-watt by 50 percent — but only when paired with the new software stack. The same hardware running the legacy FL1 stack incurred a more than 50 percent latency penalty, a result the company described as unacceptable.
The transition reframes a tradeoff that has shaped CDN hardware for a decade: whether to optimize for large per-core L3 cache, which favors the cache-locality patterns of NGINX-derived stacks, or to maximize core count and accept smaller caches, which only pays off if the software can scale linearly across many threads. Cloudflare chose the second path, and the gamble depended on FL2 being ready in time.
What We Know
The hardware
The Gen 13 platform is a single-socket 2U server built around AMD’s EPYC 9965, a 500W part with 192 Zen 5c cores and 384 SMT threads. According to Cloudflare’s configuration deep dive, each box ships with 768 GB of DDR5-6400 memory across twelve 64 GB DIMMs (614 GB/s peak bandwidth), three 7.68 TB PCIe 5.0 NVMe drives for 24 TB of base storage, dual 100 GbE network ports via OCP 3.0, and a 1300W 80 PLUS Titanium power supply.
The network upgrade from dual 25 GbE to dual 100 GbE was driven by capacity, not preference. Cloudflare’s production telemetry showed that the 95th-percentile bandwidth per port on Gen 12 was already consistently above 50 percent of the available link, and doubling server throughput would have saturated the older NICs. The company sources from two vendors — Intel’s E830-CDA2 and the NVIDIA Mellanox ConnectX-6 Dx — to maintain supply-chain resilience, a lesson Cloudflare attributes to the pandemic-era component shortages.
The most consequential design choice sits inside the CPU. The EPYC 9965 carries just 2 MB of L3 cache per core, compared with 12 MB per core on the Gen 12 generation that used AMD’s 3D V-Cache parts. Cloudflare picked the 9965 anyway, citing the highest aggregate requests per second and the best performance-per-watt of the three Turin SKUs it evaluated, as documented in the Gen 13 configuration post.
The software
The cache reduction would have been ruinous on the legacy stack. Cloudflare’s Gen 13 launch post reports that running FL1 on the 9965 produced a 62 percent throughput gain but pushed latency up by more than 50 percent, because the NGINX/LuaJIT pipeline relied heavily on the larger L3 cache to maintain its access patterns. Switching to FL2 erased the latency penalty entirely and unlocked the full hardware: roughly 100 percent higher throughput, 50 percent better performance-per-watt, 60 percent more capacity per rack at constant power, and latency 70 percent lower than Gen 12’s baseline.
FL2 is built in Rust on Cloudflare’s Oxy framework, with a strict module system that forbids I/O inside modules, enforces a fixed phase order, and validates input/output contracts at compile time. Cloudflare’s earlier internet-upgrade post describes the project as a from-scratch replacement of a stack that began in 2010 as NGINX with PHP logic and evolved into NGINX with LuaJIT by 2013. The post reports that FL2 cut median request latency by 10 milliseconds, scored a 25 percent performance gain in third-party CDN benchmarks, and used less than half the CPU and memory of FL1. More than 100 engineers have implemented over 130 modules; customer traffic began flowing through FL2 in early 2025, and the company plans to turn off FL1 in early 2026.
Reporting on Cloudflare’s announcement, InfoQ summarized the architectural shift as a deliberate move to support workloads that scale with parallelism rather than cache, with the goal of “significantly higher request capacity and better performance-per-watt across Cloudflare’s global edge infrastructure.”
The benchmarks
The Gen 13 launch post lists the per-server gains starkly. Compared with Gen 12 on FL2, Gen 13 delivers two times the memory capacity, 1.5 times the storage, and four times the network bandwidth. Compared with the same Gen 13 hardware running FL1, FL2 produces 50 percent more requests per CPU percent and 100 percent higher throughput, while dropping latency to 70 percent below the Gen 12 baseline.
Context
Cloudflare’s move is the operational payoff of work that has been visible in the open for years. The company’s Pingora framework — a Rust-based proxy library that serves as one of the foundations FL2 builds on — was open-sourced in 2024 after Cloudflare reported it was already handling more than 40 million requests per second internally. The shift from NGINX to in-house Rust infrastructure has been incremental rather than a single rewrite, but Gen 13 is the first hardware generation explicitly designed around assumptions only the new stack can satisfy.
The broader cloud-infrastructure pattern is consistent: hyperscalers are coupling silicon choices to software they control end-to-end, because that is the only way to extract the full advertised performance from increasingly heterogeneous server CPUs. Where AWS, Google Cloud, and Microsoft Azure pursue this through custom Arm-based silicon and proprietary networking fabrics, Cloudflare is doing it on commodity AMD parts by rewriting the proxy that sits in front of every request.
What We Don’t Know
Cloudflare has not published independent third-party benchmarks of Gen 13 in production, only its own telemetry. The 25 percent CDN-benchmark gain cited in the internet-upgrade post refers to FL2 broadly rather than to Gen 13 specifically, and Cloudflare did not name the benchmark provider in the public announcement.
The FL1-to-FL2 migration is also still in progress at the time of the Gen 13 announcement: the internet-upgrade post lists “early 2026” as the target date for FL1 shutdown, but does not confirm that every customer workload has been moved. Edge cases that remain on the legacy stack will not see Gen 13’s full gains, and Cloudflare has not disclosed a precise FL2 traffic-share figure for the Gen 13 launch window.
Finally, the company has not specified how many Gen 13 units are deployed today versus how many remain to be shipped to its more-than-330-city footprint, nor has it quantified the capital or power envelope of the rollout.