Cloudflare Expands Workers AI to Host Trillion-Parameter Models With New Infire Engine and Unweight Compression
Cloudflare detailed the custom infrastructure behind its expansion of Workers AI to host models such as Kimi K2.5. That infrastructure includes Infire, a Rust-based inference engine, and Unweight, a lossless weight-compression system.