Alibaba Unveils Qwen3.6-Max-Preview, Topping Six Coding Benchmarks and Cementing a Pivot to Closed Weights
Alibaba's new flagship tops SWE-bench Pro, Terminal-Bench 2.0 and four other coding leaderboards, but ships as a proprietary hosted model rather than an open-weight release.
Overview
Alibaba unveiled Qwen3.6-Max-Preview on April 20, 2026, billing it as the most capable model the Qwen team has shipped and claiming the top rank on six major coding and agent benchmarks. The release, reported by Decrypt, arrives as a hosted, proprietary system with no open weights — a continuation of the strategic pivot that has seen Alibaba ship successive closed-source Qwen variants and shut down the free tier of its Qwen Code developer tool earlier in April.
The timing is notable: Alibaba’s flagship drop landed the same day Moonshot AI open-sourced its trillion-parameter Kimi K2.6, crystallising a split inside China’s AI industry between labs doubling down on open weights and those following Alibaba toward paid API access.
What We Know
Alibaba says Qwen3.6-Max-Preview achieved first place across six programming and agent evaluations: SWE-bench Pro, Terminal-Bench 2.0, SkillsBench, QwenClawBench, QwenWebBench, and SciCode, according to Decrypt. Compared with its predecessor Qwen3.6-Plus, the preview posts gains of 9.9 points on SkillsBench, 10.8 on SciCode, 5.0 on NL2Repo and 3.8 on Terminal-Bench 2.0, according to Datanorth’s write-up of Alibaba’s released numbers. Gains on general-knowledge and instruction-following tests include 2.3 points on SuperGPQA, 5.3 on QwenChineseBench and 2.8 on ToolcallFormatIFBench, per the same source.
On architecture, Datanorth reports that the model uses a mixture-of-experts design with 35 billion total parameters and roughly 3 billion active per inference, a sparsity pattern Alibaba says lowers serving costs relative to dense models of equivalent capability. Decrypt’s coverage does not confirm the parameter counts, describing only a 256,000-token context window, text-only input, and API compatibility with both the OpenAI and Anthropic specifications. The preview also introduces a preserve_thinking flag that carries chain-of-thought traces across multi-turn conversations, a feature Alibaba recommends for long-running agent workflows, according to Decrypt.
The distribution story is the more consequential shift. Where earlier Qwen generations, including the open-weight Qwen3.5 release in March, shipped with weights on Hugging Face under permissive licenses, Qwen3.6-Max-Preview is available only through Alibaba’s Qwen Studio chat interface and the Alibaba Cloud Model Studio API, according to BigGo Finance. Alibaba has not published weights, quantizations or technical papers, and pricing for the preview has not been disclosed, BigGo Finance reports.
Where It Ranks
Independent third-party numbers are thinner. On the Artificial Analysis Intelligence Index, BigGo Finance places Qwen3.6-Max-Preview second overall among evaluated models. Decrypt notes that Alibaba positions the release as the leading “domestic” Chinese model, citing its advantage over GLM-5.1 and MiniMax-M2.7. Alibaba has labelled the release a preview and stated that the model is “still in active development,” with further optimisation expected before a stable launch, according to Decrypt.
What We Don’t Know
Several gaps will matter to developers weighing adoption:
- Pricing. Alibaba has not published per-token rates for Qwen3.6-Max-Preview, and the predecessor Qwen3.6-Plus reportedly remains in a free preview of its own, Datanorth notes.
- Parameter counts. The 35B-total/3B-active figure comes from third-party coverage of Alibaba’s release materials; Alibaba’s own public communications reviewed in Decrypt’s account did not confirm the number.
- Benchmark methodology. Two of the six leaderboards Alibaba topped — QwenClawBench and QwenWebBench — are named after the company itself. The lineage and evaluation protocols of these benchmarks are not detailed in the launch coverage reviewed.
- Independent replication. Beyond the Artificial Analysis ranking, third-party SWE-bench Verified or Terminal-Bench 2.0 scores reproducing Alibaba’s claims were not available at launch, according to Datanorth.
Analysis
The Qwen3.6-Max-Preview release is less a single product story than a data point in a broader realignment of China’s open-source AI leadership. When Alibaba shipped Qwen3.5 in March, it framed open weights as a deliberate edge against proprietary Western frontier systems. Two months later, with three closed-source Qwen releases in under a month and the Qwen Code free tier shut down, the company is signalling that the revenue opportunity inside agentic coding now outweighs the ecosystem benefits of open distribution — at least at the top of the product stack.
That pivot leaves a gap other Chinese labs appear willing to fill. Moonshot AI’s Kimi K2.6 and Z.ai’s GLM-5.1 have continued shipping open weights, and smaller Qwen variants still appear on Hugging Face. Whether developers respond by migrating workloads to those alternatives, or by paying for Alibaba’s top-tier API, is the question Qwen3.6-Max-Preview’s commercial launch will have to answer.