Z.ai launches GLM-5.2 with 1M-token context and MIT license

Z.ai has introduced GLM-5.2, a new flagship open-weight model designed for long-horizon work such as coding agents, debugging, research, and large-scale software tasks. The company says the model is a major step up from GLM-5.1 and, for the first time in its line, supports a stable 1 million token context window.

The release is aimed at workflows that stretch over many steps and long sequences of tool use, where models need to stay consistent across sprawling codebases and extended task chains. Z.ai said that a long context window is not enough on its own, and that GLM-5.2 was trained to remain reliable in practical engineering scenarios rather than simply accept more text.

A central part of the update is stronger coding performance paired with adjustable reasoning effort. Users can select different effort levels to balance speed, cost, and capability, with a higher-effort mode reserved for harder tasks. Z.ai says this makes the model more flexible for agentic coding, where latency and token consumption matter as much as raw benchmark scores.

On the technical side, the company highlighted several architecture changes intended to make 1M-token inference more efficient. One of them, called IndexShare, reuses the same indexer across groups of transformer layers to reduce computation at long context lengths. Z.ai also said it improved its speculative decoding layer to raise acceptance rates, which can help lower the cost of generation.
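To make the efficiency argument concrete, here is a minimal back-of-the-envelope sketch in Python. It is not Z.ai's implementation. The first function assumes that "sharing" means an indexer result is computed once per group of layers and reused by the rest of the group, which the announcement does not spell out. The second uses the standard speculative-decoding estimate for expected tokens per verification pass, assuming every drafted token is accepted independently with the same probability. The function names, layer counts, and draft lengths are all hypothetical.

```python
import math


def indexer_calls(num_layers: int, group_size: int) -> int:
    """Indexer runs per forward pass if each group of `group_size`
    consecutive layers shares one indexer result (assumed reading
    of IndexShare, not a confirmed detail)."""
    if num_layers <= 0 or group_size <= 0:
        raise ValueError("num_layers and group_size must be positive")
    return math.ceil(num_layers / group_size)


def expected_tokens_per_step(acceptance_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes each of the `draft_len` drafted tokens is accepted
    independently with probability `acceptance_rate`, and that the
    target model always contributes one token of its own.
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if acceptance_rate == 1.0:
        # Every draft token is accepted, plus the target model's own token.
        return float(draft_len + 1)
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return (1 - acceptance_rate ** (draft_len + 1)) / (1 - acceptance_rate)
```

Under these assumptions, a hypothetical 60-layer model with groups of four would run the indexer 15 times per pass instead of 60. With a draft length of 3, raising the acceptance rate from 0.5 to 0.8 lifts the expected output from about 1.9 to about 3.0 tokens per verification pass, which is why a higher acceptance rate translates into cheaper generation.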

The company is positioning GLM-5.2 as a fully open release: the weights are published under an MIT license with no regional restrictions. The model is available through Z.ai, and the weights are also hosted on GitHub and Hugging Face.

Benchmark results shared by Z.ai suggest the model is the strongest open-weight system on several coding tasks. On Terminal-Bench 2.1, GLM-5.2 scored 81.0, up 17.5 points from 63.5 for GLM-5.1. On SWE-bench Pro, it reached 62.1, a 3.7-point gain over its predecessor's 58.4. Z.ai also said the model narrowed the gap to top closed models, though it still trails some of the best proprietary systems on certain measures.
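For readers who want to compare the two gains on the same footing, the short Python sketch below computes the absolute and relative improvement from the scores Z.ai reported. The scores come from the announcement; the `gain` helper and the choice to round to one decimal place are illustrative only.

```python
# Scores reported by Z.ai: (GLM-5.1, GLM-5.2)
scores = {
    "Terminal-Bench 2.1": (63.5, 81.0),
    "SWE-bench Pro": (58.4, 62.1),
}


def gain(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent),
    each rounded to one decimal place."""
    absolute = round(new - old, 1)
    relative = round(100 * (new - old) / old, 1)
    return absolute, relative


for name, (old, new) in scores.items():
    absolute, relative = gain(old, new)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

The Terminal-Bench 2.1 jump is much larger on both measures, so the headline improvement is concentrated in terminal-based agent tasks rather than spread evenly across the two benchmarks.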

The company focused heavily on long-horizon coding benchmarks. On FrontierSWE, which measures performance on open-ended technical projects that can take hours or longer, Z.ai said GLM-5.2 was close to Claude Opus 4.8 and slightly ahead of GPT-5.5. It also reported strong results on PostTrainBench, where models are judged on how well they can improve smaller models through post-training, and on SWE-Marathon, a benchmark for extended software engineering work.

Z.ai said the model’s training process included expanded long-context data for coding-agent scenarios, covering implementation, research, optimization, and debugging. It also described new safeguards for reinforcement learning and evaluation, including an anti-hacking system meant to detect behavior that exploits benchmark shortcuts rather than solving tasks legitimately.

The release reflects a broader push in the AI industry toward models that can handle sustained work over large contexts and longer agent runs. With GLM-5.2, Z.ai is betting that open-weight systems can compete more directly in that space while remaining widely accessible under permissive licensing.