MiniMax Launches M3 With 1M-Token Context, Promises Weights and Report Soon

MiniMax launches M3 with long-context and multimodal support

MiniMax has released M3, a new model aimed at coding agents and other long-context AI tasks, while saying that downloadable weights and a technical report will follow within 10 days. For now, users can access the model through MiniMax Code, token plans and an API, but the company has not yet published a checkpoint for local use.

The Shanghai-based company is positioning M3 as an open-weight model, though that claim remains incomplete until the promised files and documentation arrive. The launch materials available so far include a product page, API documentation and a blog post, but not the model weights themselves. MiniMax said the report and weights will be posted on Hugging Face and GitHub.

M3 is notable for its 1 million token context window, which MiniMax says comes with a guaranteed minimum of 512,000 tokens for API use. The company also says the model supports native multimodal input, including images and video, with text output. That combination places M3 in a category of systems designed to handle long codebases, screenshots, diagrams and extended tool histories in a single session.

What MiniMax says M3 can do

In its launch materials, MiniMax described M3 as its first open-weight model to combine frontier coding performance, multimodal input and very long context. The company highlighted benchmark results it says include 59.0% on SWE-Bench Pro, 66.0% on Terminal-Bench 2.1 and 74.2% on MCP Atlas.

MiniMax also published a company-run Hopper test that it says involved 24 hours of FP8 matrix multiplication, 147 benchmark submissions and 1,959 tool calls. According to the company, those runs increased hardware utilization from 7.6% to 71.3%.

The model’s long-context system is powered by what MiniMax calls MiniMax Sparse Attention, or MSA. The company says the architecture filters cached key-value blocks before performing full attention, helping reduce compute at very long context lengths. MiniMax claims the approach lowers per-token compute at the 1 million token scale to about one-twentieth of the prior generation model.

The API page also lists Anthropic-compatible and OpenAI-compatible endpoints, suggesting the model is intended to fit existing developer workflows. MiniMax says the service supports image and video inputs alongside text generation, which could make it useful for software agents and other applications that need to process multiple input types in one run.

Pricing and market reaction

MiniMax lists standard API pricing at $0.60 per million input tokens and $2.40 per million output tokens, with those rates applying up to 512,000 input tokens. The company also offers subscription plans starting at $20 per month, which it says provide access to about 1.7 billion M3 tokens.

The release landed alongside reports that MiniMax shares fell in Hong Kong after a disclosure related to a planned Shanghai STAR Market listing. MarketWatch reported a 16% decline, citing a listing-guidance agreement with Citic Securities and a filing with the Shanghai bureau of the China Securities Regulatory Commission.

At the same time, some details that would help outside analysts compare M3 with rival models remain unknown. The South China Morning Post reported that MiniMax did not disclose the model’s size or the training infrastructure used, limiting direct comparisons with systems whose parameter counts and compute budgets are public.

Developers can start testing M3 now through MiniMax’s API and product offerings. The next major checkpoint is the company’s promised release of weights and a technical report, which should clarify how the model was built and whether its open-weight status will be fully realized.