Cohere has released North Mini Code, an open-source model the company says is designed for agentic software development. The launch marks Cohere’s first model aimed specifically at developers and the first entry in a new family of models intended to support a more sovereign AI ecosystem.
North Mini Code uses a mixture-of-experts architecture and has 30 billion total parameters, of which 3 billion are active at a time. Cohere says that design is intended to keep the model efficient while still delivering strong performance on software engineering workloads. The company is positioning it as a smaller model that can run on less demanding hardware than many larger systems.
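To put the parameter counts in perspective, the short sketch below computes the share of the model that is active at a time. It uses only the two figures quoted above; the function name is illustrative and not taken from any Cohere tooling.

```python
# Figures quoted in the article.
TOTAL_PARAMS = 30_000_000_000   # total parameters
ACTIVE_PARAMS = 3_000_000_000   # parameters active at a time


def active_fraction(total: int, active: int) -> float:
    """Return the share of parameters that are active at a time."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share: {fraction:.0%}")  # Active share: 10%
```

In other words, roughly one tenth of the model's parameters take part in a given step. That is the efficiency argument behind the mixture-of-experts design.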
The model is being released under the Apache 2.0 license, making it freely available for developers and organizations to download, adapt, and deploy. Cohere said the release reflects its broader push to give users more control over where and how they run AI systems, including on-premises and local deployments.
According to the company, North Mini Code is optimized for code generation, agentic software engineering, and terminal-based tasks. Cohere says it is meant to handle workflows such as coordinating sub-agents, mapping system architecture, and conducting code reviews. The model also supports a 256,000-token context window, with a maximum generation length of 64,000 tokens.
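The two token limits can be combined into a rough prompt budget. The sketch below assumes that prompt and generated tokens share the same 256,000-token window, which the announcement does not state explicitly. The helper name is hypothetical.

```python
# Limits quoted in the article.
CONTEXT_WINDOW = 256_000   # tokens
MAX_GENERATION = 64_000    # tokens


def prompt_budget(requested_output: int) -> int:
    """Tokens left for the prompt after reserving room for the output.

    Assumption: prompt and output count against the same context window.
    Requests above the generation cap are clamped to the cap.
    """
    if requested_output < 0:
        raise ValueError("requested_output must be non-negative")
    reserved = min(requested_output, MAX_GENERATION)
    return CONTEXT_WINDOW - reserved


print(prompt_budget(8_000))    # 248000
print(prompt_budget(64_000))   # 192000
```

Under that assumption, a request that reserves the full generation length would still leave 192,000 tokens for source files, tool output, and instructions.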
Cohere said North Mini Code can be accessed in several ways. The weights are available on Hugging Face, and the model can also be used through Cohere’s API, its Model Vault managed inference service, OpenRouter, and OpenCode. The company noted a minimum hardware requirement of one H100 GPU when running at FP8 precision.
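A back-of-the-envelope estimate helps explain the single-GPU figure. The sketch below counts weight storage only, at one byte per parameter for FP8 and two bytes for BF16. The 80 GB capacity is an assumption about the H100 variant in use, and the estimate ignores the KV cache, activations, and runtime overhead, so actual requirements will be higher.

```python
TOTAL_PARAMS = 30_000_000_000              # total parameters, from the article
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}    # storage size of each format
H100_MEMORY_GB = 80                        # assumption: 80 GB H100 variant


def weight_memory_gb(params: int, precision: str) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    used = weight_memory_gb(TOTAL_PARAMS, precision)
    print(f"{precision}: {used:.0f} GB of weights, "
          f"{H100_MEMORY_GB - used:.0f} GB left on one card")
```

By this rough count the FP8 weights occupy about 30 GB, leaving headroom on one card for long-context caches. The same weights at BF16 would take about 60 GB.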
In benchmark testing described by Cohere, North Mini Code performed competitively against other open-source models of similar size on coding and software engineering tasks. The company said its results correspond to a 33.4 score on the Artificial Analysis Coding Index. It also claimed internal tests showed up to 2.8 times higher output throughput than Devstral Small 2 under the same hardware and concurrency conditions, along with a 30% advantage in inter-token latency. Cohere said time to first token was closer between the two models, with Devstral Small 2 maintaining a slight lead in the conditions it tested.
Cohere framed the launch as part of a broader effort to support developers who want more autonomy over their AI infrastructure. The company said the model is built for teams that want to avoid vendor lock-in and run their coding tools on their own terms. It also said community feedback will help shape future releases.
The company described North Mini Code as the first of a new generation of models, and said it plans to expand the lineup over time. For now, it is encouraging developers to try the model through the available open-source and hosted options and to provide feedback on its performance in real workflows.