Anthropic says its new model can uncover serious flaws in software

Anthropic has released a new general-purpose model called Claude Mythos Preview, and the company says its cybersecurity abilities are so strong that they could reshape how defenders and attackers compete. In a technical post, Anthropic said the model can identify and exploit vulnerabilities across major operating systems and browsers, sometimes with little human help.

The company framed the release as the start of a broader effort called Project Glasswing, which is aimed at helping security teams protect critical software before similar systems become widely available. Anthropic said it is initially limiting access to selected industry partners and open source developers.

According to the company, Mythos Preview is far more capable at security work than its earlier models. Anthropic said it tested the model over the past month and found it could discover and exploit zero-day vulnerabilities, including issues in open source code and flaws in closed-source software. The company described the results as a significant shift in what current frontier models can do in cybersecurity.

Reported gains over earlier models

Anthropic said the new model outperformed its own previous systems by a wide margin in internal evaluations. In one comparison involving Firefox-related bugs, the company said an earlier model was rarely able to turn discovered vulnerabilities into working exploits, while Mythos Preview succeeded many more times and also achieved register control in additional attempts.

The company also said its internal benchmarks showed a marked jump in impact severity. Using a testing setup based on around 1,000 open source repositories, Anthropic said older models generally produced lower-level crashes, but Mythos Preview generated far more results overall and reached the highest severity tier on ten fully patched targets.

Anthropic said these abilities were not the result of explicit training for offensive exploitation. Instead, the company said they emerged from broader improvements in coding skill, reasoning, and autonomy. It added that the same qualities that make the model better at finding bugs also make it more effective at turning those bugs into exploits.

A warning for the security industry

The company said the implications are mixed. In the long run, Anthropic expects advanced AI tools to help defenders more than attackers by speeding up patching and hardening work. But it warned that the transition period could be difficult if frontier models are released too broadly before the security ecosystem is ready.

Anthropic compared the situation to the rise of fuzzing tools, which were once viewed as potentially dangerous because they could help attackers find bugs faster. Over time, those tools became a core part of defensive security work. The company said it expects a similar outcome with powerful language models, but only after the industry adapts.

To evaluate Mythos Preview, Anthropic said it used an agentic workflow in isolated containers. The model was prompted to find vulnerabilities in specific files, test its own theories, and produce proof-of-concept reports when it believed it had found a real issue. A final model pass was then used to judge whether each report was meaningful.

Limited disclosure for now

Anthropic said it is following its coordinated disclosure process and is intentionally withholding most technical details because the majority of the vulnerabilities it found have not yet been patched. The company said fewer than 1% of the issues it has identified are fully resolved so far, which means its public examples represent only a small portion of the overall findings.

The company described the launch as an early step in preparing for a security landscape where AI systems can discover bugs and build exploits faster than many humans can respond. For now, Anthropic says the priority is to give defenders access first, while it works with partners to strengthen critical systems before the tools become broadly available.