Anthropic said its latest Claude model completed a set of shared robotics tasks far faster than earlier human teams, in a follow-up to an experiment the company first ran last year with an off-the-shelf robotic quadruped.
In the new test, Claude Opus 4.7 carried out the same kinds of warehouse-style objectives that Anthropic employees had tackled in the original Project Fetch experiment. The company said the model, working without human help for the measured tasks, was about 20 times faster than the fastest human team from the earlier round and more than 37 times faster than the slower of the two human teams on the tasks they both finished.
Anthropic framed the result as evidence that large language models are moving beyond helping humans use physical tools and are beginning to operate them directly in limited settings. The company also cautioned that the system is not close to solving robotics broadly.
Project Fetch asked non-robotics specialists at Anthropic to use a robodog to carry out a sequence of tasks. These included connecting to the robot’s video and lidar sensors, writing control code, monitoring the robot’s movement, detecting a beach ball, and eventually getting the robot to retrieve the ball.
For the second phase, Anthropic said researchers only handled the setup. They connected a laptop running Claude Code to the robot, entered the initial prompt, approved commands, and moved the model on to the next objective. The company said it ran three trials of Opus 4.7 using maximum effort settings.
Anthropic compared those results with the original human teams from August 2025. On every task that at least one human team had completed, Opus 4.7 finished at least 10 times faster, according to the company. For the four tasks completed by both human teams, the model averaged more than 18 times faster than one team and more than 37 times faster than the other.
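The "times faster" figures above are ratios of completion times. As a rough sketch of how such a comparison could be computed, the Python snippet below averages per-task speedups over the tasks that both human teams and the model finished. The task names and every time value are hypothetical placeholders, not Anthropic's data, and averaging per-task ratios (rather than comparing total times) is an assumption, since the company's exact method is not described here.

```python
from statistics import mean

# Hypothetical completion times in minutes. These are placeholders for
# illustration only, NOT Anthropic's measurements. None = task not finished.
human_times = {
    "team_a": {"connect_sensors": 30.0, "write_control": 60.0,
               "detect_ball": 90.0, "fetch_ball": None},
    "team_b": {"connect_sensors": 45.0, "write_control": 120.0,
               "detect_ball": None, "fetch_ball": None},
}
model_times = {"connect_sensors": 3.0, "write_control": 4.0,
               "detect_ball": 5.0, "fetch_ball": None}


def shared_tasks(teams, model):
    """Return tasks finished by the model and by every human team."""
    return [
        task
        for task, model_time in model.items()
        if model_time is not None
        and all(times.get(task) is not None for times in teams.values())
    ]


def mean_speedup(team, model, tasks):
    """Average of per-task ratios: human time divided by model time."""
    return mean(team[task] / model[task] for task in tasks)


tasks = shared_tasks(human_times, model_times)
speedups = {
    name: mean_speedup(times, model_times, tasks)
    for name, times in human_times.items()
}

for name, factor in speedups.items():
    print(f"{name}: model was {factor:.1f}x faster on {len(tasks)} shared tasks")
```

Restricting the average to tasks both teams completed keeps the two speedup figures comparable, which matches how the article's two team-level numbers are framed.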
The company also said the newer model generated far less code than the people in the original experiment while still succeeding on the same tasks. It described the model’s coding as efficient and often effective on the first attempt.
Anthropic said Opus 4.7 was not flawless. In one instance, the model chose an older object-detection method, though the company said the model worked around the misstep and still reached a workable solution.
The biggest weakness remained the final ball-fetching task. Anthropic said Claude could move the robot behind the ball and position it to push the ball back toward the starting area, but it struggled to do so with the precision and closed-loop control that humans managed after practice. The company noted that a researcher with more robotics experience was able to program autonomous fetching successfully.
Even so, Anthropic said the current model appears reliable on the tasks it can handle, and that additional scaffolding and time would likely help it complete the more difficult parts as well.
The company said the progress was not the result of a robotics-specific training push, but rather of broader model scaling. It argued that the pattern now seen in robotics resembles what happened in software work, where models first assisted humans and later took on more of the task themselves.
Anthropic suggested that the field may be entering an early stage of physical agentic AI, in which models can use existing hardware tools with growing ease. At the same time, it said more research is needed before drawing conclusions about how far that ability can extend, especially if models begin designing or adapting robotic systems themselves.
For now, Anthropic’s findings point to a narrower but notable shift. The latest Claude model still has trouble with fine-motor-style control, but on several practical robot operations it appears to have moved well past the speed of the human teams that first tested the system.