# OpenAI and Anthropic Take Different Paths to Managing Long Contexts

A recent analysis of how leading AI coding tools behave suggests OpenAI and Anthropic are taking notably different routes to one of the hardest problems in frontier models: managing long, complex tasks without losing track of important details.

The post, written by developer and researcher Calvin French-Owen, compares OpenAI's GPT-5.5 and Anthropic's Claude-based tools after extended hands-on use. His main conclusion is that the companies appear to be optimizing for different ways of preserving context over many thousands of tokens, with OpenAI leaning toward a single-threaded approach and Anthropic leaning more heavily on parallel sub-agents.

## Two different strategies

French-Owen argues that long-horizon AI work is fundamentally a context management problem. Models have to gather information, perform reasoning, call tools, and then produce a result, often across large and changing threads of conversation.

In his view, OpenAI's approach, especially in Codex, relies on compaction. That means the system reduces or summarizes older material when the context window gets crowded, while keeping the most relevant details available. He says OpenAI has built this into its server-side infrastructure, allowing the company to adjust the method without requiring changes from users. He also notes that this can help the system take advantage of better key-value caching on the backend.

He describes this style as resembling an "oracle." In this setup, one main thread carries most of the work, and coherence is preserved over time because the system keeps many of the important details inside that thread.
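As a rough illustration of the compaction idea, the sketch below folds older messages into a single summary once a thread exceeds a token budget, while keeping the most recent messages verbatim. It is not OpenAI's implementation, which the post does not detail. The names `Message`, `count_tokens`, `summarize`, and `compact` are hypothetical, word counts stand in for real tokens, and the truncating `summarize` stands in for a model-written summary.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def summarize(messages: list[Message]) -> Message:
    # Hypothetical stand-in for a model-generated summary:
    # keep only the first five words of each older message.
    parts = [" ".join(m.text.split()[:5]) for m in messages]
    return Message(role="summary", text=" | ".join(parts))


def compact(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Return history unchanged if it fits the budget; otherwise replace
    everything except the last `keep_recent` messages with one summary."""
    total = sum(count_tokens(m.text) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    cut = len(history) - keep_recent
    older, recent = history[:cut], history[cut:]
    return [summarize(older)] + recent
```

Everything stays in one thread here, which is the property the "oracle" framing emphasizes. The cost is whatever `summarize` drops, which is why overly aggressive compaction can still lose information.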

Anthropic, by contrast, appears to use a more distributed approach. French-Owen says Claude Code frequently creates sub-agents that handle smaller parts of a task in their own context windows. These agents can research, explore code, and then pass summaries back to the parent agent. He says that when working with Fable 5, this behavior became especially noticeable, with the system spawning many review-style agents.

He compares this structure to a firm, where different actors work on separate responsibilities and share only selected information with one another.
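The sub-agent pattern described above can be sketched in a similarly hedged way. This is not Claude Code's actual mechanism. The functions `run_subagent` and `run_parent` are hypothetical, a substring search stands in for real research, and a formatted string stands in for a model-written summary.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(task: str, documents: list[str]) -> str:
    # Each sub-agent builds a private context that the parent never sees.
    private_context = [doc for doc in documents if task.lower() in doc.lower()]
    # Hypothetical stand-in for a model-written summary: report a count and
    # the first match only, dropping the rest of the private context.
    first = private_context[0] if private_context else "nothing found"
    return f"{task}: {len(private_context)} match(es); e.g. {first}"


def run_parent(tasks: list[str], documents: list[str]) -> list[str]:
    # Sub-agents run in parallel; the parent receives summaries only,
    # in the same order as the tasks.
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        return list(pool.map(lambda t: run_subagent(t, documents), tasks))
```

The parent's context stays small because it holds only the returned strings. Anything a sub-agent found but left out of its summary is invisible to the parent.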

## Tradeoffs in speed, cost, and accuracy

The post argues that the two approaches create different tradeoffs.

French-Owen says Anthropic's approach may feel faster because more tokens are being generated in parallel. At the same time, he suspects it can be less token-efficient, since multiple sub-agents may duplicate work by searching similar material without fully coordinating.
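A back-of-the-envelope calculation shows how both effects can hold at once. The numbers are invented for illustration and do not come from the post, and the function names and the `duplicated_fraction` parameter are assumptions of this sketch.

```python
def single_thread(work_tokens: int, tokens_per_second: float) -> tuple[int, float]:
    # One thread generates every token sequentially.
    return work_tokens, work_tokens / tokens_per_second


def parallel_agents(work_tokens: int, agents: int, duplicated_fraction: float,
                    tokens_per_second: float) -> tuple[int, float]:
    # Each agent takes an equal share of the work, plus some duplicated effort
    # (assumption: duplication is a fixed fraction of each agent's share).
    share = work_tokens / agents
    per_agent = share * (1 + duplicated_fraction)
    total_tokens = per_agent * agents
    # Wall-clock time is set by one agent, since all of them run concurrently.
    return round(total_tokens), per_agent / tokens_per_second


# Illustrative numbers only: 12,000 tokens of work at 100 tokens per second.
print(single_thread(12_000, 100))            # total tokens, seconds
print(parallel_agents(12_000, 4, 0.5, 100))  # total tokens, seconds
```

Under these assumed numbers, the single thread spends 12,000 tokens over 120 seconds, while four agents with 50% duplicated work spend 18,000 tokens but finish in 45 seconds: faster to the user, more expensive in total.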

He also says the distributed method may increase the chance that the system omits relevant facts when sub-agents decide not to pass everything back to the parent context. That, he suggests, could help explain some reports that Claude models sometimes seem less coherent or misstate details despite having done the underlying research.

OpenAI's compaction-based method may preserve coherence better, he argues, because the system does less selective filtering of what to keep or discard. But he also acknowledges that compaction can still lose information if it is too aggressive.

## A likely convergence

French-Owen does not frame the approaches as mutually exclusive. Instead, he expects the two companies to move toward a hybrid future.

He predicts that Anthropic will improve its compaction methods, which he says are currently too lossy, while OpenAI is likely to train more aggressively for multi-agent workflows.

The broader takeaway is that frontier model performance may increasingly depend less on raw model size alone and more on how systems organize work across context windows, agents, and memory. For users, that may continue to shape how different AI products feel in practice, from coherence and speed to reliability and cost.