Anthropic's New 'Dreaming' Feature Lets AI Agents Learn From Their Own Mistakes

Anthropic unveiled a new capability at its Code with Claude conference in San Francisco on Tuesday, allowing AI agents to internally simulate failure scenarios and adjust their behavior accordingly—a development that blurs the line between narrow optimization and something more like self-directed learning.

By Monexus Staff Writer | North America | 4-minute read | 8 May 2026

Anthropic introduced a new capability called "dreaming" at its second annual Code with Claude developer conference in San Francisco on Tuesday, a system designed to allow AI agents to simulate failure scenarios internally and adjust their behavior before executing tasks in the real world. The feature, integrated into the Claude Managed Agents platform, represents the latest move in a broader industry race to develop AI systems capable of operating with less human oversight—and raises familiar questions about what happens when those systems learn to evaluate their own performance on their own terms.

The core mechanism works by running internal simulations of task execution, allowing the agent to model potential failure modes without external consequences. Anthropic frames this as a safety feature: by rehearsing failure scenarios, agents can learn to avoid them. Critics, however, have noted that the boundary between self-improvement and self-directed goal formation remains porous. The company's technical documentation describes dreaming as a form of "offline learning" that occurs between task executions—essentially, the agent's downtime becomes processing time, during which it revisits unsuccessful attempts and updates its decision heuristics.
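To make the "offline learning" idea concrete, here is a minimal Python sketch of an agent that logs unsuccessful attempts and revisits them between tasks. It is an illustration only: the class `RehearsingAgent`, its methods, and the simple penalty rule are all hypothetical names and choices made for this example, not Anthropic's implementation or any part of the Claude API.

```python
from collections import defaultdict


class RehearsingAgent:
    """Toy agent that replays logged failures between tasks (hypothetical)."""

    def __init__(self, actions, learning_rate=0.5):
        self.actions = list(actions)
        self.learning_rate = learning_rate
        # Preference score per (situation, action); higher means more preferred.
        self.preferences = defaultdict(float)
        self.failure_log = []

    def choose(self, situation):
        # Pick the highest-scoring action; ties resolve to the earliest listed.
        return max(self.actions, key=lambda a: self.preferences[(situation, a)])

    def record(self, situation, action, succeeded):
        # Only unsuccessful attempts are kept for later replay.
        if not succeeded:
            self.failure_log.append((situation, action))

    def dream(self):
        """Offline pass: penalize every logged failure, then clear the log."""
        replayed = len(self.failure_log)
        for situation, action in self.failure_log:
            self.preferences[(situation, action)] -= self.learning_rate
        self.failure_log.clear()
        return replayed


agent = RehearsingAgent(["retry", "escalate"])
first_choice = agent.choose("timeout")           # "retry" (tie, first listed)
agent.record("timeout", first_choice, succeeded=False)
replayed = agent.dream()                         # downtime becomes processing time
second_choice = agent.choose("timeout")          # "escalate"
```

In this toy version the agent changes its behavior only during `dream()`, never mid-task, which mirrors the article's description of updates happening between executions. How a production system would score or weight failures is not something the public description specifies.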

The conference, held at Fort Mason Center for Arts and Culture in San Francisco, drew approximately 3,000 registered developers according to event documentation. Anthropic Chief Product Officer Paul Christmann and other executives outlined the feature alongside broader platform improvements, including expanded context windows and tighter integration with enterprise workflow tools. The timing is deliberate: Anthropic has faced mounting pressure to demonstrate commercial viability for Claude, which has lost ground to OpenAI's GPT-series models in developer adoption metrics reported by major analytics platforms.

Agentic AI, the category encompassing systems that autonomously plan and execute multi-step tasks, has become the defining competitive frontier for major AI labs. OpenAI released Operator in January 2025 with similar self-deliberation features, while Google integrated comparable capabilities into its Gemini platform for enterprise customers. The commercial logic is straightforward: businesses want AI systems that can handle complex, multi-step workflows without human checkpoint reviews at each stage. Anthropic's dreaming feature is, in part, a bid to make Claude agents viable for those use cases by giving them something resembling independent judgment about their own performance.

What remains less clear is how exactly the system evaluates failure. Anthropic's documentation describes a reward-model mechanism that scores simulated outcomes, but the company has not published detailed technical specifications for how those scores translate into behavioral adjustments. The feature operates in a black box that Anthropic controls, raising questions for enterprise customers with compliance obligations or strict audit requirements. Several security researchers who have reviewed the feature's public description have noted that the absence of interpretability documentation makes it difficult to verify that internal simulations are actually producing the intended behavioral changes rather than rewarding surface-level optimization that flatters the metric without improving real-world performance.

The competitive dynamics matter beyond Anthropic's commercial interests. Anthropic's ownership structure, which includes significant backing from Google, positions it within the broader Alphabet ecosystem as a counterweight to OpenAI's dominance. Microsoft, through its Azure OpenAI partnership, occupies a different but overlapping competitive position. Amazon has invested in Anthropic alongside Google, creating a web of competing interests around the AI safety mission that Anthropic publicly foregrounds. The dreaming feature can be read as a product differentiator, but also as a demonstration that Anthropic is advancing toward its stated goal of building AI systems that can reason about their own limitations, a core priority for a company whose founding charter emphasizes safety over capability race outcomes.

Whether that safety framing holds up depends on questions Anthropic has not fully answered. The system can simulate failure—but who defines what counts as failure? The agent itself appears to make that judgment through its internal reward model, which raises the possibility of goal drift over extended deployment periods. Anthropic says human reviewers can audit simulation logs and override agent decisions, but the practical enforceability of that oversight depends on how much autonomy the system accumulates before those reviews occur. Enterprise customers operating in regulated industries—finance, healthcare, legal—may find the current documentation insufficient to satisfy compliance requirements, which typically demand interpretable decision trails that dreaming's black-box architecture may not provide.

The feature will initially roll out to a limited set of enterprise partners, with broader availability expected before the end of the second quarter. Anthropic has not disclosed pricing specifics for the updated Managed Agents tier, though the company indicated that the new capabilities would be priced at a premium above the standard platform rate. For developers building on the Claude API, the feature introduces new parameters for controlling simulation depth and failure-threshold sensitivity—options that Anthropic's documentation describes as giving developers "granular control" over how aggressively agents self-correct.

The broader significance is harder to pin down. Anthropic is not the only lab pursuing agent self-improvement, and the underlying technique—reinforcement learning through imagined outcomes rather than real-world trial and error—has been explored in academic literature for years. What has changed is the operational scale: Anthropic is shipping this capability to enterprise customers who will deploy it on business-critical workflows, making whatever gaps exist between claimed and actual performance suddenly consequential in ways that research papers are not. The conference floor in San Francisco was, by all accounts, energized by the announcement. The harder questions will surface later, in the deployments that follow.
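For readers unfamiliar with the academic idea of learning from imagined outcomes, the following Python sketch shows one well-known form of it, in the spirit of Dyna-style planning from the reinforcement learning literature. It is not Anthropic's method. The function name `plan_with_imagined_outcomes`, the four-entry `model`, and the reward values are assumptions invented for this example, and the code sweeps every remembered transition deterministically rather than sampling them at random as classic Dyna-Q does.

```python
def plan_with_imagined_outcomes(model, q, sweeps=50, alpha=0.5, gamma=0.9):
    """Update action values from modelled transitions instead of real ones."""
    actions = sorted({action for (_, action) in model})
    for _ in range(sweeps):
        for (state, action), (reward, next_state) in model.items():
            # Value of the best known follow-up; terminal states are worth 0.
            if next_state is None:
                best_next = 0.0
            else:
                best_next = max(q.get((next_state, a), 0.0) for a in actions)
            target = reward + gamma * best_next
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
    return q


# Hypothetical transitions the agent observed once each in the real world:
# (state, action) -> (reward, next_state); None marks the end of a task.
model = {
    ("start", "safe"): (0.0, "middle"),
    ("start", "risky"): (-1.0, None),    # the single real failure
    ("middle", "safe"): (1.0, None),
    ("middle", "risky"): (-1.0, None),
}

q = plan_with_imagined_outcomes(model, {})
best_first_move = max(["safe", "risky"], key=lambda a: q[("start", a)])
```

After the imagined sweeps, the value of the "safe" opening move approaches 0.9 and the "risky" one approaches -1.0, so the agent prefers "safe" without having to fail again in the real world. The quality of that preference depends entirely on how accurate the remembered model is, which is the same gap between claimed and actual performance the article raises.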

This article was structured around Anthropic's announcement at the Code with Claude conference. Monexus used VentureBeat's reporting as the primary source for conference details and feature descriptions. The publication did not independently verify technical specifications beyond what Anthropic's public documentation describes.