Google researchers have proposed a new way to help large language models keep learning without forgetting what they already know. In a paper posted to arXiv, the team describes a framework they call "Sleep," a training phase inspired by how humans consolidate memories during rest.
The work, titled "Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories," argues that current AI systems are strong at answering questions and using short-term context, but still struggle to absorb new information over time and preserve it in their long-term parameters. The authors say their approach is meant to address that gap by giving models a structured way to turn temporary knowledge into more durable skills.
The proposed sleep framework has two parts. The first, called memory consolidation, uses a process the researchers refer to as Knowledge Seeding. In this stage, information stored in a smaller version of the model is distilled into a larger network. The goal is to increase capacity while retaining the useful knowledge gathered during learning.
The second stage is called dreaming. Here, the model uses reinforcement learning to create its own synthetic training curriculum. That generated data is then used to rehearse new material and improve existing abilities without human supervision. The authors describe this as a self-improvement loop that allows the system to refine itself after the consolidation step.
The paper presents a generalized distillation method as part of the Knowledge Seeding process. According to the abstract, it combines on-policy distillation with reinforcement learning-based imitation learning. The researchers say this is intended to help transfer knowledge from one model state to another while preserving learned behavior.
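For readers who want a concrete picture, on-policy distillation is often described as scoring text the student model generated itself, then penalizing the student wherever its token probabilities diverge from the teacher's. The Python sketch below illustrates that general idea only. It is not taken from the paper: the function names, the choice of reverse KL divergence as the penalty, and the toy logits are all assumptions made for illustration, and the reinforcement learning-based imitation component the abstract mentions is not modeled here.

```python
import math


def log_softmax(logits):
    """Convert raw scores into log-probabilities in a numerically stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) over the vocabulary at one token position.

    Assumption: reverse KL is used as the per-token penalty. The paper's
    actual objective may differ.
    """
    log_s = log_softmax(student_logits)
    log_t = log_softmax(teacher_logits)
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(log_s, log_t))


def on_policy_distillation_loss(student_logits_per_token, teacher_logits_per_token):
    """Average the per-token penalty over a sequence the student generated.

    Both arguments are lists with one logit vector per token position.
    "On-policy" here means the positions come from the student's own
    sample, and the teacher only scores that sample.
    """
    if not student_logits_per_token:
        raise ValueError("expected at least one token position")
    if len(student_logits_per_token) != len(teacher_logits_per_token):
        raise ValueError("student and teacher must cover the same positions")
    penalties = [
        reverse_kl(s, t)
        for s, t in zip(student_logits_per_token, teacher_logits_per_token)
    ]
    return sum(penalties) / len(penalties)


# Hypothetical logits for a two-token sample over a three-word vocabulary.
student = [[2.0, 0.5, 0.1], [0.3, 1.5, 0.2]]
teacher = [[1.8, 0.7, 0.1], [0.1, 2.0, 0.4]]
loss = on_policy_distillation_loss(student, teacher)
```

The loss is zero when the student matches the teacher at every position and grows as the two distributions drift apart. In the setup the article describes, the teacher role would be played by the smaller model and the student by the larger network that receives its knowledge.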
The study is aimed at continual learning, a long-standing challenge in machine learning where models need to adapt to new tasks or information without losing earlier knowledge. The authors also test the idea on long-horizon tasks, knowledge incorporation, and few-shot generalization, and say the results support the value of the sleep stage.
While the paper uses human sleep as an analogy, its technical goal is practical. The researchers want to improve how models store, retain, and reuse information across training cycles. That matters for systems expected to operate over longer periods, where a model trained once and then frozen can become brittle or unable to incorporate fresh data effectively.
The paper was submitted to arXiv on June 2, 2026, under the machine learning and artificial intelligence categories. A note on the listing says a version of the work had been publicly available on OpenReview since September 2025.
As with many arXiv papers, the work is still a preprint and has not yet appeared in a peer-reviewed venue. Even so, it adds to a growing line of research that borrows ideas from biology to rethink how AI systems learn. In this case, the proposed answer is not just more data or bigger models, but a deliberate period of artificial rest designed to help memory stick.