How Does Memory Generalize? A New Theory
In a recent Nature Neuroscience article, scientists described how the brain generalizes memories during memory consolidation. They challenged the idea that generalization is an uncontrolled, indiscriminate transfer of memory from the hippocampus to the cortex, proposing instead that it depends on how predictable the environment is. On the new theory, a memory is consolidated only to the extent that consolidating it improves generalization.
“It’s impossible to know everything in the world, but cherishing my dream, I solved the problem brilliantly…” Do you remember these words from a popular science program? They perfectly describe what happens in our brains when we need to store tons of information from the environment, most of which we’ll never use. The brain’s ingenious solution is that no memory is encoded in isolation; instead, it is generalized together with similar memories into a shared pattern. For example, an animal remembers a safe path to a water source and then generalizes this knowledge to find other paths to water more easily in the future. Memory and generalization are interconnected processes that help us predict the future and guide our behavior.
How Does the Brain Decide What to Generalize?
But how does the brain know what to generalize and what not to? What explains the selectivity in memory consolidation? What happens if we generalize incorrect information? The current theory of systems consolidation states that consolidation is a slow process in which the hippocampus, which initially encodes information thanks to its highly plastic synapses, transfers information to the cortex (neocortex), whose synapses are less plastic.
Credit: Sun et al., 2023
Systems consolidation takes several days. During this time, the hippocampus “teaches” the neocortex new information by replaying it, gradually “handing over” the memory to the neocortex.
So where does generalization come in? In reality, it happens in parallel with consolidation, as the neocortex sorts memories by their similarity to others and generalizes information. In this view, generalization is considered an inherent part of consolidation and always occurs without regulation. But can the brain really generalize any information? What if the information contains noise? What if the environment changes too quickly, making generalization meaningless?
The Theory of Optimally Generalized Systems Consolidation
The authors proposed using artificial neural networks to address the problems of systems consolidation theory. They compared the neocortex to a student and the environment to a teacher. The teacher can teach the student directly (direct learning), in which case the student remembers exactly what the teacher says, relevant information and noise alike. To help the student filter out the relevant information for later memorization, an assistant is needed: a notebook. By writing down what the teacher says, the student can review and understand the material independently.
On a mathematical level, each element of the “teacher–notebook–student” system is a neural network with certain parameters. The student can review the material multiple times (the neocortex learns), which is reflected in changes to the network’s weights.
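As a minimal sketch of this idea (not the paper's actual model; the dimensions, noise level, and learning rate below are invented for illustration), the teacher–notebook–student loop can be written with a linear student trained by gradient descent on examples the notebook has stored:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_features = 20   # dimensionality of one "experience"
n_examples = 30   # how many experiences the notebook stores
noise_std = 0.3   # unpredictability of the environment

# Teacher (environment): a fixed linear rule corrupted by noise.
w_teacher = rng.normal(size=n_features)

# Notebook (hippocampus): rapidly stores a finite set of noisy
# (input, outcome) examples at encoding time.
X = rng.normal(size=(n_examples, n_features))
y = X @ w_teacher + noise_std * rng.normal(size=n_examples)

# Student (neocortex): starts naive and learns slowly, its weights
# changing a little with every replay of the stored examples.
w_student = np.zeros(n_features)
lr = 0.05

for _ in range(2000):                                 # repeated replay
    pred = X @ w_student
    w_student -= lr * X.T @ (pred - y) / n_examples   # squared-error step

# Generalization: distance from the teacher's true rule, which the
# student never observes directly.
gen_error = np.mean((w_student - w_teacher) ** 2)
print(f"generalization error: {gen_error:.4f}")
```

Each loop iteration plays the role of one round of hippocampal replay; consolidation corresponds to the gradual drift of `w_student` toward the teacher's rule.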
Left: the “teacher–notebook–student” model (environment, neocortex, and hippocampus). Center: encoding, where experience (black arrow) is stored in the hippocampus and then consolidated in the neocortex. Right: recall, involving both the hippocampus and neocortex. Credit: Sun et al., 2023
Unfortunately, the environment is not a perfect teacher: the information it provides to the brain is mixed with noise. Returning to the animal searching for water, relevant information means cues that reliably predict water, such as fog, swarms of insects, and morning dew; noise means irrelevant details, such as the brightness of the sunlight or the colors of passing birds.
It’s important to note that noise makes the environment less predictable, reducing the usefulness of generalization. The ratio of noise to relevant information depends on mechanisms of implicit and explicit attention, on how much information the environment (the teacher) provides to the student, and on how well the teacher matches the student (in mathematical terms: the student may be a linear neural network, while the teacher is nonlinear). The real world is full of noise and complexity, and the brain’s ability to interact with it is limited.
According to systems consolidation theory, memories should be integrated into a system of generalizations in order to predict the future. Unpredictable, variable details of the environment, on the other hand, should not be stored in memory. In other words, the “student” must separate the noise from the signal. Mathematically, this means minimizing the difference between the teacher’s outputs and the student’s predictions.
The authors suggest that learning begins when the “hippocampus–notebook” stores a certain number of examples from the environment and replays them as neural activity. The “student” gradually learns these examples with increasing accuracy. But every stored example carries the noise that was encoded along with it, and replay repeats that noise faithfully. Thus, the more noise in the environment, the higher the error and the worse the generalization.
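That encoded noise is replayed along with the signal can be illustrated with a small, self-contained simulation (invented dimensions and noise levels, not the paper's own): the noise is drawn once, at “encoding,” and every replay reuses it, so a noisier environment leaves the student further from the true rule:

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_examples = 20, 30

# The teacher's true rule, the stored inputs, and the noise drawn at
# encoding time. The noise is fixed: replay repeats it faithfully.
w_teacher = rng.normal(size=n_features)
X = rng.normal(size=(n_examples, n_features))
encoded_noise = rng.normal(size=n_examples)

def generalization_error(noise_std, epochs=2000, lr=0.05):
    """Train a linear student on replayed noisy examples and return
    its mean squared distance from the teacher's rule."""
    y = X @ w_teacher + noise_std * encoded_noise
    w = np.zeros(n_features)
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / n_examples
    return np.mean((w - w_teacher) ** 2)

for noise in (0.0, 1.0, 3.0):
    print(f"noise {noise:.1f} -> generalization error "
          f"{generalization_error(noise):.3f}")
```

Raising `noise_std` raises the student's final distance from the teacher, even though the student fits the stored examples equally well in every case.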
The Key Role of Environmental Predictability
At what point does the brain realize it should stop generalizing? When does further generalization start to distort the ideal model the brain has created for predicting the future? How does it know when there’s too much noise?
The standard systems consolidation theory postulates that generalization occurs naturally during consolidation. However, it does not consider that generalization can also harm memory: in effect, it assumes that transferring memories from hippocampus to cortex is always beneficial. But in a rapidly changing, unpredictable environment, too much consolidation degrades generalization.
The researchers proposed that systems consolidation stops when further consolidation would harm generalization. The brain, they argue, can detect this moment: the point at which generalization (prediction) error starts to increase. How?
- One strategy is a simple heuristic: how quickly learning proceeds at first correlates with how predictable the environment is, so elapsed time itself can serve as a criterion for stopping.
- Another strategy is to split the stored information into a training set and a validation set within the “hippocampus–notebook.” The first set trains the model (encoded by the neocortex); the second is used to check its predictions. Learning stops when the prediction error on the validation set starts to rise.
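The second strategy is essentially early stopping with a validation set, familiar from machine learning. A sketch under invented parameters (linear networks standing in for the paper's models): hold back some of the notebook's stored examples, track prediction error on them during consolidation, and stop where that error bottoms out:

```python
import numpy as np

rng = np.random.default_rng(2)
n_features = 20

# A noisy, hard-to-predict teacher with few stored examples, so that
# unlimited consolidation would end up fitting the stored noise.
w_teacher = rng.normal(size=n_features)
X = rng.normal(size=(30, n_features))
y = X @ w_teacher + 1.5 * rng.normal(size=30)

# The notebook's examples are split: most drive learning, the rest
# are held out to monitor how well the student generalizes.
X_train, y_train = X[:20], y[:20]
X_val, y_val = X[20:], y[20:]

w = np.zeros(n_features)
lr = 0.02
val_curve = []

for _ in range(2000):
    w -= lr * X_train.T @ (X_train @ w - y_train) / len(y_train)
    val_curve.append(np.mean((X_val @ w - y_val) ** 2))

# Consolidation should stop where held-out error is lowest.
best_epoch = int(np.argmin(val_curve))
print(f"stop consolidating around epoch {best_epoch}")
```

The held-out examples never drive learning; they only signal when further consolidation has started memorizing noise instead of improving prediction.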
To demonstrate their idea, the scientists simulated hippocampal damage in their neural models, cutting the “teacher–notebook–student” link and thereby halting systems consolidation. As expected, removing the “notebook” from the model always led to worse memory. The same effect occurred when the “teacher” was made less predictable.
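A lesion of this kind can be mimicked in a toy linear model (again with invented parameters, not the paper's exact simulation) by removing replay: the lesioned student sees each stored example only once, while the intact one benefits from many replays:

```python
import numpy as np

rng = np.random.default_rng(3)
n_features, n_examples = 20, 30

w_teacher = rng.normal(size=n_features)
X = rng.normal(size=(n_examples, n_features))
y = X @ w_teacher + 0.3 * rng.normal(size=n_examples)

def train_student(replays):
    """Slow cortical learning; `replays` is how many times the
    notebook can replay each stored example."""
    w = np.zeros(n_features)
    for _ in range(replays):
        for x_i, y_i in zip(X, y):
            w -= 0.01 * (x_i @ w - y_i) * x_i  # one small update per replay
    return w

intact = train_student(replays=200)  # working hippocampus
lesioned = train_student(replays=1)  # "notebook" lost after one pass

def memory_error(w):
    """How poorly the student recalls the stored examples."""
    return np.mean((X @ w - y) ** 2)

print(f"intact memory error:   {memory_error(intact):.3f}")
print(f"lesioned memory error: {memory_error(lesioned):.3f}")
```

Because cortical weight changes are small, a single exposure leaves the lesioned student far from the stored examples, mirroring the worse memory the authors observed without the “notebook.”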
Blue line: functioning hippocampus; light blue: damaged. Top: memory test simulations based on the classic systems consolidation theory (a) and the authors’ proposed idea (b). Bottom: generalization results for the classic (c) and proposed models (d). The black arrow shows improved information quality (less noise). Credit: Sun et al., 2023
It’s important to note the novelty of this work. Previous theories did not consider that systems consolidation could be harmful. The authors showed that unregulated consolidation can worsen memory and lead to poor neural network predictions if the initial data from the environment is limited or too noisy. All this leads to problems with generalization.
Memory error (blue) and generalization error (red) at different levels of environmental predictability (low—left, high—right) and different types of teachers (with lots of noise, too complex, or providing insufficient information for learning). Credit: Sun et al., 2023
Thus, a key advantage of the theory of optimally generalized systems consolidation is the idea that for successful consolidation and generalization, experience must be sufficient and predictable. The theory states that what matters for memory and generalization is not the detail, frequency, or vividness of experience, as many earlier studies claimed, but its predictability.