Figure 5. Correlation between memory performance and neural responses. a. ERPs contrasting the agent and observer conditions at the Fz electrode in participants who exhibited a strong memory benefit in the agent relative to the observer condition (active learners) and in participants who did not exhibit a strong memory advantage for the agent condition (indifferent learners). Below, topographical plots show the distribution of the effect in the N1 time window. b. The difference in N1 amplitude between the agent and observer conditions during the first learning stage is plotted on the y-axis, and the difference in %Correct between the agent and observer conditions during the memory task is plotted on the x-axis. A linear regression is fitted to the data (blue line). A dotted line indicates the median on the x-axis, based on which participants were sorted into the “active learners” and “indifferent learners” groups.
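The computation behind panel b can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names and the example values are hypothetical, the regression is a plain least-squares fit, and the treatment of participants who fall exactly on the median is an assumption.

```python
from statistics import mean, median


def difference_scores(agent, observer):
    """Per-participant agent-minus-observer difference."""
    return [a - o for a, o in zip(agent, observer)]


def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def median_split(memory_diff):
    """Label each participant relative to the median memory difference.

    Assumption: values equal to the median are labelled "indifferent".
    """
    cut = median(memory_diff)
    return ["active" if d > cut else "indifferent" for d in memory_diff]


# Hypothetical example values, for illustration only (not data from the study).
pct_correct_agent = [80.0, 75.0, 90.0, 70.0]
pct_correct_observer = [78.0, 74.0, 80.0, 71.0]
n1_agent = [-3.0, -3.5, -2.0, -4.0]      # microvolts, hypothetical
n1_observer = [-3.2, -3.6, -4.0, -3.9]   # microvolts, hypothetical

memory_diff = difference_scores(pct_correct_agent, pct_correct_observer)  # x-axis
n1_diff = difference_scores(n1_agent, n1_observer)                        # y-axis
slope, intercept = fit_line(memory_diff, n1_diff)
groups = median_split(memory_diff)
```

In this sketch, `memory_diff` corresponds to the x-axis of panel b, `n1_diff` to the y-axis, and `groups` to the two learner groups separated by the dotted median line.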

Discussion

The aim of this study was to investigate the neural mechanisms underlying the benefits of active control for associative learning while controlling for movement and predictability, two factors that in the existing literature are often confounded with the effects of agency during learning. Using a gaze-controlled interface in a motor-auditory associative memory task, we showed that control over stimuli alone – controlling for unspecific neuromodulatory effects of movement and for stimulus predictability – can lead to learning benefits on a behavioural level.
We found higher movement-sound association memory accuracy for associations studied with active oculomotor control of visual exploration than for associations studied passively. This active-learning advantage for memory occurred even though visuo-auditory information was matched between the agent and observer study conditions. However, some participants did not follow this pattern, exhibiting small or no differences between the agent and observer learning conditions. Participants could thus be distinguished by learner type: some exhibited the expected memory benefit for active learning, while others did not. Interestingly, this behavioural difference was correlated with the individual participant’s degree of N1 attenuation, a well-established marker of self-generation during sound perception. The stronger the sensory processing differences between self- and externally generated stimuli that a participant exhibited – represented by the attenuation of the N1 component – the more they benefited from control during learning, and the better their overall performance.
A large body of research shows that being in control of information during learning is beneficial for memory encoding. Beyond the well-known production effect (Brown & Palmer, 2012; MacDonald & MacLeod, 1998; MacLeod et al., 2010), an advantage for active, self-directed over passive learning methods is well established in educational contexts (Tomporowski et al., 2015) and has been observed in different modalities and domains of learning (Butler et al., 2011; Cohen, 1989; Gathercole & Conway, 1988; James et al., 2002; Kuhn et al., 2000; Schulze et al., 2012). Low-intensity exercise or simple motor activity (such as walking or finger tapping) produces mixed results in relation to memory performance, with some studies finding memory benefits (Schaefer et al., 2010; Schmidt-Kassow, Deusser, et al., 2013; Schmidt-Kassow et al., 2010, 2014; Schmidt-Kassow, Heinemann, et al., 2013) and others memory impairment (Lajoie et al., 1996; Li et al., 2001; Lindenberger et al., 2000; Yogev-Seligmann et al., 2008). In many of the studies on the benefits of production for memory, the effects of movement and the effects of being in control cannot be interpreted separately (Mama & Icht, 2016; Ozubko et al., 2012; Rummell et al., 2016). Some studies have tried to isolate the effect of agency from confounding factors and found benefits for learning and memory (Chi, 2009; Gureckis & Markant, 2012; Markant et al., 2016). In this study we tried to relate self-generation effects during sensory processing to memory benefits of active learning. Given the substantial evidence suggesting that self-generation effects are in part due to unspecific neuromodulation through motor activity, we asked whether the established self-generation effects would be reproducible in a paradigm that specifically isolates the effect of agency while controlling for movement. Additionally, we used eye movements for sound generation.
Eye movements do not trigger sounds in real life, so participants had to learn the associations between their movements and the different sounds from scratch. The fact that the production effect was reproduced in this set-up suggests that agency contributes significantly to the phenomenon, beyond the effects of coincidental proximity to a motor act. Specifically, we found that the attenuation of the N1 component could predict the strength of the active-learning memory benefit that an individual participant would experience.
The production effect is frequently explained with the distinctiveness account – the idea that retrieval of an event from memory is facilitated if the event is embedded in a network of associations rather than remembered in isolation (Hommel, 2005). An alternative explanation is that being in control is rewarding: motivation is higher, and the brain areas that process reward are activated more strongly (Leotti & Delgado, 2011), facilitating memory encoding. It has also been hypothesized that feeling in control over something makes it self-relevant, and self-relevant information might by default be remembered better (Kim & Johnson, 2012). In experiments comparing the memory encoding of stimuli that are either under the control of the participant or under the control of the experimenter, there is also an inherent information processing advantage when the participant is in control: self-directed learners can decide which information they see and when. They can select the information that has the biggest effect on reducing their uncertainty and optimise the flow of information according to their needs. This makes the learning experience more efficient (Gureckis & Markant, 2012; Markant & Gureckis, 2010; Schulz & Bonawitz, 2007). In this study, the correlation between the attenuation of the N1 component and the memory performance of individual subjects suggests that whatever differences in performance we find are at least partly due to perceptual differences during learning, rather than to confounding factors such as information efficiency.
It is not yet well understood how active production leads to improved memory performance on a neural level, and so far there are few established links between sensorimotor processing and memory gains. Our study contributes to this discussion by providing evidence for a link between the way we process a self-generated stimulus and the strength of its memory trace. By linking the differences in memory encoding found on a behavioural level to the differences in sensory processing during the learning phases of our experimental task, we were able to establish a connection between self-generation effects on ERP components and the production effect on memory. Memory performance was correlated with the degree of attenuation of the N1 component in self- versus externally generated sounds. We can draw two tentative conclusions from this: that there are individual differences in the strength of the self-generation effect on the N1 component, and that there is a link between the processing of self-generated sounds and their memory encoding.
Due to physical differences between eye movements in the agent and observer conditions, we were not able to interpret the effect of agency on acquisition sound ERPs directly. Nevertheless, we were able to study whether the effects of the other two factors, learning stage and congruency, were modulated by agency. Contrary to our expectations, the effects of learning progress and identity predictability (i.e. congruency with learned associations) on neural processing were not modulated by agency. Observing the change of ERP components over the course of the learning process, we found an attenuation of the P3a component in acquisition. We expected that faster learning through agency during acquisition might speed up this process, which would have led to a stronger attenuation earlier during learning. Studying test sounds, which we manipulated to be either congruent or incongruent with the learned movement-sound associations, we found a late positivity for congruent sounds. A movement-sound association strengthened by agency during learning should have been reflected in a stronger congruency effect overall. Neither of these effects was modulated by agency during learning.
The P3a component is an orienting response typically associated with novel stimuli (Polich, 2007). We found an attenuation of the P3a with learning. Why was the attenuation effect not enhanced, or established earlier in the learning process, by agency during acquisition? The behavioural results suggest that the effect of agency should be most visible in the early and intermediate stages of learning, while towards the end, both conditions become similar. We could speculate that we would have found an earlier attenuation in the agent condition, had we been able to perform a more fine-grained analysis. Our design allowed us to separate only early and late learning stages. An analysis using more levels for this factor – which in our case was not possible due to an insufficient number of trials – might have detected an effect during intermediate stages of learning.
The congruency effects that we found were not exactly what we had anticipated, but they were nevertheless conclusive. We had expected that sounds that were incongruent with the learned associations between movements and sounds would trigger some form of mismatch response, possibly an audio-visual mismatch negativity (avMMN), which has been observed in response to violations of cross-modal predictions, similar to what was found by Winkler and colleagues (Winkler et al., 2009). We hypothesized that the way in which associations have been learned (either passively or as motor-associations) would impact the strength of the prediction error elicited by violations of those learned associations. Specifically, we expected to observe differences in certain ERPs that had previously been linked to deviant or target processing, like the N2b and the P3a (Knolle et al., 2013). We expected that deviations from an association learned as linked to a motor act would trigger more efficient processing and yield stronger N2b and P3a responses. What we found instead was that test sounds that were congruent with learned associations between movements and sounds triggered a late positive component with a central distribution, which we could call P3. The P3 is often considered an index of context or internal model updating (Polich, 2007; Reed et al., 2022), and depending on the nature of the experimental task, it has also been observed as a response to target stimuli (Hillyard & Kutas, 1983; Nieuwenhuis et al., 2005; O’Connell et al., 2012; Twomey et al., 2016; Verleger et al., 2017). Because we found a P3 triggered by congruent sounds, integrating this finding into existing theories means considering it a marker of model updating based on a positive match – participants see an animation, predict the upcoming sound, and when the prediction is matched, the model is reinforced. Alternatively, we could think of this component as a late positive component (LPC).
This component has been hypothesized to be a correlate of working memory updating processes (Donchin, 1981; Donchin & Coles, 1988; Polich, 2007). It has been found in experiments where stimuli are task-relevant or response-dependent (Pritchard, 1981; Snyder & Hillyard, 1976). In one experiment, it was elicited when participants had to detect and respond to deviant stimuli, but not when they were instructed to ignore deviants (Maidhof et al., 2010). The LPC may reflect participants detecting a stimulus they had been looking out for (Mathias et al., 2015). As in the present experiment, Mathias and colleagues found that the LPC was not modulated by active or passive acquisition mode, which they interpret as support for the idea that the LPC depends on the stimulus’ task relevance rather than the degree of deviation from a memory representation.

Conclusion

We found that active control during the learning of movement-sound associations using a gaze-controlled interface facilitates memory encoding. We found that the degree of attenuation of the N1 component for self-generated sounds correlated with the behavioural performance of each participant: the stronger the sensory processing differences during learning, the stronger the memory gain for active learning, and the better the overall performance on the memory task. This finding suggests that memory benefits of active learning are at least in part linked to perceptual differences during sensory processing, and that there may be a continuum of variation in the self-generation N1 attenuation effect across the population that allows us to assess different learner “profiles”. Although we did not find across-the-board modulation of neural responses by the factor of agency during learning, we found that neural responses were modulated by increasing stimulus predictability, and that during memory recall, matching association pairs triggered a target matching response.