Discussion
In this study we demonstrated the feasibility of differentiating electrocortical reactivity evoked during multimodal emotional and neutral video perception. A considerable literature has employed video stimuli to evoke emotional states while assessing peripheral physiological measures and conscious experience, and here we extend this area of study to cortical measures. This effort involved the curation of 45 brief video clips and the addition of a competing visual flicker to evoke a steady-state response that serves as a continuous index of uninstructed emotional engagement with each video clip. This novel paradigm produced a reliable reduction in ssVEP amplitude that was clearly correlated with the rated emotional intensity of both pleasant and unpleasant videos. Other perceptual video features that might be confounded with emotional intensity, and could modulate ssVEP amplitude independently of emotion, showed no significant relationship with ssVEP amplitude. These data demonstrate that narrative audiovisual stimuli can be employed to track dynamic emotional processing in the cortex, potentially enabling new research questions to be addressed in affective neuroscience. Realistic audiovisual clips have the potential to recruit multimodal brain networks involved in the dynamic perception of emotional situations and thus evoke emotional states that more closely represent real-life experience. While technically challenging, this move toward increased ecological validity could prompt qualitatively different brain states and expand our opportunities to understand how the human brain processes realistic emotional events.
Multiple brain networks are involved in the processing of any dynamic, multimodal stimulus, which continuously draws perceptual and integrative resources across visual and auditory modalities. The steep drop in ssVEP amplitude in the first second of video presentation (Figure 2) likely reflects the rapid shift in cortical entrainment from the initially dominant border flicker to the processing of the content of each video. The additional reduction in ssVEP amplitude during emotional, compared to neutral, videos appears to represent the enhanced activation of widespread sensory and association areas engaged by frontal cortical and (indirectly) subcortical structures involved in emotional discrimination and response processing (Bo et al., 2021; Frank et al., 2019; Liu et al., 2012; Pessoa, 2017; Sabatinelli & Frank, 2019). Prior ssVEP studies of scene processing suggest an enhanced contribution of superior parietal, middle temporal, and inferior frontal cortex during emotional compared to neutral scenes (Keil et al., 2009; 2012; Moratti et al., 2004) that may also support perceptual processing of motivationally relevant videos. The current study was intended to demonstrate the feasibility of the paradigm, and thus we focused on modulation of ssVEP amplitude over occipital sensors, where the visual flicker entrainment is strongest. Given the relatively sparse coverage of the 64-channel EEG net and the limited number of videos per category, a more detailed analysis was underpowered. In future work with a larger video set and greater EEG sensor density, source localization analyses might be applied to differentiate visual and auditory cortical contributions, as well as other cortical networks that are dynamically engaged during emotional video processing.
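For readers implementing a comparable analysis, the sketch below illustrates one conventional way to quantify occipital ssVEP amplitude: the spectral amplitude at the tagging frequency, averaged over an occipital channel cluster. The 500 Hz sampling rate, 15 Hz tag frequency, channel count, and function names are illustrative assumptions, not parameters or code from the present study.

```python
# Minimal sketch of an FFT-based ssVEP amplitude estimate: single-sided spectral
# amplitude at the flicker (tagging) frequency, averaged over occipital channels.
# All constants below are illustrative assumptions, not the study's parameters.
import numpy as np

FS = 500.0          # sampling rate (Hz), assumed
TAG_FREQ = 15.0     # border flicker frequency (Hz), assumed
N_OCC = 6           # size of the occipital channel cluster, assumed

def ssvep_amplitude(epoch, fs=FS, tag_freq=TAG_FREQ):
    """Amplitude at the tagging frequency for one channels-by-samples epoch."""
    n = epoch.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(epoch, axis=-1)) / n * 2.0  # single-sided amplitude
    bin_idx = np.argmin(np.abs(freqs - tag_freq))              # FFT bin nearest the tag
    return spectrum[..., bin_idx].mean()                       # mean over the cluster

# Toy example: a 10-s "video" epoch with a 15 Hz component buried in noise.
rng = np.random.default_rng(0)
t = np.arange(0, 10.0, 1.0 / FS)
epoch = 0.5 * np.sin(2 * np.pi * TAG_FREQ * t) + rng.normal(0, 1.0, (N_OCC, t.size))
print(f"ssVEP amplitude at {TAG_FREQ:.0f} Hz: {ssvep_amplitude(epoch):.3f} (a.u.)")
```

In practice, the per-video amplitudes obtained this way would be averaged within condition (or entered into the correlation with arousal ratings) exactly as any other single-trial amplitude measure.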
A recent and relevant study by Stegmann and colleagues (2022) employed classical conditioning, pairing shock with 32-s neutral videos that placed the viewer in the perspective of walking through empty office hallways, and used a flickering overlay (a black frame appeared in the video every 50 ms) to induce an ssVEP and thus track cortical engagement. Their data showed a decrease in ssVEP amplitude over occipital areas during CS+ acquisition, consistent with the effects of the current study, although our videos were inherently emotion-evoking rather than conditioned. The use of a flickering border may therefore be functionally equivalent to interspersed black frames, with emotional or conditioned content resulting in reduced ssVEP amplitude.
Compared to scenes, video stimuli are difficult to standardize, as each exemplar is a series of 240 images, and its unfolding narrative is unpredictable to the viewer. The 45 videos assembled for this study were selected to convey a reasonably consistent narrative, without radically unexpected changes. For example, a video from the perspective of a person walking on a city street did not transform into a mugging or a surprise reunion with an old friend. While narrative variability is a potential confound if uncontrolled, future studies might manipulate narrative predictability by including videos that shift from neutral to pleasant or unpleasant, to potentially reveal how emotional networks characterize changing events. The audio track could also be manipulated independently of the video clip to investigate the impact of consistent relative to conflicting information, such as ambiguous language and voice inflection. Subtle changes may have large effects on the interpretation of dynamic narratives, which might be assessed with shifts in ssVEP amplitude time-locked to video events.
Limitations. While encouraging, the use of video stimuli to evoke emotional states involves several limitations. Compared to scenes, considerable labor is involved in video collection, editing, and balancing. The use of a flickering border to elicit the ssVEP works against the intention to move the evocative stimulus farther toward realism, and depends on an indirect reduction in flicker entrainment as the index of emotional engagement. The lack of a single stimulus onset precludes the averaging of event-related potentials, and the assessment of cortical oscillations evoked by video clips with scalp-recorded EEG has not yet been demonstrated, perhaps due to the difficulty of capturing consistent emotional network activity across participants (Shen & Yi, 2019).
A common means of assessing emotional perception in the laboratory is the late positive potential (LPP; Ferrari et al., 2017; Hajcak et al., 2010; Schindler & Bublatzky, 2020; Schupp & Kirmse, 2021). It would therefore be helpful to compare the modulation of the LPP with the modulation of the ssVEP during video perception within a single sample. The individual pattern of modulation (e.g., a bias toward pleasant or unpleasant stimuli) and the overall effect size of emotion may or may not be consistent across measures, depending on the relationship between the conscious recognition and elaborative processing of a static scene reflected in the LPP and the ongoing deciphering and evaluation of videos.
Though arousal ratings clearly coincide with the degree of ssVEP reduction, the valence of the videos does not appear to differentially affect ssVEP amplitude, similar to the LPP (Frank & Sabatinelli, 2019; Codispoti et al., 2021). Future studies may exploit the 10-s duration to contrast pleasant and unpleasant videos, potentially revealing valence- (or reward-) related effects originating in anterior medial regions (Costa et al., 2010; Junghöfer et al., 2017; Sabatinelli et al., 2015) that become evident only during comparatively long-duration video stimuli.
In the current design, the video and its flickering border were presented simultaneously, thus intermixing the brain's responses and delaying the evidence of emotional modulation until entrainment had stabilized and the content of the video could be understood by the participant. Future research could separate these two events by initiating the flicker prior to video onset, ideally using an audiovisual stimulus that shares the basic sensory features of the upcoming video. This refinement of the paradigm may allow a more temporally resolved assessment of emotional reactivity (Bekhtereva et al., 2018).
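As a minimal sketch of the kind of time-resolved ssVEP measure such a refinement would enable, the example below applies a narrow band-pass around an assumed tagging frequency and extracts the Hilbert envelope over an occipital cluster; the sampling rate, 15 Hz tag, 2-s pre-video flicker baseline, and channel count are all illustrative assumptions rather than values from the present study.

```python
# Hedged sketch: moment-by-moment ssVEP amplitude via narrow band-pass filtering
# around an assumed 15 Hz tag, followed by the Hilbert envelope, averaged over
# occipital channels. All parameters below are illustrative, not study values.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

FS = 500.0        # sampling rate (Hz), assumed
TAG_FREQ = 15.0   # border flicker frequency (Hz), assumed

def ssvep_envelope(epoch, fs=FS, tag_freq=TAG_FREQ, half_bw=1.0):
    """Instantaneous ssVEP amplitude for a channels-by-samples epoch."""
    sos = butter(4, [tag_freq - half_bw, tag_freq + half_bw],
                 btype="bandpass", fs=fs, output="sos")
    narrow = sosfiltfilt(sos, epoch, axis=-1)     # isolate the tagging frequency
    env = np.abs(hilbert(narrow, axis=-1))        # instantaneous amplitude per channel
    return env.mean(axis=0)                       # average over the occipital cluster

# Toy epoch: a 2-s flicker-only baseline followed by "video" content that damps
# the tag, mimicking the competition for visual resources described above.
rng = np.random.default_rng(1)
t = np.arange(0, 12.0, 1.0 / FS)
gain = np.where(t < 2.0, 1.0, 0.6)                # amplitude drop after assumed video onset
epoch = gain * np.sin(2 * np.pi * TAG_FREQ * t) + rng.normal(0, 0.5, (6, t.size))
env = ssvep_envelope(epoch)
print(f"baseline amplitude: {env[t < 2.0].mean():.2f} a.u., "
      f"during video: {env[t > 3.0].mean():.2f} a.u.")
```

With a flicker-only baseline of this kind, emotional modulation could be expressed as a percentage change from pre-video entrainment, time-locked to video onset or to events within the clip.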
Conclusions. In summary, this study found a significant modulation of ssVEP amplitude during naturalistic multimodal videos, which correlated strongly with rated arousal. This finding suggests that narrative audiovisual stimuli can be used to track emotional processing in the cortex, potentially enabling new research questions that could facilitate a better understanding of how the human brain processes realistic emotional events. Future development and refinement can expand the utility of this approach to studying emotional processing in the laboratory.