Abstract
While previous research has investigated the effects of emotional videos on peripheral physiological measures and conscious experience, this study extends that work to cortical measures, specifically the steady-state visual evoked potential (ssVEP). A carefully curated set of 45 videos, designed to represent a wide range of emotional and neutral content, was presented with a flickering border. To enhance realism, the videos featured a continuous single-shot perspective and a natural soundtrack, and excluded elements associated with professional films. The results demonstrate a consistent reduction in ssVEP amplitude during emotional videos, which strongly correlates with the rated emotional intensity of the clips. This suggests that narrative audiovisual stimuli can track dynamic emotional processing in the cortex, opening new avenues for research in affective neuroscience. The findings highlight the potential of realistic video stimuli to investigate how the human brain processes emotional events in a paradigm with increased ecological validity. Future studies can develop this paradigm further by expanding the video set, targeting specific cortical networks, and manipulating narrative predictability. Overall, this study establishes a foundation for investigating emotional perception with realistic video stimuli and may expand our understanding of real-world emotional processing in the human brain.
The substantial impact of smartphone videos on society is unquestionable. Watching a recording of a spontaneous event carries the viewer into a situation with a realism and context that are missing from photographs. Seeing events and hearing natural sounds as a narrative unfolds captures our attention and can evoke powerful, long-lasting emotional states (Burns et al., 2008; Holman et al., 2014). Some studies have added emotional audio tracks (Brown & Cavanaugh, 2016; Gerdes et al., 2013) or music (Spreckelmeyer et al., 2006) to static emotional scene presentations while collecting ERPs to scene onset, and report modulatory effects primarily on early-latency components. However, the complexity of multimodal video makes it a challenging stimulus to control experimentally, particularly for electrocortical measures.
Several peripheral psychophysiological studies have presented participants with studio film excerpts to evoke emotion. Emotional, as compared to neutral, videos lead to elevated skin conductance, differential heart rate modulation, and expressive facial muscle activity similar to that evoked by emotional and neutral scenes (Bos et al., 2013; Christie & Friedman, 2004; Codispoti et al., 2008; Kolodyazhniy et al., 2011; Koruth et al., 2015; Kreibig et al., 2007; Palomba et al., 2000). Affective startle modulation during video perception also shows effects consistent with emotional scene studies (Bradley, 2007; Kaviani et al., 1999; Koukounas & McCabe, 2001). In a series of studies using 27 content-matched emotional videos and scenes, videos were shown to enhance ratings of emotional intensity (Detenber et al., 1998; Simons et al., 1999; Simons et al., 2000; but see also Detenber & Reeves, 1996) and elevate skin conductance (Detenber et al., 1998; Simons et al., 1999; but see also Simons et al., 2000). These studies of peripheral and reflex physiology demonstrate the utility of emotional videos and suggest that video may offer a more potent medium for inducing emotional states in a research setting.
To our knowledge, only two studies have used video stimuli to induce emotional states while recording EEG, both of which assessed changes in alpha-band power. One reported no differential effects of fearful, sad, or neutral video content on frontal alpha power (Dennis & Solomon, 2010), and the other employed both videos and scenes (Simons et al., 2003) and reported reliable reductions in parietal alpha power during arousing, compared to neutral, videos. A possible reason for the scarcity of EEG studies of emotional video perception is simply methodological: there is no single stimulus event from which an event-related potential (ERP), the most common means of analyzing electrocortical activity, can be averaged.
This practical difficulty could be circumvented through the use of steady-state visual evoked potentials (ssVEPs) in combination with video presentation. Just as instructed spatial attention toward flickering stimuli enhances ssVEP amplitude at the attended flicker frequency (Hillyard et al., 1997), motivated attention enhances ssVEP amplitude during emotional relative to neutral flickering scene perception (Bekhtereva et al., 2015, 2021; Keil et al., 2003, 2005). Conversely, if nonflickering scenes are presented in combination with a competing flickering visual stimulus, ssVEP amplitude is reduced during emotional relative to neutral scene perception (Deweese et al., 2014; Müller et al., 2008; Wieser et al., 2016). Here we explore whether multimodal videos presented with a flickering border show a similar modulation, such that emotional videos would be associated with reduced ssVEP amplitude compared to neutral videos.
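Because the flickering border drives a narrowband cortical response, ssVEP amplitude can be estimated from the EEG spectrum at the known driving frequency. The sketch below is a minimal, hypothetical illustration of this step; the sampling rate, flicker frequency, and simulated data are assumptions for demonstration, not parameters of the present study.

```python
# Hypothetical sketch: estimating ssVEP amplitude at a known flicker
# frequency from a single EEG channel, using only NumPy's FFT.
import numpy as np

def ssvep_amplitude(eeg, sfreq, flicker_hz):
    """Spectral amplitude of `eeg` at `flicker_hz`.

    eeg        -- 1-D array, one channel over the stimulation epoch
    sfreq      -- sampling rate in Hz
    flicker_hz -- driving (flicker) frequency in Hz
    """
    n = len(eeg)
    # Hanning window reduces spectral leakage from epoch edges
    spectrum = np.fft.rfft(eeg * np.hanning(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / sfreq)
    idx = np.argmin(np.abs(freqs - flicker_hz))  # nearest FFT bin
    # Scale so an on-bin sinusoid of amplitude A returns ~A/2
    # (the Hanning window halves the coherent gain)
    return 2.0 * np.abs(spectrum[idx]) / n

# Simulated 10-s epoch at 500 Hz: a 12 Hz "driven" response in noise
sfreq, flicker = 500.0, 12.0
t = np.arange(0, 10, 1.0 / sfreq)
rng = np.random.default_rng(0)
signal = 2.0 * np.sin(2 * np.pi * flicker * t) + rng.normal(0, 1, t.size)
amp = ssvep_amplitude(signal, sfreq, flicker)
```

In practice, amplitudes like `amp` would be computed per trial and channel (here, for a windowed 2-unit sinusoid, `amp` comes out near 1.0), then compared across emotional and neutral video conditions.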
Several emotional video sets have been assembled and used to investigate emotional reactivity in self-report and peripheral physiology. However, these video sets are not optimal for EEG recording, for several reasons. Nearly all use long and variable clip durations (1-15 minutes; Gilman et al., 2017; Gross & Levenson, 1995; Hewig et al., 2005; Kreibig et al., 2007; Philippot, 1993; Schaefer et al., 2011). Many also lack audio, or include narration and music (Cowen & Keltner, 2017; Gilman et al., 2017; Philippot, 1993; Samson et al., 2016). In addition, all available video sets contain primarily content drawn from studio productions. These professional films often switch between multiple camera viewpoints, such that the viewer experiences the scene from more than one perspective, creating a 'suspension of disbelief' and limiting ecological validity (Holland, 2003; Lee, 2004; Prentice & Gerrig, 1999). Studio films also often feature well-known actors and special effects that lead the viewer to recognize that the action is artificial, which suppresses emotional reactivity (Gross & Thompson, 2007; Hajcak, MacNamara, & Olvet, 2010). Ideally, a video intended to evoke emotion would present a situation as a true event, as it actually happened in the real world. Because of the high prevalence of smartphone ownership, and the public posting of smartphone videos, there is now sufficient 'raw material' available to assemble a realistic video set for experimental use.
Compared to static scenes, multimodal video clips present several problems of experimental control when recording electroencephalography (EEG). In order to interpret potential differences in electrocortical reactivity across video contents, the video set should not differ systematically in basic sensory and perceptual features, while retaining factors that promote realism (Allison, Wilcox, & Kazimi, 2013; Lin & Peng, 2015). These factors include a continuous single shot, a landscape perspective that places the viewer in a ground-level position, an ambient soundtrack without music or narration, and videographic quality that does not suggest professional creation, to enhance the impression that the clip represents a genuine event. A relatively brief duration of 10 seconds was chosen: short enough to enable multiple exemplars from different emotional content categories, but long enough to engage and sustain an emotional state during which a cortical response could be recorded.
Based on past work with emotional scenes, we hypothesize that emotional video perception will reduce ssVEP amplitude toward the competing flickering border. As in past ssVEP and ERP studies of emotional perception, the impact of pleasant and unpleasant videos is expected to be equivalent, and strongly related to arousal ratings (Bekhtereva et al., 2015; Cuthbert et al., 2000; Frank & Sabatinelli, 2019; Keil et al., 2008). Alternatively, the dynamic nature of the video stimuli and the addition of an ambient audio track may engage continuous perceptual processing across all videos to the extent that any differential impact of emotional videos on ssVEP amplitude fails to rise above ssVEP variability.