Abstract
While previous research has investigated the effects of emotional videos
on peripheral physiological measures and conscious experience, this
study extends the research to include cortical measures, specifically
the steady-state visual evoked potential (ssVEP). A carefully curated
set of 45 videos, designed to represent a wide range of emotional and
neutral content, was presented with a flickering border. The videos
featured a continuous single-shot perspective, natural soundtrack, and
excluded elements associated with professional films, to enhance
realism. The results demonstrate a consistent reduction in ssVEP
amplitude during emotional videos, one that correlates strongly with the
rated emotional intensity of the clips. This suggests that narrative
audiovisual stimuli have the potential to track dynamic emotional
processing in the cortex, providing new avenues for research in
affective neuroscience. The findings highlight the potential of using
realistic video stimuli to investigate how the human brain processes
emotional events in a paradigm that increases ecological validity.
Future studies can further develop this paradigm by expanding the video
set, targeting specific cortical networks, and manipulating narrative
predictability. Overall, this study establishes a foundation for
investigating emotional perception using realistic video stimuli and has
the potential to expand our understanding of real-world emotional
processing in the human brain.
Introduction
The substantial impact of smartphone videos on society is
unquestionable. Watching a recording of a spontaneous event carries the
viewer into a situation with realism and context that is missing in
photographs. Seeing and hearing the natural sounds as a narrative
unfolds captures our attention and can evoke powerful, long-lasting
emotional states (Burns et al., 2008; Holman et al., 2014). Some studies
have added emotional audio tracks (Brown & Cavanaugh, 2016; Gerdes et
al., 2013) or music (Spreckelmeyer et al., 2006) to static emotional
scene presentations while collecting ERPs to scene onset, and report
modulatory effects primarily on early latency components. However, the
complexity of multimodal video makes it a challenging stimulus to
control experimentally, particularly for electrocortical measures.
Several peripheral psychophysiological studies have presented
participants with studio film excerpts to evoke emotion. Emotional as
compared to neutral videos lead to elevated skin conductance,
differential heart rate modulation, and expressive facial muscle
activity similar to that evoked by emotional and neutral scenes (Bos et
al., 2013; Codispoti et al., 2008; Christie & Friedman, 2004;
Kolodyazhniy et al., 2011; Koruth et al., 2015; Kreibig et al., 2007;
Palomba et al., 2000). Affective startle modulation during video
perception also shows effects consistent with emotional scene studies
(Bradley, 2007; Kaviani et al., 1999; Koukounas & McCabe, 2001). In a
series of studies using 27 content-matched emotional videos and scenes,
videos were shown to enhance ratings of emotional intensity (Detenber et
al., 1998; Simons et al., 1999; Simons et al., 2000; but see also
Detenber & Reeves, 1996) and elevate skin conductance (Detenber et al.,
1998; Simons et al., 1999; but see also Simons et al., 2000). These
studies of peripheral and reflex physiology demonstrate the utility of
emotional videos and suggest that the video may offer a more potent
medium to induce states of emotion in a research setting.
To our knowledge, only two studies have used video stimuli to induce
emotional states while recording EEG, both of which assessed changes in
alpha-band power. One reported no differential effects of fearful, sad,
or neutral video content on frontal alpha power (Dennis & Solomon,
2010), while the other employed both videos and scenes (Simons et al.,
2003) and reported reliable reductions in parietal alpha power during
arousing, compared to neutral videos. A possible reason for the scarcity
of EEG studies of emotional video perception is simply methodological,
in that there is no single stimulus event from which an event-related
potential (ERP) can be averaged, the most common means of analyzing
electrocortical activity.
This practical difficulty could be circumvented through the use of
steady-state visual evoked potential (ssVEP) in combination with video
presentation. Just as instructed spatial attention toward flickering
stimuli enhances ssVEP amplitude at the attended flicker frequency
(Hillyard et al., 1997), motivated attention enhances ssVEP amplitude
during emotional relative to neutral flickering scene perception
(Bekhtereva et al., 2015, 2021; Keil et al., 2003; 2005). Conversely, if
nonflickering scenes are presented in combination with a competing
flickering visual stimulus, ssVEP amplitude is reduced during emotional
relative to neutral scene perception (Deweese et al., 2014; Müller et
al., 2008; Wieser et al., 2016). Here we explore whether multimodal
videos presented with a flickering border might show a similar
modulation, such that emotional videos would be associated with reduced
ssVEP amplitude compared to neutral videos.
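The frequency-tagging logic behind this design can be illustrated with a minimal sketch: because the border flickers at a fixed, known rate, the ssVEP it drives is concentrated at that frequency, and its amplitude can be read out from the EEG spectrum. The sampling rate, flicker frequency, and simulated data below are illustrative assumptions, not parameters of the present study.

```python
import numpy as np

# Assumed, illustrative parameters (not taken from the study)
FS = 500            # EEG sampling rate in Hz
FLICKER_HZ = 12.0   # flicker ("tag") frequency of the border
DURATION_S = 10.0   # epoch length matching the 10-s video clips

def ssvep_amplitude(eeg, fs=FS, freq=FLICKER_HZ):
    """Estimate the spectral amplitude of a single EEG channel at `freq` Hz."""
    n = len(eeg)
    # Hann window reduces spectral leakage from neighboring frequencies;
    # dividing by the window's coherent gain restores the amplitude scale.
    window = np.hanning(n)
    spectrum = np.fft.rfft(eeg * window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    bin_idx = np.argmin(np.abs(freqs - freq))  # FFT bin closest to the tag
    return 2.0 * np.abs(spectrum[bin_idx]) / (n * np.mean(window))

# Simulated epoch: a 12 Hz steady-state response buried in broadband noise
rng = np.random.default_rng(0)
t = np.arange(0, DURATION_S, 1.0 / FS)
eeg = 2.0 * np.sin(2 * np.pi * FLICKER_HZ * t) + rng.normal(0.0, 1.0, t.size)

amp_at_tag = ssvep_amplitude(eeg)                # near the true amplitude of 2.0
amp_off_tag = ssvep_amplitude(eeg, freq=17.0)    # noise floor at an untagged bin
```

In this scheme, a reduced `amp_at_tag` during emotional relative to neutral clips would index withdrawal of visuocortical resources from the flickering border, which is the modulation the present design tests.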
Several emotional video sets have been assembled and used to investigate
emotional reactivity in self-report and peripheral physiology. However,
these emotional video sets are not optimal for EEG recording, for
several reasons. Nearly all video sets use long and variable clip
durations (1-15 minutes; Gilman et al., 2017; Gross & Levenson, 1995;
Hewig et al., 2005; Kreibig et al., 2007; Philippot, 1993; Schaefer et
al., 2011). Many emotion film sets also lack audio, or include narration
and music (Cowen & Keltner, 2017; Gilman et al., 2017; Philippot, 1993;
Samson et al., 2016). In addition, all available video sets contain
primarily content drawn from studio productions. These professional
films often switch between multiple camera viewpoints, such that the
viewer experiences the scene from more than one perspective, requiring
a 'suspension of disbelief' and limiting ecological validity (Holland,
2003; Lee, 2004; Prentice & Gerrig, 1999). Studio films also often
feature well-known actors and special effects that lead the viewer to
recognize that the action is artificial, which suppresses emotional
reactivity (Gross & Thompson, 2007; Hajcak, MacNamara, & Olvet, 2010).
Ideally, a video intended to evoke emotion might present a situation as
a true event, as it actually happened in the real world. Because of the
high prevalence of smartphone ownership and the public posting of these
videos, there is now sufficient 'raw material' available to
assemble a realistic video set for experimental use.
Compared to static scenes, multimodal video clips present several
problems with regard to experimental control when recording
electroencephalography (EEG). In order to interpret the potential
differences in electrocortical reactivity across video contents, the
video set should not differ systematically in basic sensory and
perceptual features, while retaining factors that promote realism
(Allison, Wilcox, & Kazimi, 2013; Lin & Peng, 2015). These include a
continuous single shot, a landscape perspective that places the viewer
in a ground-level position, an ambient soundtrack without music or
narration, and videographic quality that does not suggest professional
creation, to enhance the impression that the clip represents a genuine
event. A relatively brief duration of 10 seconds was chosen: short
enough to permit multiple exemplars from different emotional content
categories, yet long enough to engage and sustain an emotional state
during which a cortical response could be recorded.
Based on past work with emotional scenes, we hypothesize that emotional
video perception will reduce the amplitude of the ssVEP evoked by the
competing flickering border. As with past ssVEP and ERP studies of
emotional perception, the impact of pleasant and unpleasant videos is
expected to be equivalent, and strongly related to arousal ratings
(Bekhtereva et al., 2015; Cuthbert et al., 2000; Frank & Sabatinelli,
2019; Keil et al., 2008). Alternatively, the dynamic nature of the video
stimuli and the addition of an ambient audio track may engage continuous
perceptual processing across all videos to the extent that any
differential impact of emotional videos on ssVEP amplitude fails to rise
above the ssVEP variability.