Methods
Participants. Forty-four participants were recruited from the University of Georgia student body and were compensated with course credit. All participants gave informed consent after reading a description of the study approved by the University of Georgia Human Subjects Institutional Review Board. Data from 2 participants were excluded due to excessive EEG artifacts (described below), leaving 42 participants in all subsequent analyses. The mean age of the final sample was 18.9 years (SD = 1.0), with a range from 18 to 22. Thirty-four of the participants identified as female. Six participants identified as Asian, 2 as Black, 1 as multiracial, and 33 as White.
Video Stimuli Set. The goal in assembling the video set was to represent a wide variety of emotional and neutral content while holding basic perceptual features of the videos reasonably constant. The videos were clipped to 10 s in duration so that the viewer could quickly recognize the nature of the situation, with the narrative maintained across the interval. The videos featured a single lens perspective that placed the viewer roughly at eye level, and included a natural soundtrack. The videos excluded recognizable actors and high-production-value elements associated with professional film (e.g., expert lighting, composition). The intention was to minimize artificial elements that might break the viewer's belief that the clips depict true events. Videos were selected to depict roughly equivalent degrees of loudness and motion across emotional and neutral videos to avoid confounding emotion with action.
Forty-five clips meeting these criteria, depicting a range of pleasant, neutral, and unpleasant situations, were collected from the internet. These videos were divided into 5 groups of 9 videos each. Four of these groups depicted emotional situations, judged by the experimenters to contain either highly arousing content (roller coasters, passionate couples, graphic surgery, direct threats) or modestly arousing content (puppies, cute babies; indirect threats). One group of 9 videos depicted active but common life experiences (walking down a busy street, a kitchen staff hard at work). The video clips were quantified on a number of basic perceptual qualities. These included sound intensity, measured with a BAFX 3370 Digital Sound Level Meter placed at ear level at the participant's chair. Audio was presented with a Dell A525 3-speaker system with subwoofer, placed directly under the video monitor, and loudness was assessed as A-weighted decibel values recorded twice per second and averaged across each video. Brightness was defined by converting the color videos to grayscale and averaging the 0-255 values for each video frame. Movement depicted in the videos was quantified as the average difference in grayscale pixel values between successive frames of each video, using the Magick R package version 2.7.3 (Ooms, 2021). For example, during periods with high levels of movement in a video, many pixels show large changes in brightness from frame to frame. These changes were averaged to yield a score representing the total movement in each video. Lastly, Shannon's entropy was used as an index of perceptual complexity: the entropy value within each frame was quantified to provide a mean and standard deviation across each 10 s video. These quantified video features were correlated with the electrocortical data, along with emotional ratings of pleasantness and arousal collected from the participant sample.
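For illustration, the frame-based measures described above can be computed in a few lines. The following is a minimal sketch in Python with OpenCV rather than the Magick R package used in the study; the file path is hypothetical.

```python
# Illustrative per-video brightness, motion, and entropy measures
# (Python/OpenCV sketch; the study used the Magick R package).
import cv2
import numpy as np

def video_features(path):
    cap = cv2.VideoCapture(path)  # path is hypothetical, e.g. "clip01.mp4"
    brightness, motion, entropies = [], [], []
    prev = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # 0-255 grayscale
        brightness.append(gray.mean())                  # mean luminance per frame
        # Shannon entropy of the 256-bin intensity histogram
        p = np.bincount(gray.ravel(), minlength=256) / gray.size
        p = p[p > 0]
        entropies.append(float(-(p * np.log2(p)).sum()))
        if prev is not None:
            # mean absolute frame-to-frame change indexes movement
            motion.append(np.abs(gray.astype(float) - prev).mean())
        prev = gray.astype(float)
    cap.release()
    return {"brightness": np.mean(brightness),
            "motion": np.mean(motion),
            "entropy_mean": np.mean(entropies),
            "entropy_sd": np.std(entropies, ddof=1)}
```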
Experimental Design and Procedure. After providing informed consent, each participant was given instructions and seated in a chamber shielded against sound and electromagnetic noise. A 64-channel EEG net
(described below) was placed and adjusted over the course of 10-15
minutes. The research assistants then reminded the participants to
remain still and maintain fixation on a red cross at the center of the
video screen throughout the series. Participants were also asked to
avoid blinking during each video clip presentation, to the best of their
ability.
The video series started with an acclimation trial, which presented a 10 s static checkerboard surrounded by a flickering border. After a 12 s delay, the 45 experimental video clips were presented in a pseudo-randomized order with an average inter-trial interval (ITI) of 12 s (range 10 to 14 s). Videos were ordered such that no more than two videos from the same category were shown in succession and video contents were distributed evenly across the series, as sketched below.
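The exact randomization routine is not specified here; the following is a minimal rejection-sampling sketch of the run-length constraint, with category labels hypothetical. Balancing contents evenly across the series would require an additional check.

```python
import random

def constrained_order(categories, n_per_cat=9, max_run=2, seed=None):
    """Shuffle videos until no more than `max_run` consecutive
    clips share a category (simple rejection sampling)."""
    rng = random.Random(seed)
    videos = [(cat, i) for cat in categories for i in range(n_per_cat)]
    while True:
        rng.shuffle(videos)
        # every window of (max_run + 1) clips must span >1 category
        runs_ok = all(
            len({c for c, _ in videos[i:i + max_run + 1]}) > 1
            for i in range(len(videos) - max_run)
        )
        if runs_ok:
            return videos

# Hypothetical category labels for the 5 groups of 9 videos
order = constrained_order(["pleasant-high", "pleasant-low", "neutral",
                           "unpleasant-low", "unpleasant-high"])
```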
PsychoPy open source software (Peirce et al., 2019) was used to present the video clips and send triggers to the EEG acquisition computer. The PsychoPy control files and video stimuli are available on Open Science Framework (final link TBD). A Dell Optiplex 380 computer presented videos to a 60 Hz Westinghouse 32-in LCD monitor, which was placed 1.6 m from the participant's eyes, at a 960 by 648 pixel resolution, with the video clip shown in the central 720 by 405 pixels (20° × 15° visual angle). The remaining monitor space displayed a gray border (RGB value of [148, 148, 148]) around the video, which flickered to black (RGB value of [0, 0, 0]) at a 7.5 Hz frequency to evoke a steady-state visual evoked potential (ssVEP). To ensure a precise flash rate, the black border was drawn every eighth screen refresh on the 60 Hz monitor. The presentation code logged the exact time of each flash of the border, which was consistent at 7.5 Hz for all videos and all participants.
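As a sketch of this frame-based flicker logic, assuming PsychoPy's standard API (the movie file name and exact stimulus parameters below are illustrative, not the study's control files):

```python
# Frame-based 7.5 Hz border flicker (PsychoPy sketch; parameters illustrative)
from psychopy import visual

# Gray background approximating RGB [148, 148, 148] in PsychoPy's -1..1 space
win = visual.Window(size=(960, 648), color=[0.16, 0.16, 0.16], units="pix")
border = visual.Rect(win, width=960, height=648,
                     fillColor="black", lineColor=None)  # full-screen black rect
video = visual.MovieStim(win, "clip01.mp4", size=(720, 405))  # hypothetical clip
fixation = visual.TextStim(win, text="+", color="red", height=24)

for frame in range(600):        # 10 s at a 60 Hz refresh rate
    if frame % 8 == 0:          # black border every 8th refresh -> 7.5 Hz
        border.draw()
    video.draw()                # central 720 x 405 video occludes the border
    fixation.draw()
    win.flip()                  # flip times can be logged here for verification
win.close()
```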
Following the presentation of the video series, there was a brief (~2 min) break while research assistants entered the chamber and checked in with the participant and the status of the sensors.
sensors. The video series was then repeated in a new pseudo-random
order. After the second video block, the recording equipment was
removed, and participants left the chamber and were given towels and
water to clean the electrolyte gel out of their hair. Participants then
sat at a computer and viewed the 45 videos again and provided ratings of
each video using a computer-based Qualtrics survey. Videos were viewed
to completion before participants could give their ratings and advance
to the next video. Ratings of experienced valence and arousal were
recorded using a computer-based version of the Self-Assessment Manikin
(SAM; Bradley & Lang, 1994). Participants input their ratings of pleasantness and arousal on a scale of 1-9, in increments of 0.1, by clicking and dragging a cursor. After ratings were completed, participants filled
out a brief post-experimental questionnaire, and were debriefed on the
rationale for the experiment.
EEG Data Collection. The EEG data were recorded using a BioSemi
ActiveTwo 64-channel system (BioSemi Amsterdam, Netherlands). The
electrode cap was positioned according to the 10-20 system. Data were
recorded continuously in reference to common mode electrodes (CMS and
DRL). The electrode offsets were kept within ±50 millivolts
before data collection. The data were sampled at 512 Hz with no online
low- or high-pass filters. Triggers corresponding to video onsets were
sent from the PsychoPy presentation computer to the EEG system via a
BioSemi USB Trigger Interface cable.
EEG preprocessing. The EEG data were preprocessed using the
MATLAB-based Electro Magnetic Encephalography Software (EMEGS;
emegs.org; Peyk, De Cesarei, & Junghöfer, 2011). The software
implements an artifact correction procedure designed for use with dense
array EEG (Junghöfer, Elbert,
Tucker, & Rockstroh, 2000). Offline, the EEG data were filtered using a low-pass Butterworth filter with a passband of 30 Hz and a stopband of 40 Hz. Because we were primarily interested in the steady-state driving frequency of 7.5 Hz, we used a high-pass filter with a passband of 3 Hz and a stopband of 1 Hz. From the continuously recorded data, each trial was segmented from 100 ms before to 10 s after video onset.
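These band edges map onto a standard Butterworth design. The sketch below uses SciPy; the passband ripple and stopband attenuation targets (3 dB and 40 dB) are assumptions, since only the band edges are reported.

```python
# Butterworth filter design matching the reported band edges (SciPy sketch;
# the 3 dB passband ripple and 40 dB stopband attenuation are assumptions)
import numpy as np
from scipy import signal

fs = 512.0  # sampling rate in Hz

# Low-pass: passband edge 30 Hz, stopband edge 40 Hz
n_lp, wn_lp = signal.buttord(wp=30, ws=40, gpass=3, gstop=40, fs=fs)
b_lp, a_lp = signal.butter(n_lp, wn_lp, btype="low", fs=fs)

# High-pass: passband edge 3 Hz, stopband edge 1 Hz
n_hp, wn_hp = signal.buttord(wp=3, ws=1, gpass=3, gstop=40, fs=fs)
b_hp, a_hp = signal.butter(n_hp, wn_hp, btype="high", fs=fs)

def filter_eeg(eeg):
    """eeg: channels x samples array; zero-phase filtering in both bands."""
    eeg = signal.filtfilt(b_lp, a_lp, eeg, axis=-1)
    return signal.filtfilt(b_hp, a_hp, eeg, axis=-1)

def epoch(eeg, onsets):
    """Segment each trial from 100 ms before to 10 s after video onset.
    onsets: sample indices of the video-onset triggers."""
    pre, post = int(0.1 * fs), int(10 * fs)
    return np.stack([eeg[:, s - pre:s + post] for s in onsets])
```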
To identify trials and sensors contaminated by artifacts, EMEGS computed sensor-by-trial distributions of a composite measure based on the maximum amplitude, the standard deviation, and the maximum of the first derivative of each epoch (Junghöfer et al., 2000). These distributions were jointly used to locate noisy channels relative to the distribution medians. Trials that contained excess artifact were removed by EMEGS, and sensors that contained excess artifact were replaced by spherical spline interpolation with a weighted average of all remaining sensors. The largest weights in this procedure were given to the closest sensors with the smallest standard deviations. The data were then re-referenced to an average reference, and the artifact detection procedure was repeated.
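EMEGS implements this procedure internally. Purely as an illustration of the three per-epoch statistics that enter the composite measure, one might compute:

```python
# Illustration only: the three per-epoch statistics used in the composite
# artifact measure (EMEGS's exact weighting and thresholds not reproduced)
import numpy as np

def epoch_stats(epochs):
    """epochs: trials x channels x samples array."""
    max_amp = np.abs(epochs).max(axis=-1)                      # maximum amplitude
    sd = epochs.std(axis=-1)                                   # standard deviation
    max_deriv = np.abs(np.diff(epochs, axis=-1)).max(axis=-1)  # max first derivative
    return max_amp, sd, max_deriv

# Trials/sensors whose statistics lie far above the distribution medians
# would be flagged for rejection or spline interpolation.
```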
To estimate the elicited ssVEP amplitude, waveforms for each video were
created using a custom-built function in MATLAB. The function used a
Hilbert transform to isolate the 7.5 Hz phase and derive the
instantaneous amplitude at this frequency. This allowed us to transform
our data to represent the border-driven 7.5 Hz amplitude over each video
presentation. Due to onset/offset transition effects, the first and last second of each video were excluded from analysis, leaving 8 s of ssVEP amplitude.
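A minimal sketch of this envelope estimation follows, assuming a narrow band-pass around 7.5 Hz before the Hilbert transform; the custom MATLAB function's exact bandwidth is not reported, so the half-bandwidth below is an assumption.

```python
# Sketch of the 7.5 Hz envelope via the analytic signal (SciPy); the
# band-pass width around the driving frequency is an assumption
import numpy as np
from scipy import signal

fs, f0 = 512.0, 7.5  # sampling rate and border flicker frequency (Hz)

def ssvep_envelope(x, half_bw=0.5):
    """x: 1-D EEG trace. Returns instantaneous amplitude at ~7.5 Hz."""
    b, a = signal.butter(4, [f0 - half_bw, f0 + half_bw],
                         btype="bandpass", fs=fs)
    narrow = signal.filtfilt(b, a, x)      # isolate the 7.5 Hz component
    return np.abs(signal.hilbert(narrow))  # envelope of the analytic signal

def trimmed_envelope(x):
    """Drop the first and last second of a 10 s trace, keeping the middle 8 s."""
    env = ssvep_envelope(x)
    return env[int(fs):-int(fs)]
```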
Following preprocessing, trials were averaged together by category for each of the 42 participants. To be included in the final sample, each participant was required to retain at least 50% of the trials from each video category; 2 participants failed this criterion and were excluded from further analysis. Across the remaining sample of 42, an average of 78.1% of trials was retained (SD = 9%). Averaged ssVEP amplitude across the sample was used for the by-video correlation analyses with emotional ratings and video features.
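The retention criterion can be expressed compactly. A sketch follows, with the per-participant tally hypothetical; it assumes 18 trials per category, i.e., 9 videos shown once in each of the two blocks.

```python
# Retention criterion sketch; `retained` is a hypothetical tally and
# 18 trials per category assumes 9 videos x 2 presentation blocks
def meets_retention(retained, total_per_cat=18, threshold=0.5):
    """retained: dict mapping category name -> number of artifact-free trials."""
    return all(n / total_per_cat >= threshold for n in retained.values())

# Example: a participant keeping >= 9 of 18 trials in every category passes
```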
ssVEP data analyses. The amplitude of the ssVEP signal was sampled from 14 occipital sensors at which the ssVEP signal is strongest, based on prior studies using similar designs (Müller et al., 2008; Deweese et al., 2014; Bekhtereva et al., 2017), and confirmed with a topographical representation of 7.5 Hz power in the current dataset. These sensors included P1, P3, PO3, PO7, O1, Iz, Oz, POz, Pz, P2, P4, PO4, PO8, and O2. To estimate ssVEP amplitude, the middle 8 s of video presentation (from 1 to 9 s after onset) were averaged and used for all analyses, avoiding onset and offset artifacts. Because there is considerable inter-subject variability in overall ssVEP amplitude (Moratti et al., 2004; Weiser et al., 2016), within-subject data were z-scored across video contents.
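A minimal sketch of this within-subject standardization:

```python
# Within-subject standardization of ssVEP amplitude across the 45 videos
import numpy as np

def zscore_within_subject(amps):
    """amps: one mean ssVEP amplitude per video for a single subject.
    Removes between-subject differences in overall ssVEP magnitude."""
    return (amps - amps.mean()) / amps.std(ddof=1)
```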
Self-reported emotion ratings and ssVEP amplitudes were analyzed across the 5 content categories using repeated-measures ANOVAs, corrected for violations of sphericity as needed, with effects broken down by paired t-tests with Tukey correction. Effect sizes were quantified by the generalized eta squared (ηG²) measure. Generalized eta squared can be interpreted such that .01 is a small effect, .06 a medium effect, and .14 a large effect (Olejnik & Algina, 2003). Pearson's correlations were used to assess the potential relationships between ssVEP amplitude and by-video estimates of self-reported valence, arousal, luminance, sound intensity, entropy, entropy SD, and pixel motion.
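For illustration, these analyses map onto standard routines. The sketch below assumes the pingouin and SciPy libraries and uses synthetic data; it substitutes a Holm adjustment for the Tukey correction used here, since pingouin does not offer Tukey correction for paired tests.

```python
# Repeated-measures ANOVA, paired follow-ups, and by-video correlations
# (pingouin/SciPy sketch with synthetic data; see hedges in the text above)
import numpy as np
import pandas as pd
import pingouin as pg
from scipy.stats import pearsonr

# Hypothetical long-format table: one mean ssVEP value per subject x category
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subject":  np.repeat(np.arange(42), 5),
    "category": np.tile(["pleasant-high", "pleasant-low", "neutral",
                         "unpleasant-low", "unpleasant-high"], 42),
    "ssvep":    rng.normal(size=42 * 5),
})

# Repeated-measures ANOVA with sphericity correction and generalized
# eta squared (effsize="ng2"), assuming pingouin's rm_anova interface
aov = pg.rm_anova(data=df, dv="ssvep", within="category",
                  subject="subject", correction=True, effsize="ng2")

# Paired follow-up t-tests; Holm adjustment stands in for Tukey here
posthoc = pg.pairwise_tests(data=df, dv="ssvep", within="category",
                            subject="subject", padjust="holm")

# By-video Pearson correlation, e.g., mean ssVEP amplitude vs. rated
# arousal (both hypothetical arrays of length 45, one value per video)
ssvep_by_video = rng.normal(size=45)
arousal_by_video = rng.normal(size=45)
r, p = pearsonr(ssvep_by_video, arousal_by_video)
```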