Figure 2. Analysis results of the high double trill as a function of
time. The magenta dashed lines represent the beginning and ending points
of the call, and the black dashed lines represent the maximum and
minimum frequencies. (A) Waveforms (cyan solid: raw waveform, red
dashed: bandpass-filtered waveform), (B) sound pressure level, (C)
spectrogram (48,000 fast Fourier transform points and 4,800-point
Hanning window), and (D) spectrogram of the envelope waveform
representing the amplitude modulation frequency (96,000 fast Fourier
transform points and 19,200-point Hanning window).

The call duration was calculated from the difference between the start
and end points of the call signal, which varied depending on the call
type. For HDT, the start of the first trill and the end of the
subsequent trill were aligned well with the beginning and ending points
of amplitude modulation, respectively. Therefore, the initiation and
termination of the call were determined by the start and end points of
the amplitude modulation within the frequency range of 30 to 80 Hz
(Figure 2D). For MST, LDT, HST, DT, and AT, the end point of the call was
designated as the termination of the amplitude modulation pattern.
However, the start of the amplitude modulation pattern was ambiguous,
either because the amplitude modulation rate and energy of the trill
increased gradually at the beginning of the call or because no amplitude
modulation pattern appeared when the call started with a narrowband
component resembling a hoot rather than a trill. Therefore, for
these five call types, call initiation was determined when vocal
characteristics were discernible and exceeded the background noise level
in both the sound pressure level and the spectrogram. Because H is
characterized by the absence of a trill, the start point of the call was
manually determined at the point where the energy significantly increased
in the sound pressure level, spectrogram, and PRR. The end point of the call
was defined as the moment when strong energy ceased at an amplitude
modulation frequency below 10 Hz. The hoot is discussed in Section 4.
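
The amplitude-modulation criterion described above for HDT can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the Hilbert envelope, the function names `am_band_energy` and `call_limits`, the 50% overlap, the relative threshold of 10% of the maximum band energy, and the synthetic test signal are all assumptions made for illustration. Only the 30 to 80 Hz amplitude modulation band is taken from the text.

```python
import numpy as np
from scipy.signal import hilbert, spectrogram


def am_band_energy(x, fs, nperseg, nfft, band=(30.0, 80.0)):
    """Energy of the envelope spectrogram inside an AM-frequency band.

    The envelope is taken as the magnitude of the analytic signal
    (an assumption; the source does not state how it was computed).
    """
    envelope = np.abs(hilbert(x))
    f, t, sxx = spectrogram(
        envelope,
        fs=fs,
        window="hann",
        nperseg=nperseg,
        noverlap=nperseg // 2,  # assumed overlap
        nfft=nfft,
    )
    in_band = (f >= band[0]) & (f <= band[1])
    return t, sxx[in_band].sum(axis=0)


def call_limits(t, energy, threshold_ratio=0.1):
    """First and last frame times whose band energy exceeds a relative threshold."""
    above = np.flatnonzero(energy > threshold_ratio * energy.max())
    return t[above[0]], t[above[-1]]


# Synthetic example (hypothetical values): a 300 Hz carrier modulated at
# 50 Hz between 1 s and 2 s, embedded in weak noise.
fs = 2000
t_sig = np.arange(3 * fs) / fs
in_call = (t_sig >= 1.0) & (t_sig < 2.0)
carrier = np.sin(2 * np.pi * 300 * t_sig)
modulation = 1.0 + 0.8 * np.sin(2 * np.pi * 50 * t_sig)
rng = np.random.default_rng(0)
x = np.where(in_call, modulation * carrier, 0.0)
x = x + 0.001 * rng.standard_normal(t_sig.size)

t_frames, energy = am_band_energy(x, fs, nperseg=200, nfft=400)
start, end = call_limits(t_frames, energy)
duration = end - start
print(f"start={start:.2f} s, end={end:.2f} s, duration={duration:.2f} s")
```

The time resolution of the detected limits is bounded by the frame hop, so a longer window sharpens the modulation-frequency estimate at the cost of coarser start and end times.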
Previous studies either defined the frequency bandwidth of calls as ±20
dB around the peak frequency (Rogers et al., 1995) or used specialized
programs such as Raven Pro (Cornell Lab of Ornithology) and Osprey
(Mellinger and Bradbury, 2007) (Shabangu and Rogers, 2021; Heimrich et
al., 2020). Although we adhered to the previously established ±20 dB
bandwidth criterion, it was insufficient for HDT, MST, and AT.
For these three calls, the start and end frequencies that exceeded the
background noise spectrum levels before and after the call signal were
calculated as maximum and minimum frequencies, respectively. The peak
frequency was determined as the frequency corresponding to the highest
power value in the spectrum level, and this criterion was applied to
every signal. Further features of each call that were not mentioned in
previous studies are described in Section 3-2.
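
The peak frequency and the two bandwidth criteria can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the procedure used in this study: the function names, the reading of the 20 dB criterion as the contiguous region around the peak whose level stays within 20 dB of the peak level, the zero margin above noise, and the synthetic spectra are all hypothetical.

```python
import numpy as np


def peak_frequency(freqs, level_db):
    """Frequency of the highest value in the spectrum level."""
    return freqs[np.argmax(level_db)]


def band_limits_below_peak(freqs, level_db, drop_db=20.0):
    """Limits of the contiguous region around the peak within drop_db of the peak."""
    i_peak = int(np.argmax(level_db))
    ok = level_db >= level_db[i_peak] - drop_db
    lo = i_peak
    while lo > 0 and ok[lo - 1]:
        lo -= 1
    hi = i_peak
    while hi < len(freqs) - 1 and ok[hi + 1]:
        hi += 1
    return freqs[lo], freqs[hi]


def band_limits_above_noise(freqs, call_db, noise_db, margin_db=0.0):
    """Lowest and highest frequency where the call exceeds the noise spectrum."""
    above = np.flatnonzero(call_db > noise_db + margin_db)
    if above.size == 0:
        return None
    return freqs[above[0]], freqs[above[-1]]


# Synthetic spectra (hypothetical values): a parabolic peak at 300 Hz
# on top of a flat 60 dB background noise floor.
freqs = np.arange(0.0, 1001.0, 10.0)
noise_db = np.full(freqs.shape, 60.0)
call_db = np.maximum(100.0 - 5.0 * ((freqs - 300.0) / 50.0) ** 2, noise_db)

f_peak = peak_frequency(freqs, call_db)
limits_20db = band_limits_below_peak(freqs, call_db)
limits_noise = band_limits_above_noise(freqs, call_db, noise_db)
print(f_peak, limits_20db, limits_noise)
```

In this example the noise-based limits are wider than the 20 dB limits, which shows how the second criterion can capture call energy that the fixed 20 dB criterion leaves out.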