Figure 2. Analysis results of the high double trill as a function of time. The magenta dashed lines represent the beginning and ending points of the call, and the black dashed lines represent the maximum and minimum frequencies. (A) Waveforms (cyan solid: raw waveform, red dashed: bandpass-filtered waveform), (B) sound pressure level, (C) spectrogram (48,000 fast Fourier transform points and 4,800-point Hanning window), and (D) spectrogram of the envelope waveform representing the amplitude modulation frequency (96,000 fast Fourier transform points and 19,200-point Hanning window).
The call duration was calculated as the difference between the start and end points of the call signal, which were determined differently depending on the call type. For HDT, the start of the first trill and the end of the subsequent trill aligned well with the beginning and ending points of amplitude modulation, respectively. Therefore, the initiation and termination of the call were determined by the start and end points of the amplitude modulation within the frequency range of 30 to 80 Hz (Figure 2D). For the MST, LDT, HST, DT, and AT, the end point of the call was designated as the termination of the amplitude modulation pattern. However, the start of the amplitude modulation pattern was ambiguous for these calls, either because the amplitude modulation rate and energy of the trill increased gradually at the beginning of the call or because the amplitude modulation pattern did not appear when the call started with a narrowband component resembling a hoot rather than a trill. Therefore, for these five call types, call initiation was determined as the point at which vocal characteristics were discernible and exceeded the background noise level in both the sound pressure level and the spectrogram. Because H is characterized by the absence of a trill, the start point of the call was manually determined at the point where the energy significantly increased in the sound pressure level, spectrogram, and PRR. The end point of the call was defined as the moment when strong energy ceased at an amplitude modulation frequency below 10 Hz. The hoot is discussed in Section 4.
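The start and end point logic described above can be illustrated with a minimal sketch. It assumes that the level in the amplitude modulation band (for example, 30 to 80 Hz for HDT) has already been extracted per time frame from the envelope spectrogram. The function name `call_bounds`, the `noise_db` estimate, and the 6 dB margin are hypothetical choices made for illustration; the text does not specify a numerical threshold.

```python
def call_bounds(times_s, band_level_db, noise_db, margin_db=6.0):
    """Return (start_s, end_s, duration_s) for one call, or None.

    times_s       : frame times in seconds
    band_level_db : level in the modulation band at each frame (dB)
    noise_db      : background noise level in the same band (dB)
    margin_db     : hypothetical margin above noise (an assumption,
                    not a value taken from the text)
    """
    if len(times_s) != len(band_level_db):
        raise ValueError("times_s and band_level_db must have equal length")

    threshold = noise_db + margin_db
    # Indices of frames where the modulation-band level exceeds the threshold.
    above = [i for i, level in enumerate(band_level_db) if level > threshold]
    if not above:
        return None

    start_s = times_s[above[0]]   # first frame above threshold
    end_s = times_s[above[-1]]    # last frame above threshold
    return start_s, end_s, end_s - start_s


# Illustrative values only (not measured data).
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
levels = [70, 71, 85, 90, 88, 72, 70]
bounds = call_bounds(times, levels, noise_db=70)
print(bounds)
```

The same function could be applied to the sound pressure level series for the call types whose start point was taken from exceedance of the background noise, although the manual inspection described for H is not reproduced by this sketch.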
In previous studies, the frequency bandwidth of calls was defined as ±20 dB around the peak frequency (Rogers et al., 1995), or specialized programs such as Raven Pro (Cornell Lab) and Osprey (Mellinger and Bradbury, 2007) were used (Shabangu and Rogers, 2021; Heimrich et al., 2020). Although we adhered to the previously established criterion of a ±20 dB bandwidth, this criterion was insufficient for HDT, MST, and AT. For these three calls, the start and end frequencies that exceeded the background noise spectrum levels before and after the call signal were calculated as maximum and minimum frequencies, respectively. The peak frequency was determined as the frequency corresponding to the highest power value in the spectrum level, and this criterion was applied to every signal. More detailed features of each call that were not mentioned in previous studies are explained in Section 3-2.
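As a minimal sketch of the peak frequency and bandwidth criteria, the following example takes a single spectrum and returns the peak frequency together with the band edges. It interprets the 20 dB criterion as the contiguous run of frequency bins around the peak whose level stays within 20 dB of the peak level; this interpretation, the function name `peak_and_bandwidth`, and the `drop_db` parameter are assumptions made for illustration. The noise-based criterion used for HDT, MST, and AT is not covered here.

```python
def peak_and_bandwidth(freqs_hz, levels_db, drop_db=20.0):
    """Return (peak_hz, f_min_hz, f_max_hz) for one spectrum.

    freqs_hz  : bin frequencies in ascending order (Hz)
    levels_db : spectrum level at each bin (dB)
    drop_db   : allowed drop below the peak level (assumed reading
                of the 20 dB bandwidth criterion)
    """
    if len(freqs_hz) != len(levels_db) or len(freqs_hz) == 0:
        raise ValueError("freqs_hz and levels_db must be non-empty and equal length")

    # Peak frequency: the bin with the highest level.
    peak_idx = max(range(len(levels_db)), key=lambda i: levels_db[i])
    threshold = levels_db[peak_idx] - drop_db

    # Walk outward from the peak while the level stays at or above threshold.
    lo = peak_idx
    while lo > 0 and levels_db[lo - 1] >= threshold:
        lo -= 1
    hi = peak_idx
    while hi < len(levels_db) - 1 and levels_db[hi + 1] >= threshold:
        hi += 1

    return freqs_hz[peak_idx], freqs_hz[lo], freqs_hz[hi]


# Illustrative values only (not measured data).
freqs = [100, 200, 300, 400, 500, 600, 700]
spectrum = [60, 75, 90, 110, 95, 85, 70]
result = peak_and_bandwidth(freqs, spectrum)
print(result)
```

Restricting the band to a contiguous run around the peak avoids counting isolated noise peaks elsewhere in the spectrum, which is one possible design choice rather than a requirement stated in the text.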