4. Discussion
Our research was motivated by the objective of implementing passive
acoustic monitoring in a region that is known for its well-preserved
marine environment. During the analysis, we identified leopard seal
vocalization as the dominant sound source. Unlike many previous studies,
which examined the vocalizations of identified individuals of a target
species from an ecological perspective, our research aimed to interpret,
after the measurements, vocalizations produced by unidentified
individuals of either sex and any age.
One of the most significant findings in our results was the
identification of a new call type, the triple ascending trill. The
ascending trill has been mentioned in only four previous studies
(Shabangu and Rogers, 2021; Van Opzeeland et al., 2010; Klinck, 2008;
Thomas and Golladay, 1995). Specifically, Van Opzeeland et al. (2010)
and Klinck (2008) reported it as a single ascending trill with a
spectrogram, while Thomas and Golladay (1995) described it as call 4,
consisting of two or three components. Most of the ascending trills
observed in our data comprised three distinguishable trill parts. In
contrast, the single ascending trills recorded in our measurements had a
very low signal-to-noise ratio, and some triple ascending trills
exhibited faint second and third ascending trill parts (Figure 7A, B),
making clear identification challenging. This suggests that AT is
fundamentally composed of three parts and that an apparent single
ascending trill may be recorded when the individual is far from the
recorder, so that only one part remains detectable. Consequently, the
single ascending trill was
excluded from detection and acoustic characterization due to its
significantly higher detection uncertainty. To verify whether the triple
ascending trill is a leopard seal call, we examined video data
containing airborne vocalizations of a male leopard seal. The video was
recorded for 26.9 minutes on December 12, after the underwater acoustic
measurements, using a monitoring camera (48,000 Hz sampling rate). We
extracted 55 vocal signals from 4.4 minutes of the video, after
excluding signals made indistinct by wind noise and sections without
vocalizations, and divided them into seven call types: HDT, MST, LDT,
DT, triple ascending trill (Figure 7D), low single trill (Figure 7E),
and hoot (Figure 7F). The call pattern of the triple ascending trill observed
in the air was similar to that recorded underwater. The spectrograms of
the airborne vocalizations are shown without a colorbar because we could
not obtain the receiving voltage sensitivity of the camera microphone
from the manufacturer, which also precluded quantitative analysis of
the airborne acoustic data. The airborne vocalizations exhibit clearer
trill patterns and harmonic components than those recorded underwater,
with much less reverberation. In the underwater acoustic waveguide,
reverberation lasts longer than in air because of interaction with
ocean boundaries such as the sea surface and
seafloor (Katsnelson et al., 2012). This supports the validity of
determining the end point of a call from the end point of its amplitude
modulation pattern, since the trailing energy in the underwater
recordings can be attributed to reverberation rather than to the
vocalization itself. While most of the HDT, MST, and LDT waveforms
exhibited relatively distinct amplitude modulation patterns, those of
the trill parts in HST, DT, and AT were not clearly visible. This
difference may arise from differences in vocal mechanism, but further
verification is required. Because we focused on the underwater
vocalizations of leopard seals and used the airborne data only to
support the observations of the triple ascending trill, we refer
readers to previous studies for detailed information on airborne
vocalizations (Rogers et al., 1995).
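As an illustration of the end-point criterion discussed above, the
following Python sketch estimates the end of a call as the last frame
whose short-time RMS envelope exceeds a multiple of the background
level. It is not the procedure used in this study: the function name
`call_end_time`, the 20 ms frame length, the threshold factor of 3, and
the use of the median envelope as the background estimate are all
assumptions chosen for illustration.

```python
import numpy as np

def call_end_time(x, fs, win_s=0.02, factor=3.0):
    """Return the estimated call end time in seconds, or None if no call is found.

    Hypothetical helper: the end point is taken as the end of the last frame
    whose short-time RMS exceeds `factor` times the background level.
    """
    x = np.asarray(x, dtype=float)
    win = max(1, int(round(win_s * fs)))          # frame length in samples
    n_frames = len(x) // win
    frames = x[: n_frames * win].reshape(n_frames, win)
    env = np.sqrt(np.mean(frames ** 2, axis=1))   # short-time RMS envelope
    # Assumption: the call occupies less than half of the clip, so the
    # median of the envelope approximates the background level.
    background = np.median(env)
    above = np.nonzero(env > factor * background)[0]
    if above.size == 0:
        return None
    return (above[-1] + 1) * win / fs             # end of the last active frame
```

An energy threshold of this kind may place the end point late in a
reverberant recording, which is consistent with preferring the end of
the amplitude modulation pattern as the marker.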
In our study, call types were categorized based on previous research;
however, there were cases of low single trill and LDT without a strong
narrow component, as well as indistinct double trill cases that
resembled low single trill in the frequency band overlapping with LDT.
As these were identified as variant calls of LDT in a previous study
(Rogers, 2007), we did not classify them as separate call types.
Additionally, HST
exhibited large variations in the duration and interval of hoot and
trill. Consequently, the uncertainty in the call counts of low-frequency
vocalizations, including DT, was high. Furthermore, single hoot was
detected in all the acoustic data; however, because of the large
variability in its sound pressure level and the difficulty of
distinguishing it from HST, it was not included in the call count, as
was the case for the single ascending trill. From 101 sample signals of
single hoot with relatively high signal-to-noise ratios, the estimated
peak, minimum, and maximum frequencies and the call duration were 182
(± 10 SD) Hz, 163 (± 8 SD) Hz, 201 (± 10 SD) Hz, and 2.7 (± 0.4 SD)
seconds, respectively. A representative spectrogram of a single hoot is
shown in
Figure 7C. We also presented a procedure for calculating the upper and
lower limit frequencies of the HDT, MST, and triple ascending trill,
which are relatively broadband calls, based on their contrast against
the ambient noise level; this is another meaningful contribution of
this study. Despite these efforts, the estimated call rates and
acoustic characteristics of each call type remain subject to
uncertainty, because calls were detected and the start points of sample
calls were determined manually. In particular, call detection under low
signal-to-noise ratio conditions remains technically challenging. We
have established datasets of calls, which are clustered within a narrow
low-frequency band, and these will serve as foundational data for
developing automatic detection and classification algorithms in future
studies.
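The idea of setting the band limits of a broadband call by its contrast
against the ambient noise can be sketched as follows. This is a minimal
illustration under stated assumptions, not the procedure applied in
this study: the function names `avg_spectrum_db` and `band_limits`, the
6 dB contrast threshold, and the 1024-point FFT length are hypothetical
choices. The lower and upper limits are taken as the first and last
frequency bins in which the call spectrum exceeds the noise spectrum by
the threshold.

```python
import numpy as np

def avg_spectrum_db(x, fs, nfft=1024):
    """Averaged periodogram (Hann window, no overlap) in dB, arbitrary reference."""
    x = np.asarray(x, dtype=float)
    n_seg = len(x) // nfft
    segs = x[: n_seg * nfft].reshape(n_seg, nfft) * np.hanning(nfft)
    power = np.mean(np.abs(np.fft.rfft(segs, axis=1)) ** 2, axis=0)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, 10.0 * np.log10(power + 1e-20)  # small offset avoids log(0)

def band_limits(call, noise, fs, contrast_db=6.0, nfft=1024):
    """Return (f_low, f_high) in Hz where the call exceeds the noise, or None.

    `call` is a segment containing the vocalization and `noise` is a
    call-free segment of ambient noise from the same recording; both must
    contain at least `nfft` samples. The threshold is an assumed value.
    """
    freqs, call_db = avg_spectrum_db(call, fs, nfft)
    _, noise_db = avg_spectrum_db(noise, fs, nfft)
    above = np.nonzero(call_db - noise_db >= contrast_db)[0]
    if above.size == 0:
        return None
    # Simplification: isolated bins far from the call would widen the band;
    # a practical version might keep only the largest contiguous run.
    return freqs[above[0]], freqs[above[-1]]
```

The result depends on the chosen threshold and on how representative
the noise segment is, so limits obtained this way are best reported
together with both.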