Introduction
Deep-sea benthos fosters a high diversity of invertebrates with
peracarid crustaceans like Isopoda being particularly species-rich
(Wilson & Ahyong, 2015; Brandt et al., 2018). The underlying factors
that shaped this diversity and the potential role of past climatic
changes or physical barriers, such as deep-sea ridges, are not well
understood. Similarly, fundamental population genetic studies are scarce
for the deep sea (Taylor & Roterman, 2017). One problem is the limited
availability of large-scale datasets in terms of individual numbers and
geographic scale. The deep-sea crustacean isopod speciesHaploniscus bicuspis (G. O. Sars, 1877) offers to be an ideal
surrogate, as it is one of the few isopod species with circum-Icelandic
distribution and large numbers of samples available from the BIOICE
(Benthic Invertebrates of Icelandic Waters) and IceAGE (Icelandic marine
Animals: Genetics and Ecology) projects (Brix et al., 2014; Meißner,
Brix, Halanych, & Jażdżewska, 2018).
The complex interactions of highly diverse water masses make Icelandic
waters an interesting location to study the distribution and speciation
processes of deep-sea taxa. While the warmer and more saline North
Atlantic occur south of Iceland, the colder waters of the Nordic Seas
and the Arctic Ocean shape the environment to the North. Steep gradients
in temperature and salinity, various sediment substrates, differences in
food availability, and several shallow and deep ridges create a unique
and diverse environment. This favors high biodiversity through a
patchwork of different habitats within relatively small scales (Meißner
et al., 2014; Brökeland & Svavarsson, 2017; Jochumsen, Schnurr, &
Quadfasel, 2016; Ostmann, Schnurr, & Martínez Arbizu, 2014).
For other isopod (Brix & Svavarsson, 2010) and amphipod (Weisshappel,
2000, 2001; Dauvin et al., 2012) crustaceans, as well as many other
benthic deep-sea animals, the Greenland-Iceland-Faeroe (GIF) Ridge is an
important barrier that limits their distribution. The GIF Ridge has a
saddle depth of about 480 meters between the Faeroe Islands and Iceland
in the south-east, and 620 meters between Greenland and Iceland in the
north-west of Iceland (Hansen & Østerhus, 2000). Near-bottom
temperatures range from 12 °C in the North Atlantic south of the GIF
Ridge, to -0.9 °C deep north of the GIF Ridge (Schnurr et al. 2014;
Jochumsen et al., 2016). Iceland is located in an area especially
susceptible to climatic change (Hanna, Jónsson, & Box, 2006) and hence
is undergoing rapid changes in terms of species distribution and
composition (e.g. Arnason, 2007; D’Alba, Monaghan, & Nager, 2010; Pecl
et al., 2017). Naturally, the waters around Iceland have been influenced
by past climatic changes. During the Last Glacial Maximum, the ice sheet
extended beyond the Icelandic landmass and within reach of the shelf
break at around 300 meters depth (Patton, Hubbard, Bradwell, &
Schomacker, 2017). Around 15 ka BP, the ice sheet broke apart abruptly
due to rising sea levels (Geirsdóttir, Miller, Axford, & Ólafsdóttir,
2009; Norðdahl & Ingólfsson, 2015). Both the southern (i.e.,
Iceland-Faeroe Ridge) and the northern parts of the GIF Ridge (i.e.
Denmark Strait between Iceland and Greenland) were likely influenced by
spreading and contracting ice sheets, potentially limiting the
distribution of benthic marine animals.
The GIF Ridge and associated ecological differences strongly affect the
observed species compositions with most benthic deep-sea species (e.g.,
Isopoda, Amphipoda, Tanaidacea) being confined to one side of the GIF
Ridge (e.g., Hansen, 1908; Negoescu & Svavarsson, 1997; Svavarsson,
Strömberg, & Brattegard, 1993; Weisshappel, 2001; Gudmundsson, 1998;
Stransky & Svavarsson, 2006; Brix, Stransky, et al., 2018; Brix, Lörz,
et al., 2018; Jakiel, Stępień, & Błażewicz, 2018).
An interesting exception is the widespread asellote isopod speciesHaploniscus bicuspis . It occurs in all water masses and in
various sampled depths (300-2900 m) around Iceland, whereas the majority
of congeneric species are confined to the North Atlantic Ocean (except
for H. angustus Lincoln, 1985; Brökeland & Svavarsson, 2017).
This makes H. bicuspis an ideal surrogate to study patterns of
genetic diversity, population genetics and the phylogeographic history
of a benthic deep-sea isopod species. It also offers the possibility to
assess proteomic differences using MALDI-TOF MS (Matrix-Assisted Laser
Desorption/Ionization Time-of-Flight Mass Spectrometry) associated with
ecological differentiation or potential cryptic speciation. Species
identification with MALDI-TOF MS is well-established in microbiology for
determination of bacteria, viruses, and fungi (Nachtigall, Pereira,
Trofymchuk, & Santos, 2020; Singhal, Kumar, Kanaujia, & Virdi, 2015)
and several proof-of-concept studies supported its general applicability
for species delimitation in marine crustaceans (Riccardi et al., 2012;
Laakmann et al., 2013; Bode et al., 2017; Rossel & Martínez Arbizu,
2019), yet to our knowledge no studies on peracarids have been performed
so far. Understanding the taxonomic resolution of proteomic profiling in
invertebrates is subject of ongoing research.
Haploniscus bicuspis was first described from specimens collected
near Norway (Figure 1). Subsequently, Wolff (1962) described two
subspecies based on the shape of the male antennula and pleopod 1 (see
Figure 2): Haploniscus bicuspis bicuspis (including the Norwegian
specimens) and H. bicuspis tepidus (from the Reykjanes Ridge
south-west of Iceland). The latter features a narrower second segment of
the antennula and a laterally rounded distal part of the pleopod 1 (in
contrast to the presence of a distinct corner in H. b. bicuspis;Figure 2). Unfortunately, the type series of H. bicuspiscomprises no adult male and both subspecies of Wolff (1962) currently
hold the status “unaccepted” in WoRMS (Boyko et al., accessed August
2020). It is currently unknown whether the morphological variation
described by Wolff suggests cryptic speciation or intraspecific
variability.
Asellote isopods are particularly interesting for population genetic
studies, as a pelagic larval stage is absent in their life cycle. Like
all Peracarida, they hatch and rear juveniles in a brooding pouch
(marsupium), and the juveniles largely resemble the adult morphology and
lifestyle upon release (Lincoln, 1985). This limits their dispersal
abilities compared to animals with free-swimming pelagic life stages. In
general, locomotion and dispersal abilities in deep-sea asellote isopods
depend on the adult stage (Brix et al., 2020). In the abyssal Pacific,
Haploniscidae showed a mean species range of 183 km and a maximum range
of 1,310 km, with 83 % of the species (n = 24) present in a single area
only (Brix et al. 2020). These distributional ranges are much lower than
for the swimming isopod families Desmosomatidae and Munnopsidae. As a
typical walking haploniscid isopod, the body plan of Haploniscusdoes not show adaptations specific to swimming or burrowing, and the
animals are found in close association with the sediment surface (Brix
et al., 2020, Thiel and Haye, 2006). This suggests comparably poor
dispersal capabilities of the adults as well.
This study aims to explore the genetic diversity withinHaploniscus bicuspis , focusing on aspects of potential cryptic
diversity, population genetics and its phylogeographic history. The
latter is particularly interesting, as it may reveal historic glacial
refugia and postglacial (re-)colonization for a deep-sea species, and
thus potential population genetic consequences of past climatic changes.
To do so, we combined analyses of mitochondrial COI sequences with
thousands of nuclear ddRAD loci. We further assessed proteome-level
differences between populations and putatively revealed cryptic species.
Methods
All H. bicuspis specimens examined were sampled aboard the
research vessels RV Meteor (M85/3), RV Poseidon (POS456) and RV Maria S.
Merian (MSM75) during the IceAGE (2011), IceAGE2 (2013) and IceAGE_RR
(2018) expeditions. The specimens were sorted at the laboratories of the
DZMB (German Centre for Marine Biodiversity Research) and deposited in
the collections of the Center of Natural History (Hamburg, Germany; see
Supplementary Table S1). Specimens were collected using an epibenthic
sledge (EBS, see Brenke, 2005), except for samples from stations 23 and
63, which were collected using a Van Veen grab.
Following type material as loan from the Museum in Copenhagen was
morphologically compared with the newly collected IceAGE specimens:
A dry vial apparently containing 1 dried specimen and labelled as
ZMUC-CRU-5817 from Ingolf st. 78; The slide portion of ZMUC-CRU-5817; a
dry vial containing 2 microvials of H. b. bicuspis from Ingolf
st. 112; a dry vial containing 1 microvial of H. b. bicuspis from
Ingolf st. 139; a vial with about 20 specimens in alcohol labelled as
ZMUC-CRU-346, but also with the following label from Institut für
Polarökologie, Kiel (not Zoological Museum, Copenhagen):Haploniscus bicuspis , Projekt H21-5, Station 333, Gerät EBS,
Sammler A. Brandt.; the non-slide portion of ZMUC-CRU-5817 (NHMD-83661)
consists of 1 syntype of H. b. tepidus in alcohol; the 2 dried
syntypes of this subspecies from Ingolf st. 78 labelled as NHMD-272194;
a dry vial containing 1 microvial of H. b. bicuspis from Ingolf
st. 104; a dry vial containing 1 microvial of H. b. bicuspis from
Ingolf st. 125; ZMUC-CRU-346 (NHMD-78194) with ZMUC supplementary label
(Meteor st. 333, Kolbeinsey Ridge, Iceland, 67°56.5’ N, 18°02.4’ W).
Selected males were visualized using CLSM to assess the shape of the
pleopod 1. Specimens were stained with the fluorescent markers Congo Red
and Acid Fuchsin (1:1), following the methodology outlined in Michels
and Büntzow (2010). For scanning, specimens were transferred to
glycerin. A Leica TCS SPV with a Leica DM5000 B microscope and DPSS
laser (10 mW, 561 nm) was used to perform the CLSM. Scans were recorded
with the LAS AF 2.2. software.
DNA extractions were performed using the mid-sections of the animals,
leaving the cephalothorax and the pleon intact for subsequent
morphological analyses and as partial vouchers in the collections. The
animals were dissected carefully using a micro scissor and the gut was
removed to avoid contaminations. If possible, the same animal was used
for proteomics, COI barcoding and ddRAD, transferring half of the
dissected tissue in a vial for genetics and the other half into a vial
for proteomics. DNA was extracted using the Marine Animal Tissue Genomic
DNA Extraction Kit (Neo Biotech) or the Genomic DNA from tissue kit
(Macherey-Nagel). DNA was eluted in 70 µL elution buffer. Chelex
(Chelating Ion Exchange Resin, Bio-Rad) was used for some of the
animals, for which only the COI fragment was sequenced (for protocol see
Jennings, Golovan, & Brix, 2020).
COI
1 µl DNA was PCR amplified using PuReTaq Ready-To-Go (RTG) PCR Beads (GE
Healthcare) with 1 µl of either dgLCO (GGT CAA CAA ATC ATA AAG AYA TYG
G; Meyer, 2003)/dgHCO (TAA ACT TCA GGG TGA CCA AAR AAY CA; Meyer, 2003)
or LCOJJ (CHACWAAYCATAAAGATATYGG; Astrin & Stüben, 2008)/HCOJJ
(AWACTTCVGGRTGVCCAAARAATCA; Astrin & Stüben 2008) primers and 22 µl
nuclease-free water. The PCR ran at 94 °C for 5 minutes, followed by 38
cycles of denaturation at 94 °C for 45 seconds, annealing at 45 °C for
50 seconds, and elongation at 72 °C for 1 minute. This was followed by
the final elongation at 72 °C for 5 minutes. The resulting fragment
length was assessed by gel electrophoresis using 1.5% TAE gels. Excess
primers were removed using ExoSAP. The PCR products and the
corresponding primers were sent to Macrogen or Eurofins for
bidirectional sequencing.
The resulting sequences were individually checked for their quality
using GENEIOUS Prime version 2019.2.3 and the forward and reverse
sequences were assembled using the de novo assembly function. An
alignment was created using MUSCLE (v 3.8.425, Edgar 2004). To assess
the potential presence of cryptic diversity within H. bicuspis ,
we employed two species delimitation methods: generalized mixed Yule
coalescent model (GMYC; Pons et al., 2006) and Automatic Barcode Gap
Discovery (ABGD; Puillandre, Lambert, Brouillet, & Achaz, 2012). GMYC
was run in R using the single threshold method (Reid & Carstens, 2012).
The GMYC method uses an ultrametric tree as input to delimit clusters of
putative species. The ultrametric tree was generated with BEAST2
(Bouckaert et al., 2019), running the analyses for 30 million
generations, employing the Yule model and including every observed COI
haplotype once. ABGD uses the so-called barcode gap, which corresponds
to the gap between intra- and interspecific genetic distances, to
delimit putative species. We used the web version of ABGD (seehttps://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html)
with standard settings except for a relative gap width of 0.5 and 100
steps and the uncorrected p -distances, pre-calculated with MEGA-X
version 10.0.5 (Kumar, Stecher, Li, Knyaz, & Tamura, 2018). An unrooted
phylogenetic network was calculated with SplitsTree (Huson & Bryant,
2006) to visualize the divergence between putative species (henceforth
called lineages).
To assess the distribution of genetic diversity and the phylogeographic
history, a median-joining haplotype network was generated using the
program Network v 10.0 (see fluxus-engineering.com; Bandelt, Forster, &
Röhl, 1999) and redrawn using InkScape. Furthermore, for each population
(i.e., sampling station) nucleotide diversity (π) and gene diversity (h)
were calculated with Arlequin 3.5.2.2 (Excoffier & Lischer, 2010). The
demographic parameters Tajima’s D and Fu’s F, which assess deviations
from neutrality, were also assessed. Genetic differentiation between
populations was evaluated using pairwise FST, calculated
only among populations of the same lineage and among the very closely
associated lineages ICOI-IIICOI (see
below). For Tajima’s D, Fu’s F, and FST only populations
with at least five individuals present were
included.
ddRAD
Based on DNA yield, a subset of samples was selected for ddRAD
sequencing, only including samples with at least 30 ng DNA. DNA
concentration was measured with a Qubit 3.0 (Invitrogen). If samples had
a lower starting concentration than 150 ng, they were concentrated via
drying. All samples were brought to 24 µl. The protocol described by
Peterson et al. (2012) was mostly adhered to, with a few modifications
(see also Schwentner & Lörz, 2021; Franchini, Monné Parera, Kautt, &
Meyer, 2017).
Samples were grouped in batches of eight by starting DNA concentration.
Restriction digestion was carried out at 37 °C for 4 hours using 3 µl
fastdigest buffer, 1.5 µl fastdigest MspI and fastdigest EcoRI enzymes
each (Thermo Fisher Scientific). Digested DNA was cleaned using 1.5x
volume of magnetic beads and eluted in 21 µl H2O
(AmpliClean Cleanup Kit, Nimagen). MspI and EcoRI adapters were ligated
to the 19 µl digested DNA using 3 µl of each adaptor, 3 µl of 10x T4
ligase buffer, and 2 µl T4 ligase (1-2 Weiss units). 16 EcoRI and 8 MspI
adapter variations with unique barcodes (MspI adapters with four
additional random nucleotides, as described in (Franchini et al., 2017)
were used to provide a unique barcode combination for each sample within
each batch of eight samples (Supplementary Table S1). Ligation commenced
at 22 °C for 1 hour and was heat terminated at 65 °C for 10 minutes.
Afterwards, the eight samples of each batch were pooled, and the pools
were cleaned up using 1.5x volume of magnetic beads, eluting in 30 µl of
TE buffer. DNA fragments of 300 base pairs (bp) were selected with a
BluePippin (Sage Science, +/- 30 bp allowed, “tight” setting) using a
1.5% agarose gel cassette with DF marker R2.
To reduce PCR amplification biases, four PCR replicates were run for
each pool after size selection. PCRs comprised 5 µl 2x Kappa HS HIFI
mix, 0.3 µl of each primer and 4.4 µl library. Forward and reverse
primers included eight different 8 bp indices (Supplementary Table S1)
and were combined to add a unique index combination for each pool. The
two-step PCR program ran at 95 °C for 3 minutes, followed by 17 to 19
cycles of 98 °C for 20 seconds, 72 °C for 25 seconds, and final
elongation at 72 °C for 1 minute. The four replicates of each library
were combined and cleaned up using 1x volume of magnetic beads, eluting
in 18 µl H2O. DNA concentration was measured using a
Qubit and mean size assessed using a Tapestation (Agilent). All samples
were pooled at equal molarity and sent off to Macrogen (South Korea) for
sequencing on one Illumina HiSeq4000 lane (100 bp, paired end).
The reads were pre-demultiplexed by indices by Macrogen upon delivery.
Potential PCR duplicates were removed using the ‘clone_filter’ function
of STACKS (Rochette, Rivera-Colón, & Catchen, 2019) and the
demultiplexing by barcodes was performed with the ‘process_radtags’
function. Assembly of loci was performed using ipyrad (Eaton &
Overcast, 2020). Parameters were optimized in multiple test runs. The
crucial parameters in the final analyses were set to: #14
[clust_threshold] = 0.91, #20 [max_Hs_consens] = 0.1, and
#22 [max_SNPs_locus] = 0.15 (see Supplemental File S1 for full
set of parameters). To further optimize the number of retrieved loci,
two different runs were performed: one including all specimens and one
including all specimens previously delimited into lineages
ICOI-IIICOI. The latter run was
performed to optimize loci recovery for this set of genetically and
geographically closely associated lineages. In all analyses, a
‘populations’ file was included (defining lineages retrieved by COI) and
requiring at least 50% of specimens of each population to be
represented for a locus to be retained. Specimens with less than 50% of
retrieved loci were removed after initial test runs.
To assess if the nuclear ddRAD data support the lineages differentiated
by COI phylogenetic networks, principal component (PCA), Structure and
coancestry analyses were performed on each of the three ddRAD data sets.
Unrooted phylogenetic networks were computed with SplitsTree. PCA and
Structure analyses were run via Python scripts, closely following the
workflows outlined on the ipyrad homepage
(https://ipyrad.readthedocs.io/en/latest/API-analysis/index.html;
visited March 10th, 2020) using the .snps.hdf5 ipyrad output files. A
minimum coverage of 80% was set for each locus. For PCA all available
SNPs were included, but for Structure only one SNP of each locus was
included to reduce artefacts by linkage. Structure analyses ran for
500,000 generations, with a burn-in of 50,000 generations, for k=2 to
k=8 and with five replicates each. In each replicate, one SNP was
randomly selected per locus. The replicates were summarized with CLUMPP
(Jakobsson & Rosenberg, 2007). The best-fitting k was identified based
on ad hoc posterior probability models of [Pr(X|K)]
(Pritchard, Stephens, & Donnelly, 2000) and deltaK (Evanno, Regnaut, &
Goudet, 2005) using the web version of STRUCTURE HARVESTER (see
http://taylor0.biology.ucla.edu/structureHarvester/, accessed July 23rd,
2020; Earl & vonHoldt, 2012). The output of the best-fitting k was
plotted with the online version of StructurePlot v2
(http://omicsspeaks.com/strplot2/;
Ramasamy, Ramasamy, Bindroo, & Naik, 2014) sorting individuals by
similarity. Coancestry analyses used RADpainter and fineRADstructure
(Malinsky, Trucchi, Lawson, & Falush, 2018) closely following the
proposed usage
(https://cichlid.gurdon.cam.ac.uk/fineRADstructure.html,
accessed 01.12.2020). The results were plotted using the provided R
script FinestructureLibrary.R
(https://github.com/millanek/fineRADstructure/blob/master/FinestructureLibrary.R,
accessed 01.12.2020). One great advantage of the coancestry analysis
compared to Structure is the utilization of the complete nuclear
haplotypes instead of a single SNP per locus. The vcf files from ipyrad
were used as input.
Key population genetic and demographic parameters were calculated from
the vcf files generated by ipyrad. Nucleotide diversity, Tajima’s D, the
inbreeding coefficient FIS and pairwise
FST values were calculated for each locus with VCFtools
(Danecek et al., 2011) and then averaged for each lineage and
population. Nucleotide diversities were corrected by the total number of
nucleotides. Heterozygosity per site per individual was reported by
ipyrad and summarized (mean) for each lineage and population. To assess
demographic changes over time, extended Bayesian skyline plots (Figure
5) were calculated with BEAST2 for lineages I-III jointly as these
probably constitute a single species with inter-lineage hybridization
(see discussion). Three runs were performed, one using the COI data set,
and two runs each using 200 randomly chosen ddRAD loci (non-overlapping
among runs), selecting only loci with 5-10 SNPs each. Loci were
retrieved from the *snpsmap ipyrad outputfile. Among loci, the site and
clock models were linked, but the tree models unlinked. The HKY model
was selected with four gamma categories, empirical frequencies and kappa
2.0. A strict clock with a rate of 1 was enforced, as no suitable
substitution rates are available (a few crustacean COI substitution
rates have been published, their applicability to deep-sea Isopoda is
highly questionable). The “Coalescent Extended Bayesian Skyline” prior
was selected for each locus. The weights for ”indicatorSampler.alltrees”
and ”indicatorScaler.alltrees” were set to 5000, of
”EBSPupDownOperator.alltrees” to 3000 and of ”bitflip.alltrees” to
10000. The MCMC chain was run for 100*106 generations,
sampling every 5000th for EBSP. The output was analyzed with the
EBSPAnalyser included in the BEAST2 package, discarding the first 25 %
as burn-in. The final data including the 95% highest probability
density intervals was plotted in R.