Introduction

Deep-sea benthos fosters a high diversity of invertebrates with peracarid crustaceans like Isopoda being particularly species-rich (Wilson & Ahyong, 2015; Brandt et al., 2018). The underlying factors that shaped this diversity and the potential role of past climatic changes or physical barriers, such as deep-sea ridges, are not well understood. Similarly, fundamental population genetic studies are scarce for the deep sea (Taylor & Roterman, 2017). One problem is the limited availability of large-scale datasets in terms of individual numbers and geographic scale. The deep-sea crustacean isopod speciesHaploniscus bicuspis (G. O. Sars, 1877) offers to be an ideal surrogate, as it is one of the few isopod species with circum-Icelandic distribution and large numbers of samples available from the BIOICE (Benthic Invertebrates of Icelandic Waters) and IceAGE (Icelandic marine Animals: Genetics and Ecology) projects (Brix et al., 2014; Meißner, Brix, Halanych, & Jażdżewska, 2018).
The complex interactions of highly diverse water masses make Icelandic waters an interesting location to study the distribution and speciation processes of deep-sea taxa. While the warmer and more saline North Atlantic occur south of Iceland, the colder waters of the Nordic Seas and the Arctic Ocean shape the environment to the North. Steep gradients in temperature and salinity, various sediment substrates, differences in food availability, and several shallow and deep ridges create a unique and diverse environment. This favors high biodiversity through a patchwork of different habitats within relatively small scales (Meißner et al., 2014; Brökeland & Svavarsson, 2017; Jochumsen, Schnurr, & Quadfasel, 2016; Ostmann, Schnurr, & Martínez Arbizu, 2014).
For other isopod (Brix & Svavarsson, 2010) and amphipod (Weisshappel, 2000, 2001; Dauvin et al., 2012) crustaceans, as well as many other benthic deep-sea animals, the Greenland-Iceland-Faeroe (GIF) Ridge is an important barrier that limits their distribution. The GIF Ridge has a saddle depth of about 480 meters between the Faeroe Islands and Iceland in the south-east, and 620 meters between Greenland and Iceland in the north-west of Iceland (Hansen & Østerhus, 2000). Near-bottom temperatures range from 12 °C in the North Atlantic south of the GIF Ridge, to -0.9 °C deep north of the GIF Ridge (Schnurr et al. 2014; Jochumsen et al., 2016). Iceland is located in an area especially susceptible to climatic change (Hanna, Jónsson, & Box, 2006) and hence is undergoing rapid changes in terms of species distribution and composition (e.g. Arnason, 2007; D’Alba, Monaghan, & Nager, 2010; Pecl et al., 2017). Naturally, the waters around Iceland have been influenced by past climatic changes. During the Last Glacial Maximum, the ice sheet extended beyond the Icelandic landmass and within reach of the shelf break at around 300 meters depth (Patton, Hubbard, Bradwell, & Schomacker, 2017). Around 15 ka BP, the ice sheet broke apart abruptly due to rising sea levels (Geirsdóttir, Miller, Axford, & Ólafsdóttir, 2009; Norðdahl & Ingólfsson, 2015). Both the southern (i.e., Iceland-Faeroe Ridge) and the northern parts of the GIF Ridge (i.e. Denmark Strait between Iceland and Greenland) were likely influenced by spreading and contracting ice sheets, potentially limiting the distribution of benthic marine animals.
The GIF Ridge and associated ecological differences strongly affect the observed species compositions with most benthic deep-sea species (e.g., Isopoda, Amphipoda, Tanaidacea) being confined to one side of the GIF Ridge (e.g., Hansen, 1908; Negoescu & Svavarsson, 1997; Svavarsson, Strömberg, & Brattegard, 1993; Weisshappel, 2001; Gudmundsson, 1998; Stransky & Svavarsson, 2006; Brix, Stransky, et al., 2018; Brix, Lörz, et al., 2018; Jakiel, Stępień, & Błażewicz, 2018).
An interesting exception is the widespread asellote isopod speciesHaploniscus bicuspis . It occurs in all water masses and in various sampled depths (300-2900 m) around Iceland, whereas the majority of congeneric species are confined to the North Atlantic Ocean (except for H. angustus Lincoln, 1985; Brökeland & Svavarsson, 2017). This makes H. bicuspis an ideal surrogate to study patterns of genetic diversity, population genetics and the phylogeographic history of a benthic deep-sea isopod species. It also offers the possibility to assess proteomic differences using MALDI-TOF MS (Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry) associated with ecological differentiation or potential cryptic speciation. Species identification with MALDI-TOF MS is well-established in microbiology for determination of bacteria, viruses, and fungi (Nachtigall, Pereira, Trofymchuk, & Santos, 2020; Singhal, Kumar, Kanaujia, & Virdi, 2015) and several proof-of-concept studies supported its general applicability for species delimitation in marine crustaceans (Riccardi et al., 2012; Laakmann et al., 2013; Bode et al., 2017; Rossel & Martínez Arbizu, 2019), yet to our knowledge no studies on peracarids have been performed so far. Understanding the taxonomic resolution of proteomic profiling in invertebrates is subject of ongoing research.
Haploniscus bicuspis was first described from specimens collected near Norway (Figure 1). Subsequently, Wolff (1962) described two subspecies based on the shape of the male antennula and pleopod 1 (see Figure 2): Haploniscus bicuspis bicuspis (including the Norwegian specimens) and H. bicuspis tepidus (from the Reykjanes Ridge south-west of Iceland). The latter features a narrower second segment of the antennula and a laterally rounded distal part of the pleopod 1 (in contrast to the presence of a distinct corner in H. b. bicuspis;Figure 2). Unfortunately, the type series of H. bicuspiscomprises no adult male and both subspecies of Wolff (1962) currently hold the status “unaccepted” in WoRMS (Boyko et al., accessed August 2020). It is currently unknown whether the morphological variation described by Wolff suggests cryptic speciation or intraspecific variability.
Asellote isopods are particularly interesting for population genetic studies, as a pelagic larval stage is absent in their life cycle. Like all Peracarida, they hatch and rear juveniles in a brooding pouch (marsupium), and the juveniles largely resemble the adult morphology and lifestyle upon release (Lincoln, 1985). This limits their dispersal abilities compared to animals with free-swimming pelagic life stages. In general, locomotion and dispersal abilities in deep-sea asellote isopods depend on the adult stage (Brix et al., 2020). In the abyssal Pacific, Haploniscidae showed a mean species range of 183 km and a maximum range of 1,310 km, with 83 % of the species (n = 24) present in a single area only (Brix et al. 2020). These distributional ranges are much lower than for the swimming isopod families Desmosomatidae and Munnopsidae. As a typical walking haploniscid isopod, the body plan of Haploniscusdoes not show adaptations specific to swimming or burrowing, and the animals are found in close association with the sediment surface (Brix et al., 2020, Thiel and Haye, 2006). This suggests comparably poor dispersal capabilities of the adults as well.
This study aims to explore the genetic diversity withinHaploniscus bicuspis , focusing on aspects of potential cryptic diversity, population genetics and its phylogeographic history. The latter is particularly interesting, as it may reveal historic glacial refugia and postglacial (re-)colonization for a deep-sea species, and thus potential population genetic consequences of past climatic changes. To do so, we combined analyses of mitochondrial COI sequences with thousands of nuclear ddRAD loci. We further assessed proteome-level differences between populations and putatively revealed cryptic species.
Methods
All H. bicuspis specimens examined were sampled aboard the research vessels RV Meteor (M85/3), RV Poseidon (POS456) and RV Maria S. Merian (MSM75) during the IceAGE (2011), IceAGE2 (2013) and IceAGE_RR (2018) expeditions. The specimens were sorted at the laboratories of the DZMB (German Centre for Marine Biodiversity Research) and deposited in the collections of the Center of Natural History (Hamburg, Germany; see Supplementary Table S1). Specimens were collected using an epibenthic sledge (EBS, see Brenke, 2005), except for samples from stations 23 and 63, which were collected using a Van Veen grab.
Following type material as loan from the Museum in Copenhagen was morphologically compared with the newly collected IceAGE specimens:
A dry vial apparently containing 1 dried specimen and labelled as ZMUC-CRU-5817 from Ingolf st. 78; The slide portion of ZMUC-CRU-5817; a dry vial containing 2 microvials of H. b. bicuspis from Ingolf st. 112; a dry vial containing 1 microvial of H. b. bicuspis from Ingolf st. 139; a vial with about 20 specimens in alcohol labelled as ZMUC-CRU-346, but also with the following label from Institut für Polarökologie, Kiel (not Zoological Museum, Copenhagen):Haploniscus bicuspis , Projekt H21-5, Station 333, Gerät EBS, Sammler A. Brandt.; the non-slide portion of ZMUC-CRU-5817 (NHMD-83661) consists of 1 syntype of H. b. tepidus in alcohol; the 2 dried syntypes of this subspecies from Ingolf st. 78 labelled as NHMD-272194; a dry vial containing 1 microvial of H. b. bicuspis from Ingolf st. 104; a dry vial containing 1 microvial of H. b. bicuspis from Ingolf st. 125; ZMUC-CRU-346 (NHMD-78194) with ZMUC supplementary label (Meteor st. 333, Kolbeinsey Ridge, Iceland, 67°56.5’ N, 18°02.4’ W).
Selected males were visualized using CLSM to assess the shape of the pleopod 1. Specimens were stained with the fluorescent markers Congo Red and Acid Fuchsin (1:1), following the methodology outlined in Michels and Büntzow (2010). For scanning, specimens were transferred to glycerin. A Leica TCS SPV with a Leica DM5000 B microscope and DPSS laser (10 mW, 561 nm) was used to perform the CLSM. Scans were recorded with the LAS AF 2.2. software.
DNA extractions were performed using the mid-sections of the animals, leaving the cephalothorax and the pleon intact for subsequent morphological analyses and as partial vouchers in the collections. The animals were dissected carefully using a micro scissor and the gut was removed to avoid contaminations. If possible, the same animal was used for proteomics, COI barcoding and ddRAD, transferring half of the dissected tissue in a vial for genetics and the other half into a vial for proteomics. DNA was extracted using the Marine Animal Tissue Genomic DNA Extraction Kit (Neo Biotech) or the Genomic DNA from tissue kit (Macherey-Nagel). DNA was eluted in 70 µL elution buffer. Chelex (Chelating Ion Exchange Resin, Bio-Rad) was used for some of the animals, for which only the COI fragment was sequenced (for protocol see Jennings, Golovan, & Brix, 2020).

COI

1 µl DNA was PCR amplified using PuReTaq Ready-To-Go (RTG) PCR Beads (GE Healthcare) with 1 µl of either dgLCO (GGT CAA CAA ATC ATA AAG AYA TYG G; Meyer, 2003)/dgHCO (TAA ACT TCA GGG TGA CCA AAR AAY CA; Meyer, 2003) or LCOJJ (CHACWAAYCATAAAGATATYGG; Astrin & Stüben, 2008)/HCOJJ (AWACTTCVGGRTGVCCAAARAATCA; Astrin & Stüben 2008) primers and 22 µl nuclease-free water. The PCR ran at 94 °C for 5 minutes, followed by 38 cycles of denaturation at 94 °C for 45 seconds, annealing at 45 °C for 50 seconds, and elongation at 72 °C for 1 minute. This was followed by the final elongation at 72 °C for 5 minutes. The resulting fragment length was assessed by gel electrophoresis using 1.5% TAE gels. Excess primers were removed using ExoSAP. The PCR products and the corresponding primers were sent to Macrogen or Eurofins for bidirectional sequencing.
The resulting sequences were individually checked for their quality using GENEIOUS Prime version 2019.2.3 and the forward and reverse sequences were assembled using the de novo assembly function. An alignment was created using MUSCLE (v 3.8.425, Edgar 2004). To assess the potential presence of cryptic diversity within H. bicuspis , we employed two species delimitation methods: generalized mixed Yule coalescent model (GMYC; Pons et al., 2006) and Automatic Barcode Gap Discovery (ABGD; Puillandre, Lambert, Brouillet, & Achaz, 2012). GMYC was run in R using the single threshold method (Reid & Carstens, 2012). The GMYC method uses an ultrametric tree as input to delimit clusters of putative species. The ultrametric tree was generated with BEAST2 (Bouckaert et al., 2019), running the analyses for 30 million generations, employing the Yule model and including every observed COI haplotype once. ABGD uses the so-called barcode gap, which corresponds to the gap between intra- and interspecific genetic distances, to delimit putative species. We used the web version of ABGD (seehttps://bioinfo.mnhn.fr/abi/public/abgd/abgdweb.html) with standard settings except for a relative gap width of 0.5 and 100 steps and the uncorrected p -distances, pre-calculated with MEGA-X version 10.0.5 (Kumar, Stecher, Li, Knyaz, & Tamura, 2018). An unrooted phylogenetic network was calculated with SplitsTree (Huson & Bryant, 2006) to visualize the divergence between putative species (henceforth called lineages).
To assess the distribution of genetic diversity and the phylogeographic history, a median-joining haplotype network was generated using the program Network v 10.0 (see fluxus-engineering.com; Bandelt, Forster, & Röhl, 1999) and redrawn using InkScape. Furthermore, for each population (i.e., sampling station) nucleotide diversity (π) and gene diversity (h) were calculated with Arlequin 3.5.2.2 (Excoffier & Lischer, 2010). The demographic parameters Tajima’s D and Fu’s F, which assess deviations from neutrality, were also assessed. Genetic differentiation between populations was evaluated using pairwise FST, calculated only among populations of the same lineage and among the very closely associated lineages ICOI-IIICOI (see below). For Tajima’s D, Fu’s F, and FST only populations with at least five individuals present were included.

ddRAD

Based on DNA yield, a subset of samples was selected for ddRAD sequencing, only including samples with at least 30 ng DNA. DNA concentration was measured with a Qubit 3.0 (Invitrogen). If samples had a lower starting concentration than 150 ng, they were concentrated via drying. All samples were brought to 24 µl. The protocol described by Peterson et al. (2012) was mostly adhered to, with a few modifications (see also Schwentner & Lörz, 2021; Franchini, Monné Parera, Kautt, & Meyer, 2017).
Samples were grouped in batches of eight by starting DNA concentration. Restriction digestion was carried out at 37 °C for 4 hours using 3 µl fastdigest buffer, 1.5 µl fastdigest MspI and fastdigest EcoRI enzymes each (Thermo Fisher Scientific). Digested DNA was cleaned using 1.5x volume of magnetic beads and eluted in 21 µl H2O (AmpliClean Cleanup Kit, Nimagen). MspI and EcoRI adapters were ligated to the 19 µl digested DNA using 3 µl of each adaptor, 3 µl of 10x T4 ligase buffer, and 2 µl T4 ligase (1-2 Weiss units). 16 EcoRI and 8 MspI adapter variations with unique barcodes (MspI adapters with four additional random nucleotides, as described in (Franchini et al., 2017) were used to provide a unique barcode combination for each sample within each batch of eight samples (Supplementary Table S1). Ligation commenced at 22 °C for 1 hour and was heat terminated at 65 °C for 10 minutes. Afterwards, the eight samples of each batch were pooled, and the pools were cleaned up using 1.5x volume of magnetic beads, eluting in 30 µl of TE buffer. DNA fragments of 300 base pairs (bp) were selected with a BluePippin (Sage Science, +/- 30 bp allowed, “tight” setting) using a 1.5% agarose gel cassette with DF marker R2.
To reduce PCR amplification biases, four PCR replicates were run for each pool after size selection. PCRs comprised 5 µl 2x Kappa HS HIFI mix, 0.3 µl of each primer and 4.4 µl library. Forward and reverse primers included eight different 8 bp indices (Supplementary Table S1) and were combined to add a unique index combination for each pool. The two-step PCR program ran at 95 °C for 3 minutes, followed by 17 to 19 cycles of 98 °C for 20 seconds, 72 °C for 25 seconds, and final elongation at 72 °C for 1 minute. The four replicates of each library were combined and cleaned up using 1x volume of magnetic beads, eluting in 18 µl H2O. DNA concentration was measured using a Qubit and mean size assessed using a Tapestation (Agilent). All samples were pooled at equal molarity and sent off to Macrogen (South Korea) for sequencing on one Illumina HiSeq4000 lane (100 bp, paired end).
The reads were pre-demultiplexed by indices by Macrogen upon delivery. Potential PCR duplicates were removed using the ‘clone_filter’ function of STACKS (Rochette, Rivera-Colón, & Catchen, 2019) and the demultiplexing by barcodes was performed with the ‘process_radtags’ function. Assembly of loci was performed using ipyrad (Eaton & Overcast, 2020). Parameters were optimized in multiple test runs. The crucial parameters in the final analyses were set to: #14 [clust_threshold] = 0.91, #20 [max_Hs_consens] = 0.1, and #22 [max_SNPs_locus] = 0.15 (see Supplemental File S1 for full set of parameters). To further optimize the number of retrieved loci, two different runs were performed: one including all specimens and one including all specimens previously delimited into lineages ICOI-IIICOI. The latter run was performed to optimize loci recovery for this set of genetically and geographically closely associated lineages. In all analyses, a ‘populations’ file was included (defining lineages retrieved by COI) and requiring at least 50% of specimens of each population to be represented for a locus to be retained. Specimens with less than 50% of retrieved loci were removed after initial test runs.
To assess if the nuclear ddRAD data support the lineages differentiated by COI phylogenetic networks, principal component (PCA), Structure and coancestry analyses were performed on each of the three ddRAD data sets. Unrooted phylogenetic networks were computed with SplitsTree. PCA and Structure analyses were run via Python scripts, closely following the workflows outlined on the ipyrad homepage (https://ipyrad.readthedocs.io/en/latest/API-analysis/index.html; visited March 10th, 2020) using the .snps.hdf5 ipyrad output files. A minimum coverage of 80% was set for each locus. For PCA all available SNPs were included, but for Structure only one SNP of each locus was included to reduce artefacts by linkage. Structure analyses ran for 500,000 generations, with a burn-in of 50,000 generations, for k=2 to k=8 and with five replicates each. In each replicate, one SNP was randomly selected per locus. The replicates were summarized with CLUMPP (Jakobsson & Rosenberg, 2007). The best-fitting k was identified based on ad hoc posterior probability models of [Pr(X|K)] (Pritchard, Stephens, & Donnelly, 2000) and deltaK (Evanno, Regnaut, & Goudet, 2005) using the web version of STRUCTURE HARVESTER (see http://taylor0.biology.ucla.edu/structureHarvester/, accessed July 23rd, 2020; Earl & vonHoldt, 2012). The output of the best-fitting k was plotted with the online version of StructurePlot v2 (http://omicsspeaks.com/strplot2/; Ramasamy, Ramasamy, Bindroo, & Naik, 2014) sorting individuals by similarity. Coancestry analyses used RADpainter and fineRADstructure (Malinsky, Trucchi, Lawson, & Falush, 2018) closely following the proposed usage (https://cichlid.gurdon.cam.ac.uk/fineRADstructure.html, accessed 01.12.2020). The results were plotted using the provided R script FinestructureLibrary.R (https://github.com/millanek/fineRADstructure/blob/master/FinestructureLibrary.R, accessed 01.12.2020). One great advantage of the coancestry analysis compared to Structure is the utilization of the complete nuclear haplotypes instead of a single SNP per locus. The vcf files from ipyrad were used as input.
Key population genetic and demographic parameters were calculated from the vcf files generated by ipyrad. Nucleotide diversity, Tajima’s D, the inbreeding coefficient FIS and pairwise FST values were calculated for each locus with VCFtools (Danecek et al., 2011) and then averaged for each lineage and population. Nucleotide diversities were corrected by the total number of nucleotides. Heterozygosity per site per individual was reported by ipyrad and summarized (mean) for each lineage and population. To assess demographic changes over time, extended Bayesian skyline plots (Figure 5) were calculated with BEAST2 for lineages I-III jointly as these probably constitute a single species with inter-lineage hybridization (see discussion). Three runs were performed, one using the COI data set, and two runs each using 200 randomly chosen ddRAD loci (non-overlapping among runs), selecting only loci with 5-10 SNPs each. Loci were retrieved from the *snpsmap ipyrad outputfile. Among loci, the site and clock models were linked, but the tree models unlinked. The HKY model was selected with four gamma categories, empirical frequencies and kappa 2.0. A strict clock with a rate of 1 was enforced, as no suitable substitution rates are available (a few crustacean COI substitution rates have been published, their applicability to deep-sea Isopoda is highly questionable). The “Coalescent Extended Bayesian Skyline” prior was selected for each locus. The weights for ”indicatorSampler.alltrees” and ”indicatorScaler.alltrees” were set to 5000, of ”EBSPupDownOperator.alltrees” to 3000 and of ”bitflip.alltrees” to 10000. The MCMC chain was run for 100*106 generations, sampling every 5000th for EBSP. The output was analyzed with the EBSPAnalyser included in the BEAST2 package, discarding the first 25 % as burn-in. The final data including the 95% highest probability density intervals was plotted in R.