A network of 112 bomb crater ponds forming a well-delineated pondscape relatively isolated from other waterbodies is situated in the Kiskunság area, Central Hungary (Fig. 1). These ponds were likely created during World War II by mistargeted bombing on a sodic meadow. They are saline waters mostly dominated by sodium carbonates and hydrocarbonates, and despite being the same age and in close proximity to each other, they vary in many of their environmental and morphological characteristics including hydroperiod, surface area, depth and vegetation cover (Vadet al. , 2017). The ponds are not physically connected thus dispersal is expected to occur via wind, animal vectors or active movement of the organisms.Figure 1. A) Location of the study region near Apaj in Hungary. B) Drone photo of a smaller part of the bomb crater pond network. C) Location of the 54 bomb crater ponds with their closeness centrality scores indicated by a colour gradient (Google Earth Pro, 2014)

Sample collection and environmental data

Fieldwork was carried out between 7 and 9 May 2014. We surveyed all ponds holding water (i.e., omitting the smallest and already dry ephemeral habitats), resulting in a total of 54 ponds. The average distance between these ponds is 285 m and all habitats are situated within a radius of 0.5 km (min. and max. distances between the centre point of ponds: 9–863 m). A range of physical and chemical parameters was recorded on site including water depth (cm), diameter (m), and open surface area (%), emergent and submerged vegetation (%). Conductivity (μS cm−1) and pH were measured using a field multimeter (Eutech CyberScan PCD 650). Per pond, a total of 10 L of water was collected and mixed from 10 randomly chosen points, 1 L of this composite water sample was taken for water chemistry analysis and community sequencing of pro- and microeukaryotes. 1-50 mL (depending on turbidity) of this water sample was filtered through a nitrocellulose membrane filter (Ø 47 mm, 0.22 μm pore size). These filters were stored at −20 °C until DNA extraction. Concentrations of total phosphorus (TP, mg L-1), total suspended solids (TSS, mg L-1), dissolved inorganic nitrogen (DIN, mg L-1), calcium (Ca2+, mg L-1), and chlorophyll a (Chl-a ćg L-1) were measured in the lab as detailed in Vadet al. , (2017).
To collect zooplankton samples, 10 L of water was randomly collected from the open water and sieved through a 45-μm mesh. Macroinvertebrates were sampled using sweep netting standardised to 3 minutes (frame-size: 0.25x0.25 m2, mesh size: 500μm), for which all microhabitats present in a pond were included. Zooplankton and macroinvertebrate samples were preserved in 70% ethanol for further analysis. For the amphibian survey, hand netting maximised to 15 minutes, visual searches, and dip netting for tadpoles and newt larvae were applied. All identified specimens were released back to their habitat. For a more detailed description of the study area and sampling procedures, see Vad et al. (2017)

Sample processing and community datasets

The isolation of community DNA from the membrane filters was carried out using the PowerSoil® DNA Isolation Kit (MO BIO Laboratories Inc.). Prokaryotic 16S and microeukaryotic 18S rRNA gene amplification and sequencing were performed by LGC Genomics (Berlin, Germany). For gene amplification, the following primer pairs were applied: EMBf 515F (GTGYCAGCMGCCGCGGTAA, Parada et al., 2016) – EMBr 806R (GGACTACNVGGGTWTCTAAT, Apprill et al., 2015) for the V4 region of the prokaryotic 16S rRNA gene and UnivF-1183mod (AATTTGACTCAACRCGGG) – UnivR-1443mod (GRGCATCACAGACCTG) (Ray et al., 2016) for the V7 region of the eukaryotic 18S rRNA gene. Sequencing was carried out on an Illumina MiSeq platform. For a detailed description of the entire procedure, see Szabó et al. (2022). Amplicon readsets were analysed using mothur v1.43.0 (Schloss et al. , 2009) involving, sequence processing, taxonomic assignments, and OTU picking with the MiSeq SOP as a reference (http://www.mothur.org/wiki/MiSeq_SOP; Kozich et al. , 2013, downloaded on 12thNovember 2020). Additional quality filtering steps were implemented to minimize the presence of sequence artefacts. These steps included the adjustment of the deltaq parameter to 10 in the ’make.contigs’ command, primer removal from both ends of the sequences, chimera identification and removal using the mothur-implemented version of VSEARCH and the exclusion of singleton reads. De-noising was carried out using mothur’s ‘pre.cluster’ command with the suggested 2 bp difference cutoff. Read alignment and taxonomic assignment were performed using the ARB-SILVA SSU Ref NR 138 reference database with a minimum bootstrap confidence score of 80 (Quast et al. , 2012). Non-primer-specific taxonomic groups (‘Chloroplast’, ‘Mitochondria’ and ‘unknown’) were excluded from the dataset.
OTUs were selected at the 99% similarity threshold. Taxonomic assignment of 18S rRNA gene OTUs was performed using the ‘classify.otu’ command with the PR2 v4.12.0 reference database (Guillou et al. , 2012). 18S rRNA gene OTUs assigned to taxa Streptophyta, Metazoa, Ascomycota, and Basidiomycota were excluded from the microeukaryotic dataset. For the taxonomic assignment of prokaryotic OTUs, the TaxAss software (Rohwer et al. , 2018) was used with default parameters and the FreshTrain (15 June 2020 release) and ARB-SILVA SSU Ref NR 138 databases. Subsequently, both 16S and 18S OTU sets were rarefied to the read number of the sample having the lowest sequence count. Samples BC40 and BC105 were excluded from the eukaryotic dataset due to a low read count after filtering.
Zooplankton was identified to the lowest possible taxonomic level by microscopic analysis (generally to species), except for bdelloid rotifers which were treated as a single group. Macroinvertebrates were also identified to the lowest possible taxonomic level (generally to family, genus and species levels, depending on the higher taxa) using relevant identification keys. As passively dispersing macroinvertebrates were represented only by four taxa (Vad et al. , 2017), they were excluded from the subsequent analyses. For a detailed description of the sample processing and the list of identified taxa, see Vad et al. , (2017).
We focused on the following organism groups in our study (Table S1; Supporting information): prokaryotes, phototrophic microeukaryotes, heterotrophic microeukaryotes, crustacean zooplankton (copepods and cladocerans), rotifer zooplankton, dipterans, other macroinvertebrates, and amphibians-reptilians. For community matrices of prokaryotes and microeukaryotes, we used the rarified 16S and 18S OTU tables. The 18S dataset was split into phototrophs and heterotrophs based on phyla using the ‘phyloseq’ package v1.36.0 (McMurdie & Holmes, 2013) in R 4.1.0 (R Core Team, 2021). Phototrophs mostly included groups capable of photosynthesis, while heterotrophs consisted of the remaining groups. Dipterans were treated separately from other macroinvertebrates as they are considered weak active dispersers (Bilton, Freeland, & Okamura, 2001; Heino, 2013). Other macroinvertebrates included all actively dispersing groups considered intermediate or strong aerial dispersers with flying adults according to Heino (2013).

Statistical analysis

In order to assess the relative importance of environmental and spatial predictors in explaining patterns of taxonomic richness and community similarity, we carried out three sets of variance partitioning analyses using the varpart function of the ‘vegan’ package (Oksanenet al. , 2020), where either network position (in the case of taxonomic richness) or spatial distance (in the case of taxonomic richness and community similarity) was included as the spatial predictor. We used data from all 54 ponds in all three cases, except for phototrophic and heterotrophic microeukaryotes for which data was available from 52 ponds only. In the environmental dataset, we excluded highly correlated (i.e., Pearson’s r>0.75) environmental variables (turbidity, percentage of reed cover) prior to the analyses. Secchi depth and submerged vegetation cover were also excluded, as Secchi depth delivers the same information as TSS but on a less precise scale, while the percentage of submerged vegetation cover almost exclusively contained zeros. Our final environmental dataset thus contained 10 variables, which were tested for normality and those deviating from it (Shapiro-Wilk test p <0.05) were transformed (conductivity, pH, depth - untransformed; DIN, TP, TSS - natural log; Ca2+, surface area - square root; chlorophyll-a- log(x+1); open surface area - cube transformed).
First, we analysed how taxonomic richness is predicted by the local environment and the relative position of ponds within the habitat network. For this, partial multiple linear regressions for each studied organism group were carried out using the rda function of the ‘vegan’ package (Oksanen et al. , 2020). As a response variable, richness was included either as OTU (pro- and microeukaryotes) or taxonomic richness (other groups), but hereinafter, we refer to these as ‘taxonomic richness’ for simplicity. Taxonomic richness data of crustacean zooplankton and amphibians-reptilians were square-root transformed to normalise model residuals (inspected via diagnostic plots), while untransformed values were used in the other organism groups. As environmental predictors, we used the first two axes of a Principal Component Analysis (PCA) constructed for the environmental variables, following z-score standardisation. For all subsequent analyses, site scores of PC1 and PC2 axes (explaining >65% of the total variation, Fig. S1, Supporting information) were extracted and used as environmental data. As a spatial predictor, we calculated the closeness centrality index of each pond (Fig. 1 C) based on the spatial distance matrix of all ponds using the ‘igraph’, ‘fields’ and ‘reshape2’ packages (Csárdi & Nepusz, 2006; Wickham, 2007; Nychkaet al. , 2021). To test for the possible significance of the unique effects of environment and centrality, a permutation test (2000 permutations) was run for each partial multiple linear regression model using the function anova .
We run a second set of variance partitioning (partial multiple linear regression models) for each organism group separately to test if and how taxonomic richness is affected by the environmental variables (PC1 and PC2 axes) and the number of neighbouring ponds. In these models, environmental variables were included as one set of predictors the same way as described above, and the number of ponds within a radius with increasing distance from each local pond over a scale of 0-800 m, with 10 m increments was involved as the spatial predictor. The unique effects of space and environment were calculated the same way as in the first set of variance partitioning 80 times in total for each organism group. The unique effect of space along threshold distance (cut-off distance for calculating pond numbers) was plotted with the help of General Additive Models (GAMs, formula = y ~ s(x, k =5)) for each taxonomic group separately with the package ‘ggplot2’ (Wickham, 2016).
Third, we tested how metacommunity structure is predicted by the local environment and spatial configuration. Since we did not have abundance data for all organism groups, statistical analyses were run on presence-absence community data for comparability. Taxa occurring in less than three ponds were omitted given their minor contribution to the overall similarity in the community dataset. Then, any pond which contained no taxa was removed. Dissimilarity matrices were calculated based on Sørensen dissimilarity with the vegdist function of the ‘vegan’ package (Oksanen et al. , 2020). Here, the initial pool of environmental predictors was represented by the 10 transformed environmental predictors, while the spatial predictors were Moran’s Eigenvector Maps (MEM). These were constructed based on the spatial distance matrix using the dbMEMfunction from the ‘adespatial’ package (Dray et al. , 2022) and only the 15 positive MEMs were retained (Bauman et al. , 2018; Borcard, Gillet, & Legendre, 2018; Fig. S2; Supporting information). To select significant variables of each set of explanatory variables (environment and space), we applied stepwise selection both for the transformed environmental variables and for the positive MEM eigenvectors using the ordistep function (direction=both, perm.max=2000) of the ‘vegan’ package (Oksanen et al. , 2020). These selection steps were done separately for each organism group resulting in different sets of retained environmental variables and MEMs (Table S2, Supporting information). We carried out a set of partial distance-based redundancy analyses (dbRDA) using the capscalefunction of the ‘vegan’ package (Oksanen et al. , 2020) followed by significance testing with a permutation test (2000 permutations) using the function anova .
All statistical analyses were carried out in R 4.1.0 (R Core Team, 2021) and figures were created using the ‘ggplot2’ (Wickham, 2016), ‘pubbr’ (Kassambara, 2020) and ‘car’ packages (Fox & Weisberg, 2019).