Materials and methods

Study sites

We sampled a total of 35 lakes and reservoirs across the temperate zone of China (latitude 34.60° to 46.06°, longitude 85.69° to 124.29°, Figure 2). The study area covers more than 5 million km2, including a range of landforms such as mountains, hills, plateaus, and plains with altitude ranging from 22.61 m to 4818.10 m, which have large precipitation and temperature gradients. The total area of all the lakes is 508,288.1 ha, ranges from about 70 to 98,800 ha. Mean depth of the lakes ranges from 0.05 to 30.00 m. Most of the lakes are located in the arid and semi-arid region of China, which are relatively independent wetland ecosystems (Wang & Dou, 1998). They provide critical habitats for diverse species (e.g. important breeding and staging sites for waterbirds, some are also used as wintering ground, like Hengshui Lake). The environment in the semi-arid area is sensitive to climate variation (Gong, Shi, & Wang, 2004). Human uses of the lakes mainly include fishery, tourism, water source for irrigation, and reed harvest.

Data collection

Data were collected from 35 lakes and reservoirs during the summers of 2011-2016. The lakes were sampled one to three times, with most of them sampled once in summer. In each lake, water samples were taken on the same day as waterbirds survey during daylight hours.

Waterbirds surveys

Waterbirds were surveyed using transect line method to cover the lakes. Fixed transects with variable lengths were established in each lake and reservoir. Based on the size of each lake and reservoir, the transect length varied from 1 to 5 km and the perpendicular searching distance varied from 0.1 to 0.6 km. We surveyed the transects on foot with constant speed by using binoculars (8×42) and telescopes (Swarovski ATS 80 HD 20-60 × 80). To increase detectability, the surveys were carried out on clear days during daytime and there were at least two fully trained observers for each transect. Visual and/or verbal communication enabled us to avoid duplicate recordings of the same flock of waterbirds by the observers. Waterbirds were all identified to species level. Lists of waterbirds species recorded by the nature reserves and published papers were also collected and incorporated.

Functional trait data

To estimate functional diversity, we collected biometric trait data that describe key ecological attributes of species from published literature for all 148 species. We collected 16 traits for each bird species, including generation length (the average age of breeding individuals), clutch size (number of eggs), incubation times, body size, body mass, wingspan, migratory types (full migration, altitudinal migration, and not migration), breeding habitat range, diet (percentage of scavenger, and invertebrate, fish, seed, fruit and other plant materials), habitat mode (percentage usage of water, riparian, and ground), and pelagic species (0/1) (Table 1). These selected traits measure many aspects of resource used by birds, such as the quantity and the quality of resource consumed (Petchey & Gaston, 2007), as well as the fitness of the species such as reproduction strategies and generation length (Luck, Andrew, Lisa, & Davies, 2013). For example, body mass is highly related to birds’ energy requirements (Blendinger & Villegas, 2011). Diet is related to ecosystem functions such as seed dispersal and food-web structure (Sekercioglu, 2006). The main sources of the trait measurements are Planet of Birds, BirdLife International, and a database compiled by Wilman et al. (2014) (Table 1). Any missing data were filled based on information in the ornithological literature, such as the Handbook of the Birds of the World (http://www.hbw.com/). A brief summary is available in the Appendix S1.
We used functional richness (FRic) as the measure of functional diversity. FRic represents the amount of functional space filled by the community (Villéger, Mason, & Mouillot, 2008). A community with high FRic would be one with many traits (and potentially high utilization of resources), whereas one with lower FRic might have some traits missing, suggesting that some niches are completely empty (Prescott et al., 2016).

Lake chemical characteristics and morphology

Indicators for lake chemical characteristics include total nitrogen (TN), total phosphorus (TP) and chlorophylla (Chl-ɑ). Water samples for TN and TP in each lake were collected and preserved using 100-ml jars on the same day with waterbirds surveys. All the water samples were sent to and processed in lab by using Ultrospec 6300 pro spectrophotometer (GE Healthcare, America). Water samples for chlorophylla (Chl-ɑ) concentration were collected synchronously on site. We used 2-litre bottles to collect water samples in each lake and reservoir and then filtered the water by using GF/C filter membrane, each filter membrane was filtered with 500 mL water and three filter membranes (3*500 mL) were requested for every sampling point. All water samples and filter membranes were preserved in <5 ℃ refrigerator and sent to lab for further test by using Ultrospec 6300 pro spectrophotometer (GE Healthcare, America). Sample size ranges from 3 to 18 based on the area of lake. We set three, six, nine, 12 and 18 sampling points for lakes with area < 10,000 ha, 10,000-20,000 ha, 20,001-30,000 ha, 30,001-50,000 ha and >50,000 ha, respectively. Surface area of each lake and reservoir was calculated from remote sensing interpretation by using Google Earth images from year 2013 to 2016.

Climatic and geographical variables

To define the climate of the lakes, we used the 30 seconds WorldClim bioclimatic variables, which were downloaded from WorldClim website, publicly available athttps://www.worldclim.org/data/worldclim21.html. The following bioclimatic variables were included in the study based on collinearity test (see below): mean diurnal range (T1, mean of monthly (max temp-min temp)), mean temperature of the wettest quarter (T2), mean temperature of the warmest quarter (T3), annual total precipitation (P1), precipitation of the driest month (P2), precipitation of the warmest quarter (P3), and precipitation of the coldest quarter (P4). The variables of climate are spatial means of each lake.
Altitudes of the sampled lakes were retrieved from GPS Visualizer on site. Mean geographical coordinates of each site were calculated from remote sensing interpretation by using Google Earth images of each lake and reservoir between years 2013 to 2016. The data of the 35 lakes and reservoirs used in this paper are provided in the Appendix S2.

Data analyses

Our working hypothesis is that large-scale pattern of high trophic level species (waterbirds in this study) diversity can be interpreted as a multilevel biogeographical pattern. That is, we envision that the species diversity of a lake is to a large extent determined by local environmental factors (e.g. nutrients, lake surface area), and that these local environmental factors in turn vary along biogeographical gradients. This nested structure is reflected in our data analysis.
We combined our own survey data and historical records to compile a waterbirds community dataset for each lake. FD was then calculated based on the compiled community data set. It is necessary to have several years of observation because the species observed in a lake could vary from year to year. In addition, we used the total chlorophylla proxy of primary productivity (Eppley, Stewart, Abbott, & Heyman, 1985; Falkowski & Raven, 2013).

Multiple linear regression –relationships between geography, lake productivity and waterbirds functional diversity

We used stepwise multiple regression with forward selection of variables to identify the most important local environmental variables explaining waterbirds FD in the 35 lakes and reservoirs. Environmental variables were log10-transformed if this resulted in a more uniform spread of data points. In addition, collinearity between variables was also checked based on variance inflation factor by using the VIF function with package “car ” (Fox et al., 2012) in R software (R Development Core Team, 2019). Environmental variables tested in the multiple regression analysis were surface area of the lake (ha), total nitrogen (mg.L-1), total phosphorus (mg.L-1) and lake productivity (mg.m-3), mean diurnal range (℃), mean temperature of the wettest quarter (℃), mean temperature of the warmest quarter (℃), annual total precipitation (mm), precipitation of driest month (mm), and precipitation of the warmest quarter (mm). We used the same stepwise multiple regression approach to investigate how each of these local environmental variables varied with the geographical coordinates of latitude, longitude and altitude (m). The analysis was performed using R version 3.6.1 (R Development Core Team, 2019).

Structural equation modeling – exploring the drivers of lake productivity and waterbirds richness

To gain further understanding of the linkages between waterbirds FD, productivity and environmental variables, we modeled lake productivity and waterbirds functional richness (FRic) in an integrated manner within the structural equation modeling (SEM) framework. SEM is a collection of procedures whereby complex hypotheses, particularly those involving networks of path relations, are evaluated against multivariate data (Grace, 2006). The resulting estimates for path coefficients in structural equation models represent the implied sensitivities of response variables to variations in individual predictors (Grace et al., 2012). We started with an initial SEM that included all plausible pathways between waterbirds functional diversity, the selected set of local environmental variables, and the geographical coordinates of the lakes. Our first attempt with the initial model revealed that it was under-identified, meaning that there was some redundancy such that it was not possible to estimate all of the model’s parameters. We therefore investigated the statistical relationships among the variables included in the model to identify possible redundancies. Subsequently, we fitted two separate SEMs for lake productivity and waterbirds functional diversity, and an integrated SEM revealing the correlation between the two factors. The significance of each path-coefficient was tested 1000 bootstrapped resamples. We reported the standardized coefficients that can be directly compared to make inferences about the relative strength of relationships (Grace & Bollen, 2005). The structural equation analyses were performed using R version 3.6.1 (R Development Core Team, 2019) with package “Lavaan ” version 3.1-3 (Rosseel et al., 2015).

Results

A total of 148 species, belonging to six orders and 19 families were recorded in this study (Appendix S1). The highest number of species was from the family Scolopacidae (37 species), followed by Anatidae (34 species). Some families such as Phalacrocoracidae, Threskiornithidae and Rostratulidae only had one species each. In addition, among the 35 lakes and reservoirs surveyed, number of species ranged from four to 113 species, with the highest species richness recorded in lakes and reservoirs located in Hubei Province and Beijing (eastern China), and lowest species richness recorded in lakes and reservoirs located in Inner Mongolia and Qinghai Province (western China) (Appendix S2).

Geographical patterns of waterbirds species richness and functional diversity

Waterbirds species richness showed remarkable geographical distribution gradients (Figure 3a). Generally, higher waterbirds species richness was found in lakes of the northern and eastern part of China. Lower species richness was found at higher altitude in the west. Similar pattern was found in waterbirds FRic (Figure 3b). Linear regression showed that FRic increased with latitude and longitude (Figure 4a, 3b), but decreased with altitude (Figure 4c). Furthermore, multiple regression analysis showed that lake area, mean temperature of the wettest quarter, annual total precipitation, and precipitation of the warmest quarter had significant effects on waterbirds FRic, explaining greater than 71% of the total variation in the dataset (Table 1). More specifically, FRic increased with lake area, mean temperature of the wettest quarter, and annual total precipitation, but decreased with precipitation of the warmest quarter.
Site-specific environmental variables including lake productivity, mean diurnal range, mean temperature of the wettest quarter, mean temperature of the warmest quarter, annual total precipitation, precipitation of the driest month, precipitation of the warmest quarter and precipitation of the coldest quarter showed large-scale variation across the landscape of north China. For example, lake productivity decreased with altitude; mean diurnal range, mean temperature of the wettest quarter, and mean temperature of the warmest quarter decreased with latitude, longitude and altitude; while annual total precipitation of the driest month, precipitation of the warmest quarter and precipitation of the coldest quarter increased with longitude (Table 2).

Final structural equation models

Three SEMs were fitted: one for lake productivity, one for waterbirds FD, and an integrated one that included both lake productivity and waterbirds FD to explicitly test the causal link between them. All three models converged successfully (Figure 5-7), and achieved adequate fit with CFI (Comparative fit index) of 0.90, 0.98 and 0.90 for the productivity SEM, waterbird FD SEM and integrated SEM, respectively (Hu & Bentler, 1999). These models had similar hierarchical structure, in which the responsible variable (lake productivity or waterbirds FD) was modelled by a few latent factors. These latent factors were in term defined by the observed variables. In all models, two latent factors were common: geographic position and climate. The geographic position was measured by the latitude and longitude of the central point of the lake and its average elevation. The climate was defined by using spatial mean of seven bioclimatic variables with low correlation (checked using VIF). In addition, a causality from geographic position to climate condition was specified in all SEMs.

Determinates of Lake Productivity

The three latent factors (i.e. geographic position, climate, and nutrient in the lake) were all included in the SEM. The three factors collectively explained 93% of the variation in lake productivity (Figure 5). Lake nutrient level, which was defined by water TP and TN concentration, had the largest effects on productivity, followed by climate (standardized path coefficient = 0.93 and 0.77 for nutrient and climate, respectively, Figure 5). The effects of geographic position on lake productivity was indirect and realized through its influence on climatic condition; and had a standardized path coefficient of 0.16 (0.77×0.21).
Lake productivity decreased with elevation (path coefficient = -0.93×0.77×0.07 =-019) and increased with both latitude and longitude with comparable path coefficient (0.12 and 0.13 for latitude and longitude, respectively, Figure 5). For the measured bioclimatic variables, while all variables related to precipitation as well as T2 and T3 had positive effect on lake productivity, the impact of temperature diurnal range was negative with a path coefficient of -0.22 (-0.83×0.27). Lake productivity was positively related with nutrient; and the effect of TP level was slightly larger than that of TN (standardized path coefficient 0.52 and 0.40 for TP and TN, respectively, Figure 5).

Determinates of waterbirds functional diversity

Similar to the lake productivity model, the three latent factors (geographic position, climate, and lake morphology in the lake) were all included in the SEM, collectively explaining 52% of the variation in waterbirds FD (Figure 6). In comparison with lake morphology, climate had higher effects on waterbirds FD (path coefficient = 0.69 and 0.21 for climate and morphology, respectively, Figure 6). The effects of geographic position on waterbirds distribution was indirect through climate with an effect of 0.57 (0.82×0.69 =0.56). Based on the standardized model coefficients, climate condition had the strongest effects on waterbirds FD, followed by geographic position and lake morphology (the coefficients are 0.69, 0.57, and 0.21 for climate, geographic position and lake morphology, respectively, Figure 6).
Waterbirds FD decreased with elevation (path coefficient = -0.59×0.82×0.69 =-0.33), and increased with both latitude and longitude, with longitude showing much stronger effect than latitude (path coefficients were 0.61 and 0.23 respectively, Figure 6). For the measured bioclimatic variables, similar to the productivity model, while all precipitation variables ad well as T2 and T3 had positive effect on waterbirds FD, the impact of temperature diurnal range was negative. In addition, the four precipitation variables had comparable effects on FD (Figure 6). The model also indicated a positive effect of lake area (standardized path coefficient is 0.21, Figure 6).

Relationship between waterbirds functional diversity and lake productivity

The final integrated SEM combined lake productivity and waterbirds FD together and included an explicit pathway from productivity to waterbirds FD (Figure 7). The model had reasonably adequate explanation power for the three key latent variables (R2 for climate, waterbirds FD and lake productivity was 0.73, 0.56 and 0.96, respectively. Figure 7). From the fitted path coefficients, climate had the greatest effect on waterbirds FD (0.65), which were similar to that of the waterbirds FD model (Figure 6). The effects of geographic position (0.60) was relatively strong, while the effects of other variables, including lake morphology (0.21), lake productivity (0.17) and nutrient (0.16, indirectly via its effect on lake productivity) were relatively weak.