2. Materials and Methods
2.1 Study sites and field data collection
Fieldwork was conducted in the period March to July of 2016-2019 at 11
woodland sites in the UK. Sites selected were pre-existing Hawfinch
ringing studies study areas within the Wye Valley, Dolgellau, Cardiff,
the New Forest and Norfolk (Figure 1). The artificial feed sites used to
attract Hawfinches for capture have been operational for a number of
years within regions of Hawfinch population strongholds (Clements, 2013;
Kirby et al., 2018). Study s ites were broadly typical of
British mixed broadleaved woodland, with sites in the Wye Valley and
north Wales dominated by beech, oak and ash (Fraxinus excelsior ,
Linnaeus). The study site located in Norfolk was a mixed woodland
consisting of lime (Tilia sp., Linnaeus), ash and maples
(Acer sp., Linnaeus), while the New Forest site was dominated by
oak, with an understorey flora comprising of Holly (Ilex sp.,
Linnaeus) and bramble (Rubus sp., Linnaeus). All site locations
are approximate for anonymity.
Hawfinches were caught by trained bird ringers, operating under British
Trust for Ornithology (BTO) approved licences using either mist or
whoosh nets. For each bird captured morphometric data and time of
capture were recorded, including maximum chord wing length, sex and body
mass (Svensson, 1992). Hawfinches were individually placed within a
clean, paper bag which was then placed inside a cotton bag and left for
10-20 minutes until the bird defecated. To avoid excessive stress, if
birds had not defecated within this time frame they were processed and
released. Each faecal sample was removed from the paper bags using a
new, clean plastic toothpick to minimise contamination. Samples were
placed in separate 2ml Eppendorf tubes, and frozen to
-20oC at 1-8h after collection.
2.2 Dietary analysis
DNA was extracted from a total of 365 faecal samples using the Qiagen
QIAamp DNA Stool Mini Kit (Qiagen, UK) with modifications designed to
improve DNA yield from avian faeces, following Shutt et al.(2020). Samples were extracted in batches of 23 with an additional
negative control containing no DNA. We used primers UniplantF,
5′-TGTGAATTGCARRATYCMG-3′ and UniPlantR 5′-CCCGHYTGAYYTGRGGTCDC-3′ to
amplify a 187-387-bp fragment covering the ITS2 region of plant nuclear
DNA (Moorhouse-Gann et al., 2018). For amplification of invertebrate
DNA, mlCOIintF, 5’–GGWACWGGWTGAACWGTWTAYCCYCC-3’ (Leray et al.2013) and Nancy 5’-ACTAGCAGTACCCGGTAAAATTAAAATATAAACTTC-3’, (Simonet al. 1992) were used following selection and modification by
Stockdale (2018) for amplification of a 306-bp fragment of the COI
region. Primer sets were validated by Stockdale (2018) to ensure DNA
amplification of the expected range of taxa. A two-stage PCR process
involved initial amplification of the target regions followed by the
addition of a unique combination of 10-bp molecular identifier tags
(MID-tags), with samples having a unique pairing of forward and reverse
tags for subsequent sample identification. Within each PCR 96-well
plate, 12 negative (extraction and PCR) and two positive controls were
included following Taberlet et al. (2018). Amplicons were
multiplexed into five pools, each containing between 63 and 186 samples.
Library preparation for Illumina sequencing was undertaken via NEXTflex
Rapid DNA-Seq kit (Bioo Scientific, Austin, USA), with a unique adapter
added to each pool. Pools were diluted to 4nM and quantified using Qubit
dsDNA High‐sensitivity Assay Kits. Finally, the diluted pools were
combined equimolarly and sequenced on a MiSeq desktop sequencer via a v2
chip with 2 x 250bp paired-end reads (expected capacity 24-30,000,000
reads). Due to the unbalanced nature of the amplicon libraries, a 15%
PhiX buffer was added to the sequencing run in order to improve
cross-talk and phasing calculations.
2.3 Identification of dietary taxa
The Illumina run generated 6,328,388 and 12,307,560 ITS2 and COI reads
respectively. All reads were checked for truncation, quality checked,
filtered and assigned taxonomic identification following Drake et al.,
(2021). All read counts less than the maximum in unused-MID tag
combinations and negative controls for each respective zero radius OTU
(zOTU) were removed prior to statistical analysis. Read counts of
non-positive control taxa detected within positive controls were
calculated as a percentage of the maximum read count for that taxon. The
greatest of these percentages was then applied as a measure of read
percentage to be removed from each taxon. This accounts for
over-represented taxa tag jumping or “leaking” into other samples
during sequencing. Optimal thresholds of 3% and 1% were applied to
ITS2 and COI data respectively. No known lab contaminants such as German
cockroach (Blatella germanica , Linnaeus) or non-target taxa that
could be identified as lab contamination were detected in either
library. Data from respective ITS2 and COI libraries were aggregated
together to form a single taxon list for each marker, and non-target
taxa such as fungi and gastrotrichs were removed. In order to
standardise the taxonomic level and create evenness for analysis, all
taxa were converted to genus-level, as some zOTUs could not be resolved
to species.
2.4 Statistical analysis
For all statistical analysis, the presence/absence of each taxonomic
unit within a sample was used instead of read count, as the latter is
not an accurate representation of abundance due to amplification biases
(Clare, Symondson, & Fenton, 2014; Yu et al., 2012). Control samples
were excluded from the analyses. All statistical analysis were carried
out in R version 3.6.3 (R Core Team, 2020) unless otherwise stated.
To
evaluate the most prevalent plant and invertebrate taxa within Hawfinch
diet, the number of samples in which a dietary zOTU occurred (frequency
of occurrence), was calculated. In order to estimate the total dietary
niche breadth, the specpool function in R’s vegan package
(Oksanen et al., 2019) was used to calculate Chao’s incidence-based
estimator of richness (Chao & Jost, 2012; Oksanen et al., 2019). This
is based upon presence/absence of dietary taxa found in sample sites,
and gives a single estimate for a collection of sampling sites (Oksanen
et al., 2019). Observed species richness divided by the Chao estimate
gave the proportion of total dietary diversity detected. Species
accumulation curves were also produced using the poolaccumfunction within the vegan package, relating the overall dietary
diversity captured to the number of faecal samples analysed. To
determine whether plant or invertebrate taxonomic richness in Hawfinch
diet was greater, a Wilcoxon matched-pairs test was undertaken to test
for a significant difference in the median species richness of plant and
invertebrate taxa detected. Only Hawfinch samples which provided
taxonomic results for both invertebrate and plant taxa were included.
Based upon ringing recapture data, no individual Hawfinch that had
tested positive for both plant and invertebrate DNA was included more
than once.
To test our hypotheses that diet varied among locations and between the
sexes we compared dietary composition between sampling regions and sexes
using multivariate generalised linear models (MGLMs) using the functionmanyglm within the package mvabund (Wang, Naumann,
Eddelbuettel, John, & Warton, 2012). Regions were broadly categorised
into: Wye Valley, Dolgellau, north Cardiff, New Forest and Norfolk, as
some regions contained multiple catching sites. Where an individual was
sampled more than once, data was used from the first capture only to
avoid pseudo replication and subsequent biases. The functionanova.manyglm in mvabund was used to test the significance
of each term within the overall model and the p.uni =
“adjusted” argument was implemented to perform tests for each species
in turn, adjusting the p -values with Holm’s step down resampling
algorithm (Westfall & Young, 1993). Parametric bootstrap (Monte Carlo)
resampling was applied to test for dietary differences, ensuring
inferences took into account correlation between variables (Wang et al.,
2012). When necessary, pairwise comparisons were performed using thepairwise.comp function of anova.manyglm . Models were
simplified using the step function based upon the lowest Akaike’s
information criterion (AIC) value. For all models, diagnostic plots were
checked to ensure model assumptions were met. Dietary differences were
visualised using non-metric multidimensional scaling analysis (nMDS) via
the function metaMDS in the vegan package (Oksanen et al.,
2019). The nMDS was performed with Jaccard dissimilarities in
two-dimensional space (k=2). Spider plots were produced using nMDS
results via ordispider and plotted through ggplot2 (Wickham,
2016). Singletons and outliers were removed to visually improve the
ordination.