Statistical analyses
All the statistical analyses presented here were performed in RStudio
(RStudio Team, 2020) for R (R Core Team, 2021). Our raw data were first
transformed using the Hellinger transformation (square root
transformation of the relative concentration in each sample) to limit
the influence of higher values and compute Euclidean distance matrix
between each sample. We performed a Principal Component Analysis (PCA)
on this distance matrix to identify variables significantly explaining
the variation in physico-chemical characteristics in our dataset using
the vegan package (version 2.6-4) (Oksanen et al., 2022) to
obtain a set of 13 variables in our final dataset : fructose, glucose,
turanose, maltose, mannose, glutamine, leucine, proline, valine, acetic
acid, lactic acid, formic acid, pyruvic acid.
The raw data of the selected compounds were standardized using the
double Wisconsin transformation. Using this approach, each value in the
“species x compounds” matrix was divided by its column maximum, and
then divided by the row total, producing values between 0 and 1 that
equalize emphasis among sample units and among species. Bray–Curtis
dissimilarities were computed on the transformed data to conduct
multivariate analyses using the vegan package. We then performed
Permutational Multivariate Analysis of Variance (PERMANOVA) to evaluate
significant differences among honey samples from stingless bees and
honey bees. These differences between groups of honey samples were
illustrated in Non-Metric Multidimensional Scaling (NMDS) plots using
the ggplot2 package (version 3.3.3) (Wickham, 2016) and a
dendrogram (hierarchical clustering) using the ggtree package
(version 3.4.1) (Yu, 2020). Indicator compounds analysis was performed
to identify the compounds associated with each taxa using theindicspecies package version (1.7.14) (Cáceres & Legendre,
2009). We also validated the classification of our honey samples using
model-based approach. Our dataset was split in two equal parts to (1)
train and (2) test a conditional random forest model that assigns honey
bee or stingless bee label to each sample using theextendedForest package (version 1.6.1) (Liaw & Wiener, 2002). We
also compared honeys from stingless bees and honeys bees in terms of
chemical diversity and dissimilarity by comparing diverse types of
indexes : Hill diversity, Functional Hill diversity, Shannon’s
diversity, Simpson diversity, Rao’s Q, Pielou’s evenness and Hill
evenness. Diversity indexes assessments were based on the chemical
structure of the 36 compounds quantified using the chemodiv package version (0.3.0) (Petrén et al., 2023).
We then investigated patterns of variation in SBH composition by testing
the differences in the composition according to the tropical regions.
Further analyses performed to estimate the importance of environmental
and phylogenetic constraints on honey composition, using the full
dataset at the global scale but also datasets at the continental scale
including honey samples from each of the tropical region. We computed
phylogenetic dissimilarities based on the patristic phylogenetic
distances between genera using the phylogeny established by Rasmussen &
Cameron, 2010 (Supplementary Fig. 1) using the adephylo package
(version 1.1-16) (Jombart & Dray, 2008). Environmental dissimilarities
between locations were computed based on 8 variables impacting nectar
production (soil moisture, precipitation, solar radiation) and
composition (Tree cover, moisture, temperature, forest density, plant
biodiversity). Solar irradiance (The World Bank Group, 2023) and water
availability (Topographic wetness; Title & Bemmels, 2018 & Annual Mean
precipitation; Fick & Hijmans, 2017) are the main limiting factors in
nectar production, influencing its concentration in sugar. Humidity
(Climatic Moisture Index; Title & Bemmels, 2018) and temperature
(Annual Temperature; Fick & Hijmans, 2017) impact evaporation rate
leading to differences nectar viscosity and concentration (Nicolson &
Thornburg, 2007). Honey produced in forest differed in its composition
(Lowore et al., 2018; Cannizzaro et al., 2022) due to the influence of
tree cover and density (Forest density; Riitters et al., 2000) on
microclimatic factors (Parachnowitsch et al., 2019). Finally, we
included plant biodiversity (native plant diversity; Ellis et al., 2012)
which impact resource availability and pollen profile of honeys
(mono-/polyfloral; de Sousa et al., 2016; Ávila et al., 2019). Data were
extracted from Raster files using annual or monthly mean value when
available.
We estimated the part of variation of chemical dissimilarity explained
by the phylogenetic and environmental dissimilarity matrix using thevarpart function in vegan . Finally, we tested correlations
and linear relationships between the Bray-Curtis dissimilarity matrix of
chemical composition and the phylogenetic/environmental dissimilarities
using partial mantel tests and distance-based RDA. Variation
partitioning and correlation analysis were performed at the global but
also at continental scale to disentangle ecological and evolutionary
drivers since phylogeny follows a geographical split.