2.4 Data analysis
Bivariate analyses based on ordinary least-squares linear regression (OLR) and quadratic regression (QR) were used to quantify how hypocotyl trait values varied along latitudinal gradients and with biotic/abiotic factors. Because most hypocotyl traits were highly correlated (r > 0.36, P < 0.001), we performed a principal component analysis (PCA) on the multiple traits using the ‘princomp’ function in R 4.1.3 (R Core Team 2022), and used the first two PC axes to represent the hypocotyl traits.
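As an illustration of the linear versus quadratic comparison, the following minimal Python sketch fits both forms by least squares and reports the R² of each. It is not the R code used in the analysis; the helper names (`solve`, `polyfit`) and the data values are hypothetical and chosen only to show a purely curved relationship.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def polyfit(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns (coefficients with the constant term first, R^2).
    """
    k = degree + 1
    xtx = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * xi ** p for p, c in enumerate(coef)) for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical data: centred latitude and a trait that follows a pure parabola.
latitude_centred = [-2.0, -1.0, 0.0, 1.0, 2.0]
trait = [4.0, 1.0, 0.0, 1.0, 4.0]

linear_coef, linear_r2 = polyfit(latitude_centred, trait, 1)
quadratic_coef, quadratic_r2 = polyfit(latitude_centred, trait, 2)
print("linear R2:", linear_r2, "quadratic R2:", quadratic_r2)
```

In this made-up case the linear fit explains none of the variation while the quadratic fit explains all of it, which is the kind of pattern that motivates considering both forms.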
To evaluate how environmental factors, maternal plants or inherent factors explained variation in hypocotyl traits, we used a nested analysis of variance (ANOVA) coupled with variance partitioning techniques (Martin et al., 2017). We fitted linear mixed models (LMMs) for PC1 and RTD using the ‘lme’ function in the ‘nlme’ R package (Pinheiro, Bates, & R Core Team, 2022). In each model, all nested levels (i.e. site > genealogy > within [individual]) were entered as sequential random effects and the intercept was the only estimated fixed effect. We then used the ‘varcomp’ function in the ‘ape’ R package (Paradis et al., 2004) to calculate the variance components associated with each nested level.
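One way to write down the intercept-only nested model described above is sketched below. The notation is ours and is an assumption for illustration: $i$ indexes sites, $j$ genealogies within a site, $k$ individuals within a genealogy, and the within-individual level is treated as the residual.

```latex
\begin{align}
  y_{ijk} &= \mu + a_i + b_{ij} + \varepsilon_{ijk}, \\
  a_i &\sim \mathcal{N}(0, \sigma^2_{\text{site}}), \quad
  b_{ij} \sim \mathcal{N}(0, \sigma^2_{\text{genealogy}}), \quad
  \varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2_{\text{within}}), \\
  p_{\ell} &= \frac{\sigma^2_{\ell}}
                 {\sigma^2_{\text{site}} + \sigma^2_{\text{genealogy}} + \sigma^2_{\text{within}}},
  \qquad \ell \in \{\text{site}, \text{genealogy}, \text{within}\}.
\end{align}
```

Here $\mu$ is the overall intercept and $p_{\ell}$ is the proportion of total variance attributed to level $\ell$, so the three proportions sum to one.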
To quantify how hypocotyl traits were affected by climatic factors, oceanic factors, and maternal performance, we implemented LMMs using the ‘lme’ function in the R package ‘nlme’. The fixed-effect terms included the climatic, oceanic and maternal variables. To account for additional variation potentially caused by unmeasured site-specific effects (e.g., other environmental factors) and by maternal effects not captured by aboveground biomass, we treated sampling site and tree genealogy as random factors. All variables were standardised before modelling, such that each variable had a mean of zero and a standard deviation of one. To reduce the adverse influence of multicollinearity, we removed multicollinear variables until the variance inflation factors (VIFs) of all variables in the model were less than three (Ouyang et al., 2019). We calculated the VIFs using the R package ‘car’ (Fox & Monette, 2019). Both first-order (linear) and quadratic mixed models were considered, and only the better-fitting model is shown (based on the Akaike information criterion). The pseudo-R² was calculated using the function ‘r.squaredGLMM’ in the R package ‘MuMIn’ (Bartoń, 2022), to represent the variance explained by the fixed effects in the LMM. The effect sizes of fixed factors were measured by the regression coefficients in the LMM.
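The standardisation and the VIF screen can be illustrated with a short Python sketch. It is not the ‘car’ implementation: in general the VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on all the others, and the sketch covers only the special case of two predictors, where R² reduces to the squared Pearson correlation. The variable names and values are hypothetical.

```python
import math


def standardise(values):
    """Return z-scores with mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation computed from the z-scores of x and y."""
    zx, zy = standardise(x), standardise(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictor values for five sites.
temperature = [1.0, 2.0, 3.0, 4.0, 5.0]
salinity = [2.0, 1.0, 4.0, 3.0, 5.0]

z_temperature = standardise(temperature)
vif = vif_two_predictors(temperature, salinity)
keep_both = vif < 3  # threshold stated in the text
print("VIF:", vif, "keep both predictors:", keep_both)
```

With more than two predictors, each VIF would require a full regression of one predictor on the rest, which is what dedicated routines compute.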
Structural equation modelling (SEM) was used to disentangle direct and indirect effects of all predictive factors on hypocotyl traits. After standardising all variables, multicollinear variables were removed based on VIF. We first considered a full model that included all variables and all reasonable pathways. Non-significant pathways were then sequentially removed, unless the pathways were biologically informative. The removal and addition of pathways was repeated until both P ≥ 0.05 for the χ² test (that is, no significant difference between model predictions and the observed data) and a root mean square error of approximation (RMSEA) < 0.08 were reached (Wu et al., 2022). The SEM was performed using the ‘lavaan’ R package (Rosseel, 2012).
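For reference, the root mean square error of approximation can be computed from the model χ² statistic, its degrees of freedom and the sample size. The Python sketch below uses one common definition with N − 1 in the denominator; some implementations use N instead, so values may differ slightly from software output. The function name and the input values are hypothetical.

```python
import math


def rmsea(chi_square, df, n_obs):
    """Root mean square error of approximation.

    Uses sqrt(max(chi_square - df, 0) / (df * (n_obs - 1))); this is an
    assumed convention, and some software divides by n_obs instead.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n_obs - 1)))


# Hypothetical fit statistics for a candidate path model.
example_rmsea = rmsea(chi_square=12.0, df=8, n_obs=201)
passes_rmsea_criterion = example_rmsea < 0.08  # threshold stated in the text
print("RMSEA:", example_rmsea, "below 0.08:", passes_rmsea_criterion)
```

When the χ² statistic is smaller than its degrees of freedom, the index is truncated at zero.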