3.2.1 Evidence for material-specificity in regression models
Because the calibration slope of Anderson et al. (2021) does not statistically match the larger composite calibration of freshwater carbonates assembled here (Figure 3; Table 3), and also fits poorly with data from freshwater carbonates from certain latitudes and environments (Figure 2), we test the hypothesis that calibrations are material-specific. We derive calibrations for biogenic carbonates (bivalves and gastropods), biologically mediated carbonates (microbialites and tufas), micrites, and travertines, and test whether regression parameters differ between these groups of materials (Table 2; Figure 3).
3.2.1.1 Biogenic Carbonates
Our biogenic carbonate calibration was developed using 137 analyses from 23 samples, with Δ47 values ranging from 0.573 to 0.643‰ and independently constrained water temperatures ranging from 7 to 29°C. This dataset includes 16 new samples, alongside reprocessed data from Huntington et al. (2015) and Wang et al. (2021) that have been brought onto the I-CDES reference frame. The calibration shows a significant dependence of Δ47 on temperature (Figure 4a; p < 0.0001), and the samples agree well with our linear model (r2 = 0.7811).
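The kind of regression described above can be sketched as an ordinary least-squares fit of Δ47 against 10^6/T^2 (T in kelvin). This is a minimal illustration only: the functional form is the one conventionally used for clumped isotope calibrations and is assumed here, the function name `fit_calibration` is hypothetical, and the data values are invented placeholders rather than measurements from this study. The fitting procedure actually used for the calibrations may differ (for example, by weighting or by accounting for errors in both variables).

```python
def fit_calibration(temps_c, d47):
    """Ordinary least-squares fit of D47 = slope * 1e6 / T^2 + intercept.

    temps_c: water temperatures in degrees Celsius
    d47:     matching D47 values in per mil
    Returns (slope, intercept, r2).
    """
    # Convert to the regression variable 10^6 / T^2, with T in kelvin.
    x = [1e6 / (t + 273.15) ** 2 for t in temps_c]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(d47) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in d47)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, d47))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation coefficient
    return slope, intercept, r2


# Hypothetical sample means, NOT data from this study.
temps_c = [7.0, 12.0, 18.0, 24.0, 29.0]
d47 = [0.641, 0.628, 0.611, 0.596, 0.580]

slope, intercept, r2 = fit_calibration(temps_c, d47)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, r2 = {r2:.4f}")
```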
Our results show that biogenic carbonates record more depleted Δ47 values relative to other freshwater samples in this study (Figure 3), and the resulting calibration line has a lower intercept than the Petersen and Anderson calibrations (Figure 4a). Although the biogenic data appear visually offset from the rest of the data (Figure 3), we find no statistical difference in slope between our Δ47-T regression of biogenic carbonates and those of the other carbonate groups in this study (Table 3). However, an ANCOVA finds significant differences in intercept between biogenic carbonates and micrite (pintercept < 0.0001) and between biogenic carbonates and biologically mediated carbonates (pintercept = 0.0047) (Table 3). When comparing our results to the calibrations presented in Anderson et al. (2021), H. Li et al. (2021), Bernasconi et al. (2018), and Petersen et al. (2019), our ANCOVA shows no difference in slope between our biogenic carbonate calibration and any of these calibrations, but does show differences in intercept from the authigenic carbonate calibration presented in H. Li et al. (2021) (pintercept = 0.0215) and the ‘universal’ calibration derived by Petersen et al. (2019) (pintercept = 0.0728) (Table 3). We find no differences in either slope or intercept between biogenic carbonates and the recent I-CDES calibration of Anderson et al. (2021).
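The logic of the slope and intercept comparisons can be illustrated with a two-group ANCOVA built from nested least-squares models: separate lines, parallel lines (common slope, separate intercepts), and a single pooled line. The sketch below is an assumption about how such a test can be set up and is not necessarily the exact procedure behind Table 3. The function names and all data values are hypothetical, and only F statistics are computed; a p-value would follow from the F distribution with (1, n − 4) or (1, n − 3) degrees of freedom.

```python
def _centered_sums(x, y):
    """Return Sxx, Sxy, Syy (sums of squares about the means)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxx, sxy, syy


def ancova_f_statistics(x1, y1, x2, y2):
    """F statistics for (1) equal slopes and (2) equal intercepts given a
    common slope, for two groups passed as plain Python lists."""
    n = len(x1) + len(x2)
    sxx1, sxy1, syy1 = _centered_sums(x1, y1)
    sxx2, sxy2, syy2 = _centered_sums(x2, y2)

    # Model A: separate slope and intercept per group (4 parameters).
    sse_separate = (syy1 - sxy1 ** 2 / sxx1) + (syy2 - sxy2 ** 2 / sxx2)
    # Model B: common slope, separate intercepts (3 parameters).
    sse_parallel = (syy1 + syy2) - (sxy1 + sxy2) ** 2 / (sxx1 + sxx2)
    # Model C: one pooled line for both groups (2 parameters).
    sxx, sxy, syy = _centered_sums(x1 + x2, y1 + y2)
    sse_pooled = syy - sxy ** 2 / sxx

    f_slope = (sse_parallel - sse_separate) / (sse_separate / (n - 4))
    f_intercept = (sse_pooled - sse_parallel) / (sse_parallel / (n - 3))
    return f_slope, f_intercept


# Hypothetical groups sharing a slope but offset by 0.020 per mil in intercept.
x = [10.5, 11.0, 11.5, 12.0, 12.5, 13.0]  # 10^6 / T^2
noise_a = [0.002, -0.001, -0.002, 0.001, 0.002, -0.002]
noise_b = [-0.002, 0.002, 0.001, -0.001, -0.002, 0.002]
group_a = [0.039 * xi + 0.150 + e for xi, e in zip(x, noise_a)]
group_b = [0.039 * xi + 0.170 + e for xi, e in zip(x, noise_b)]

f_slope, f_intercept = ancova_f_statistics(x, group_a, x, group_b)
print(f"F(slope) = {f_slope:.2f}, F(intercept) = {f_intercept:.1f}")
```

In this constructed example the slope statistic is small and the intercept statistic is large, which mirrors the pattern reported here: indistinguishable slopes with distinguishable intercepts.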
The depletion in Δ47 observed in these biologic samples relative to micrites, travertines, tufas, and microbialites could stem from changes in growth rate as a function of season, or from other unidentified factors. Because the sample size required for clumped isotope analysis is relatively large, a complete shell or the majority of a shell is often needed for a single analysis. This effectively integrates the seasonal signals recorded in the shell and may lead to a more muted temperature sensitivity of the calibration than if seasonally resolved sampling could be carried out. Additionally, there is potential for a mismatch between in-situ measured temperatures, which are representative of a multi-year average, and the temperature range experienced by biologic samples, given that shell growth is limited to a much shorter timeframe. Another possibility is that kinetic isotope fractionations manifest in freshwater gastropod and bivalve shells, as has been documented in other biocarbonates such as coral skeletons (Ghosh, Adkins, et al., 2006; Saenger et al., 2012). However, more research is needed to draw this conclusion for freshwater biologic carbonates, potentially including culturing experiments at controlled temperatures and examination of other geochemical indicators such as Δ48. We note that although in-depth study of clumped isotope fractionations in aquatic freshwater gastropods is limited to this study, work with modern land snails has shown that a majority of those samples are also offset to lighter Δ47 values than the Petersen et al. (2019) calibration (Dong et al., 2021). Culture experiments on species of freshwater gastropods and bivalves may help to better constrain the origin and impact of these effects.
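The potential mismatch between an annual (or multi-year) mean water temperature and the temperature integrated by a shell can be illustrated with a toy calculation. The monthly temperatures, the growth threshold, and the assumption that growth is uniform above that threshold are all hypothetical and are chosen only to show the direction of the effect; they are not constraints from this study.

```python
def growth_season_temperature(monthly_temps_c, growth_threshold_c):
    """Mean temperature over the months warm enough for shell growth.

    Assumes (hypothetically) equal growth in every month at or above the
    threshold and no growth below it.
    """
    growing = [t for t in monthly_temps_c if t >= growth_threshold_c]
    if not growing:
        raise ValueError("no month reaches the growth threshold")
    return sum(growing) / len(growing)


# Hypothetical monthly mean water temperatures (deg C), January to December.
monthly = [4, 5, 8, 12, 17, 22, 25, 24, 20, 14, 9, 5]

annual_mean = sum(monthly) / len(monthly)
shell_mean = growth_season_temperature(monthly, growth_threshold_c=10.0)
print(f"annual mean = {annual_mean:.2f} C, growth-season mean = {shell_mean:.2f} C")
```

Under these assumptions a whole-shell analysis would reflect a temperature several degrees warmer than the annual mean against which it is regressed.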
3.2.1.2 Micrites
In this study we present 2 new micrite samples and reprocess data from 33 samples from H. Li et al. (2021) and 3 samples from Huntington et al. (2010) onto the I-CDES reference frame. The micrite dataset spans water temperatures between 9.8 and 29.0°C and Δ47 values from 0.596 to 0.682‰. Micrites show a significant temperature dependence (p < 0.0001; Figure 4b), but also considerable scatter about the regression (r2 = 0.5736).
Comparing our micrite regression parameters to those of the other carbonate groups in this study, we find no significant difference in slope, but significant differences in intercept between micrites and biogenic carbonates (pintercept < 0.0001), biologically mediated carbonates (pintercept = 0.0379), and travertines (pintercept = 0.0050). Visually, the micrite regression is positively offset relative to both the Anderson et al. (2021) and Petersen et al. (2019) calibrations (Figure 4b). In contrast to the agreement in slope, our ANCOVA finds significant differences in intercept between the micrite regression and a published travertine calibration (Bernasconi et al., 2018; pintercept = 0.0264), an authigenic carbonate calibration (H. Li et al., 2021; pintercept = 0.0014), a large calibration dataset (Petersen et al., 2019; pintercept < 0.0001), and a recently published calibration on the I-CDES scale (Anderson et al., 2021; pintercept < 0.0001).
Prior work suggests that authigenic carbonates precipitate near clumped isotope equilibrium, and that their Δ47 values are not affected by disequilibrium fractionations related to carbonate precipitation rate or water chemistry (H. Li et al., 2020). Thus, the variability in Δ47 that we observe for micrite is potentially due to uncertainty in the timing of surface carbonate precipitation events at each site. Micrite precipitation is enhanced by biological processes such as algal blooms and by temperature effects, which can peak at different times of the year, and the behavior of precipitation events varies with the characteristics of the lake (e.g. open or closed; location; stratification/ventilation) (Hren & Sheldon, 2012). Additionally, we note that the samples from UCLA were sieved through 212 μm mesh, which may pass juvenile ostracodes or fragments of mature ostracodes, and it is unclear whether the samples first published in Huntington et al. (2010) were screened for additional fossil material. However, the majority of the samples recalculated in this synthesis, which come from H. Li et al. (2021), were filtered through a 45 μm mesh and screened for ostracode valves. Ostracode valves in the sediment may bias temperature estimates derived by clumped isotope analysis, given that different factors control organism growth; thus, the inclusion of fragments of fossil material may be a source of the increased scatter we see in the Δ47-temperature dependence for micrites.
3.2.1.3 Biologically Mediated Carbonate
The calibration for biologically mediated carbonates is constructed from 255 analyses of 24 samples: 7 new samples; 13 reprocessed samples from Santi et al. (2020), Petryshyn et al. (2015), Huntington et al. (2015), Huntington et al. (2010), and Bernasconi et al. (2018) that were converted to I-CDES; and 4 samples taken from Anderson et al. (2021). Water temperatures for biologically mediated samples span 18.9°C (10.1 to 29.0°C), and Δ47 values range from 0.585 to 0.666‰. We find a significant relationship between Δ47 and temperature (p < 0.0001), along with considerable scatter in the dataset (Figure 4c; r2 = 0.5669).
Although we do not see statistically significant differences in slope between biologically mediated carbonates and other freshwater carbonate types, an ANCOVA detects differences in intercept between biologically mediated carbonates and biogenic carbonates (pintercept = 0.0047) and micrite (pintercept = 0.0379). We also find a significant difference in intercept between the biologically mediated regression and the I-CDES calibration of Anderson et al. (2021) (pintercept < 0.0001).
Overall, the biologically mediated regression yields warmer temperature predictions, particularly at higher temperatures, than the regressions for biogenic carbonates and travertines analyzed in this study and than the Anderson calibration (Table 2; Supplemental Table 3). This suggests that biologic processes may influence observed Δ47-temperature relationships, and they could also be a source of the scatter in this group (r2 = 0.5669). Similar discrepancies between tufa and synthetic samples were observed by Kato et al. (2019), who reported that temperatures predicted for tufa samples by synthetic calibrations were higher than modern environmental temperatures. However, the modern tufa data from Kato et al. (2019) are not included in this synthesis because their standard values for Carrara Marble and NBS-19 differ from those reported by Bernasconi et al. (2021) and Upadhyay et al. (2021), although we note that their calibration falls within the 95% confidence interval of our biologically mediated calibration.
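Why an intercept offset between two calibrations with a shared slope translates into a temperature difference that grows toward warmer temperatures can be seen by inverting the calibration. The sketch below assumes the conventional form Δ47 = slope × 10^6/T^2 + intercept (T in kelvin); the slope and both intercepts are illustrative placeholders of roughly the right magnitude and are not the regression parameters of this study (those are in Table 2).

```python
import math


def d47_to_temp_c(d47, slope, intercept):
    """Invert D47 = slope * 1e6 / T^2 + intercept; returns T in deg C."""
    return math.sqrt(slope * 1e6 / (d47 - intercept)) - 273.15


# Hypothetical parameters: shared slope, intercepts differing by 0.010 per mil.
SLOPE = 0.039
INTERCEPT_A = 0.154
INTERCEPT_B = 0.164


def temperature_offset(d47):
    """Temperature from calibration B minus temperature from calibration A."""
    return (d47_to_temp_c(d47, SLOPE, INTERCEPT_B)
            - d47_to_temp_c(d47, SLOPE, INTERCEPT_A))


for d47 in (0.640, 0.600):  # a cooler and a warmer sample
    t_a = d47_to_temp_c(d47, SLOPE, INTERCEPT_A)
    print(f"D47 = {d47:.3f}: T_A = {t_a:.1f} C, "
          f"offset B - A = {temperature_offset(d47):.2f} C")
```

Because temperature depends on 1/sqrt(Δ47 − intercept), the same intercept offset corresponds to a larger temperature difference at low Δ47 (warm samples) than at high Δ47 (cool samples).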
3.2.1.4 Travertines
Although we did not add new data, we derived a regression for travertine samples comprising 543 analyses from 23 samples. The travertine dataset includes 15 samples from previous publications (Bernasconi et al., 2018; Kele et al., 2015) that were recalculated onto the I-CDES reference frame following the methodology of Bernasconi et al. (2021), and 8 newly published measurements (Anderson et al., 2021), so that all could be analyzed within the same statistical framework used here. Travertine samples encompass the largest range of independently measured water temperatures (5 to 95°C) and Δ47 values (0.409 to 0.637‰). As for the other groups of carbonates considered in this study, we find a significant temperature dependence (slope; p < 0.0001) and a high degree of agreement between the fitted values and calibration data points (r2 = 0.9487). Travertines display the highest r2 of the four groups, which may arise if they have the least complex precipitation mechanism, with little biological influence relative to the other groups.
ANCOVA tests indicate that the slope of the travertine regression is not statistically different from those of the other groups of freshwater carbonates in this study, but that its intercept is statistically different from that of the micrite regression (pintercept = 0.0050; Table 3). The newly derived regression on the updated I-CDES reference frame is statistically indistinguishable from the previous travertine calibration presented in Bernasconi et al. (2018), but differs significantly in intercept from the calibration presented in Petersen et al. (2019) (pintercept = 0.0354), suggesting that applying a ‘universal’ calibration may not be appropriate. Additionally, we find no significant differences in either slope or intercept between travertines and the Anderson et al. (2021) calibration or the authigenic lacustrine carbonate calibration of H. Li et al. (2021) (Table 3).
3.2.1.5 Comparison of material-specific and composite calibrations
Overall, we observe no statistically significant difference between the calibration slopes derived from different materials and previously published calibrations (Table 3). When freshwater carbonates are divided into groups to account for differences in their precipitation (e.g. seasonality, ecology), the calibrations converge on a common temperature dependence (slope) for clumped isotope measurements. A similar convergence of slopes was found in Petersen et al. (2019) when comparing 14 different clumped isotope studies of both biogenic and abiogenic carbonates using updated parameter values for Δ47 calculation. Anderson et al. (2021) also found a convergence of slopes between their new data, the Petersen calibration, and calibration lines recalculated using updated carbonate standardization procedures for 4 recent calibration studies. However, our ANCOVA analyses also indicate statistically different intercepts for most of our calibrations from groups of freshwater carbonates (Table 3). Our findings are unchanged if we consider only samples that were analyzed at UCLA.
To evaluate goodness of fit for the two types of models presented in this study, we use root mean square error (RMSE) to quantify the differences between directly measured and Δ47-derived temperatures. Applying our composite calibration to biogenic samples results in an RMSE of 4.4°C, while applying the biogenic calibration results in an RMSE of 2.9°C, showing a better fit with the material-specific calibration. Likewise, temperatures derived from the micrite-specific calibration have a lower RMSE than those derived from the composite calibration (3.9°C and 4.6°C, respectively). In contrast, the composite calibration outperforms the material-specific calibrations for biologically mediated carbonates and travertines, yielding a lower RMSE than their material-specific counterparts (biologically mediated carbonates: 4.4°C and 5.1°C; travertines: 6.5°C and 7.1°C, respectively). Figure 5 shows the impact of the applied calibration on temperature reconstructions using both the composite and material-specific calibrations derived in this study, showing a decrease in residuals when utilizing material-specific regressions for all material types. Thus, it may be more appropriate to use material-specific calibrations when reconstructing paleotemperatures. However, we also note that applying material-specific calibrations necessitates using fewer data points (minimum n = 22) over a more limited temperature range in most cases (except for travertines), both of which could increase uncertainty in the calibration. We recommend using material-specific calibrations for samples that fall within the original observation range, given that applying them to samples from more extreme temperatures could require extrapolating the calibration.
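The RMSE comparison described above amounts to the following calculation. This is a minimal sketch: the function name `rmse` and all temperature values are hypothetical placeholders, not results from this study, and the predicted temperatures would in practice come from applying each calibration to the measured Δ47 values.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted temperatures."""
    if len(measured) != len(predicted):
        raise ValueError("inputs must have the same length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical temperatures (deg C) for four samples of one material type.
measured_t = [8.0, 15.0, 22.0, 28.0]     # independently measured water T
composite_t = [12.0, 19.5, 25.0, 33.0]   # from a composite calibration
specific_t = [9.5, 13.0, 24.0, 26.5]     # from a material-specific calibration

print(f"composite RMSE = {rmse(measured_t, composite_t):.1f} C")
print(f"material-specific RMSE = {rmse(measured_t, specific_t):.1f} C")
```

A lower RMSE for one calibration on the same samples indicates that its Δ47-derived temperatures sit closer, on average, to the independently measured temperatures.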