2.3.2 The residual error assumption of the crop model
The relationship between the observations \(\mathbf{Y}\) and crop model
simulations \(\hat{\mathbf{Y}}\left(\bullet\right)\) can be presented
as,
\(Y_{t,i}={\hat{Y}}_{t}\left(\theta\right)+\varepsilon_{t}+\sigma_{t,i}\)(5)
where \(Y_{t,i}\) represents the \(i\)th replicated observation at the
field destructive sampling date \(t\), and\({\hat{Y}}_{t}\left(\theta\right)\) is the simulation given the crop
model parameter set \(\theta\) at \(t\). \(\varepsilon_{t}\) represents
the residual error \(\mathbf{\varepsilon}\) of the simulation process at\(t\) and \(\sigma_{t,i}\) is the observation error for \(Y_{t,i}\). As\(\sigma_{t,i}\) cannot be separated, especially in circumstances where
there are replications for observations (Kennedy and O’Hagan, 2001), for
simplicity, the averages of field destructive sampling data\({\overset{\overline{}}{Y}}_{t}\) were applied instead and the
estimation of \(\sigma_{t,i}\) was not included. As local information,
including weather, soil and field managements, has been recorded in
great detail and treated as a priori knowledge, the errors caused
by those forcing inputs are reduced (Hansen and Jones, 2000). Thus, the
residual error \(\mathbf{\varepsilon}\) in this study represents mainly
the structural error of the crop model. Accordingly, Eqn 5 is rewritten
as,
\({\overset{\overline{}}{Y}}_{t}={\hat{Y}}_{t}\left(\theta\right)+\varepsilon_{t}\)(6)
To stabilize \(\varepsilon_{t}\) and reduce heteroscedasticity, the
Box-Cox transformation of the measurements and simulations are
introduced at first (Box and Cox, 1964),
\(\tau\left(Y,\lambda\right)=\left\{\par
\begin{matrix}\frac{\left(Y^{\lambda}-1\right)}{\lambda}\text{\ \ \ \ \ \ }\text{if}\ \lambda\neq 0\\
\ln\left(Y\right)\text{\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ if}\ \lambda=0\\
\end{matrix}\right.\ \) (7)
where \(\lambda\) is the transformation parameter and varies in the
range from 0 to 1.
To further deal with the heteroscedasticity and nonnormality, residual
errors are proposed as according to Schoups and Vrugt (2010):
\(\Phi_{p}\varepsilon_{t}=\eta_{t}a_{t}\) with\(a_{t}\sim\text{SEP}(0,\ 1,\ \xi,\beta)\) (8)
where \(\Phi_{p}\) represents the \(p\)th order autoregressive model,\(\eta_{t}\) and \(a_{t}\) account for the heteroscedasticity and
non-Gaussian residual error at \(t\), respectively. The
heteroscedasticity is assumed to increase linearly with\(\hat{\mathbf{Y}}\) (\(\eta_{t}=\sigma_{1}{\hat{Y}}_{t}\)), where\(\sigma_{1}\) ranges from 0 to 1. \(a_{t}\) denotes a random error with
zero mean and unit standard deviation (STD), described by a standardized
skew exponential power (SEP) density with parameter \(\xi\) and\(\beta\) to account for the nonnormality. The skewness parameter\(\xi\) (\(\xi>0\)) affects the asymmetry of pdf and varies from 0.1
to 10, in which the density is symmetric when \(\xi=1\) and positively
or negatively skewed when \(\xi>1\) or \(\xi<1\), respectively. The
kurtosis parameter \(\beta\) determines the peakedness of the pdf and
varies between \(-1\) and \(+1\). In the case of a symmetric density,
the pdf is a uniform distribution when \(\beta=-1\), a Gaussian
distribution when \(\beta=0\), and a Laplace or double-exponential
distribution when \(\beta=1\). The marginal pdf for autoregressive
models with Laplace innovations is typically quite complicated and the
commonly used approximating approach is only valid for moderate to large
sample sizes (Schoups and Vrugt, 2010), not suitable for the situations
with limited field observations. Thus, in this study, the autoregressive
models were excluded and β was assumed to be zero.
Consequently, the simplified pdf of the SEP(0, 1, \(\xi\), 0) in Eqn 8
is expressed as:
\(p\left(a_{t}\middle|\xi\right)=\frac{1}{\sqrt{2\pi}}\frac{2\sigma_{\xi}}{\xi+\xi^{-1}}\exp\left\{-\frac{1}{2}{|a_{\xi,t}|}^{2}\right\}\)(9)
where\(a_{\xi,t}=\xi^{-sign(\mu_{\xi}+\sigma_{\xi}a_{t})}(\mu_{\xi}+\sigma_{\xi}a_{t})\),\(\mu_{\xi}=M_{1}\left(\xi+\xi^{-1}\right)\) and\(\sigma_{\xi}=\sqrt{\left(M_{2}-M_{1}^{2}\right)\left(\xi^{2}+\xi^{-2}\right)+2M_{1}^{2}-M_{2}}\).
For the standardized exponential power pdf, \(M_{1}\) and \(M_{2}\) can
be obtained as \(\frac{2}{\sqrt{2\pi}}\) and 1, respectively (Fernández
and Steel, 1998; Schoups and Vrugt, 2010).
The resulting expression for the log-likelihood function of Eqns 6-9 is:
\(\mathcal{l}\left(\theta,\lambda,\sigma_{1},\xi|\overset{\overline{}}{Y}\right)=-\frac{T}{2}\ln\left(2\pi\right)-\sum_{t=1}^{T}{\ln\frac{\left(\tau\left({\overset{\overline{}}{Y}}_{t},\lambda\right)-\tau\left({\hat{Y}}_{t}\left(\theta\right),\lambda\right)\right)}{\sigma_{1}{\hat{Y}}_{t}\left(\theta\right)}}+T\ln\frac{2\sigma_{\xi}}{\xi+\xi^{-1}}-\frac{1}{2}\sum_{t=1}^{T}\left|a_{\xi,t}\right|^{2}\)(10)
where \(T\) is the times of field sampling across the whole crop growing
season.