2.4.1 Ensemble Kalman filter
In the EnKF, the observation \(\mathbf{Y}_{t}\) is related to the model-simulated state \({\hat{\mathbf{Y}}}_{t}\) by:
\(\mathbf{Y}_{t}=\mathbf{H}{\hat{\mathbf{Y}}}_{t}+\mathbf{\varepsilon}_{t}\) (12)
where \(\mathbf{H}\) is the measurement operator that maps the model state to the observation space, and \(\mathbf{\varepsilon}_{t}\) is the observation error. The ensembles of observations and of model-state simulations at time \(t\) are stored in \(\mathbf{Y}_{t}\) and \({\hat{\mathbf{Y}}}_{t}\), respectively. Both have dimension \(N_{y}\times N_{\text{ens}}\), where \(N_{y}\) is the dimension of the observed states and \(N_{\text{ens}}\) is the ensemble size.
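For illustration only, the NumPy sketch below sets up a toy linear observation model in the spirit of Eq. (12). All dimensions, variable names (n_x, n_y, n_ens, x_true), and numerical values are assumptions made for this example and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_x, n_y, n_ens = 3, 3, 50        # state dimension, observed-state dimension, ensemble size (assumed)
H = np.eye(n_y, n_x)              # measurement operator H mapping the model state to observation space

x_true = np.array([1.0, 0.9, 2.0])                                  # assumed synthetic "true" state
eps = rng.multivariate_normal(np.zeros(n_y), 0.05 * np.eye(n_y))    # observation error, analogous to epsilon_t
y_obs = H @ x_true + eps                                            # synthetic observation, analogous to Eq. (12)
```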
Each observation member \(\mathbf{Y}_{t,j}\) of the ensemble \(\mathbf{Y}_{t}\) is drawn from an \(N_{y}\)-variate Gaussian distribution with mean equal to the actual observation \(\mathbf{Y}_{t}^{y}\) and covariance equal to \(\mathbf{R}_{t}\):
\(\mathbf{Y}_{t,j}=\mathbf{Y}_{t}^{y}+\mathbf{\sigma}_{t,j}^{y}\) (13)
in which \(\mathbf{\sigma}_{t,j}^{y}\sim\mathcal{N}\left(0,\mathbf{R}_{t}\right)\).
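As a minimal sketch of Eq. (13), continuing the example above, each ensemble member receives its own Gaussian perturbation drawn from \(\mathcal{N}\left(0,\mathbf{R}_{t}\right)\); the diagonal form of R is an assumption of this example.

```python
R = 0.05 * np.eye(n_y)            # observation error covariance R_t (assumed diagonal here)

# Eq. (13): perturb the observation with noise sigma ~ N(0, R_t) for every ensemble member
sigma = rng.multivariate_normal(np.zeros(n_y), R, size=n_ens).T     # shape (n_y, n_ens)
Y_ens = y_obs[:, None] + sigma                                      # perturbed observation ensemble Y_t
```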
The model error covariance \(\mathbf{P}_{t}^{f}\) of \({\hat{\mathbf{Y}}}_{t}\) is calculated as:
\(\mathbf{P}_{t}^{f}=\left(N_{\text{ens}}-1\right)^{-1}\sum_{j=1}^{N_{\text{ens}}}{\left({\hat{\mathbf{Y}}}_{t,j}-{\overline{\mathbf{Y}}}_{t}^{f}\right)\left({\hat{\mathbf{Y}}}_{t,j}-{\overline{\mathbf{Y}}}_{t}^{f}\right)^{T}}\) (14)
in which \({\hat{\mathbf{Y}}}_{t,j}\) is a single simulation trajectory, \({\overline{\mathbf{Y}}}_{t}^{f}=N_{\text{ens}}^{-1}\sum_{j=1}^{N_{\text{ens}}}{\hat{\mathbf{Y}}}_{t,j}\) is the mean of the ensemble \({\hat{\mathbf{Y}}}_{t}\), and the superscript \(T\) denotes the matrix transpose.
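Eq. (14) amounts to the sample covariance of the simulated ensemble. Continuing the sketch, a placeholder forecast ensemble stands in for the \(N_{\text{ens}}\) model runs:

```python
# Placeholder simulated state ensemble \hat{Y}_t of shape (n_x, n_ens); in practice this
# comes from running the model N_ens times (the spread of 0.3 is an arbitrary assumption)
Y_hat_ens = x_true[:, None] + 0.3 * rng.normal(size=(n_x, n_ens))

Y_f_mean = Y_hat_ens.mean(axis=1, keepdims=True)    # ensemble mean \bar{Y}_t^f
A = Y_hat_ens - Y_f_mean                            # deviations from the ensemble mean
P_f = A @ A.T / (n_ens - 1)                         # Eq. (14): model error covariance P_t^f
```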
Under linear assumptions, the updated analysis state \(\mathbf{Y}_{t}^{a}\) and its error covariance \(\mathbf{P}_{t}^{a}\) are calculated as:
\(\mathbf{Y}_{t}^{a}={\hat{\mathbf{Y}}}_{t}+\mathbf{K}\left(\mathbf{Y}_{t}-\mathbf{H}{\hat{\mathbf{Y}}}_{t}\right)\) (15)
\(\mathbf{P}_{t}^{a}=\left(\mathbf{I}-\mathbf{K}\mathbf{H}\right)\mathbf{P}_{t}^{f}\) (16)
in which \(\mathbf{I}\) is the identity matrix and \(\mathbf{K}\) is the Kalman gain, defined as:
\(\mathbf{K}=\mathbf{P}_{t}^{f}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}_{t}^{f}\mathbf{H}^{T}+\mathbf{R}_{t}\right)^{-1}\) (17)
where \(\mathbf{P}_{t}^{f}\mathbf{H}^{T}=\left(N_{\text{ens}}-1\right)^{-1}\sum_{j=1}^{N_{\text{ens}}}{\left({\hat{\mathbf{Y}}}_{t,j}-{\overline{\mathbf{Y}}}_{t}^{f}\right)\left(\mathbf{H}{\hat{\mathbf{Y}}}_{t,j}-\mathbf{H}{\overline{\mathbf{Y}}}_{t}^{f}\right)^{T}}\).
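The analysis step of Eqs. (15)-(17) then follows directly. The sketch below completes the illustrative example, building the gain from the ensemble estimates of \(\mathbf{P}_{t}^{f}\mathbf{H}^{T}\) and \(\mathbf{H}\mathbf{P}_{t}^{f}\mathbf{H}^{T}\):

```python
# Eq. (17): Kalman gain from the ensemble estimates of P_t^f H^T and H P_t^f H^T
PHt = A @ (H @ A).T / (n_ens - 1)                   # P_t^f H^T
S = H @ PHt + R                                     # H P_t^f H^T + R_t
K = PHt @ np.linalg.inv(S)                          # Kalman gain K

# Eq. (15): each ensemble member is updated with its own perturbed observation
Y_a_ens = Y_hat_ens + K @ (Y_ens - H @ Y_hat_ens)   # analysis ensemble Y_t^a

# Eq. (16): analysis error covariance (equivalently, the spread of Y_a_ens measures it)
P_a = (np.eye(n_x) - K @ H) @ P_f
```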
In summary, given an ensemble of model trajectories, the EnKF approximates the probability density of the model states at each time step \(t\). The mean of the updated ensemble represents the “best” state estimate, whereas the spread of the updated ensemble members provides a measure of the output uncertainty (Evensen, 1994).