Results
A genetic algorithm (GA) was used to optimize the three scaling factors
for the RDII impulse response functions (IRFs). The same method was used
to calibrate the total sewer flow simulated by the SWMM RTK method for
comparison. The efficiency of both RDII estimation methods was compared
using the modified Nash-Sutcliffe coefficient:
\(E_{j}=1-\frac{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-Q_{m}^{t}\right)^{2}}{\sum_{t=1}^{T}W_{t,j}\left(Q_{0}^{t}-\overline{Q_{0}}\right)^{2}}\) (15)
where \(Q_{0}^{t}\) is the observed discharge at time t [L³T⁻¹],
\(Q_{m}^{t}\) is the modeled discharge at time t [L³T⁻¹], and
\(\overline{Q_{0}}\) is the average of the observed discharge [L³T⁻¹].
The coefficient ranges from -∞ to 1, and E = 1 corresponds to a perfect
match between the observed and modeled discharge. \(W_{t,j}\) is the
weighting factor applied at time t, where the index j (j = 1, 2, and 3)
denotes the flow class: j = 1 is applied to low flows, j = 2 to medium
flows, and j = 3 to peak flows. In the conventional Nash-Sutcliffe
method, all three weighting factors are identical (W1 = W2 = W3). In the
modified Nash-Sutcliffe method, smaller runoff values are given less
weight and larger peaks are given more weight.
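The structure of Equation 15 can be illustrated with a minimal sketch. The function name `weighted_nse` and the short flow series are illustrative assumptions and are not data from this study; the weights are passed in per time step, so identical weights reproduce the conventional coefficient.

```python
def weighted_nse(observed, modeled, weights):
    """Weighted Nash-Sutcliffe efficiency in the form of Equation 15."""
    if not (len(observed) == len(modeled) == len(weights)):
        raise ValueError("observed, modeled, and weights must have equal length")
    mean_obs = sum(observed) / len(observed)  # average observed discharge
    # Weighted squared error between observed and modeled discharge.
    numerator = sum(w * (o - m) ** 2
                    for o, m, w in zip(observed, modeled, weights))
    # Weighted squared deviation of the observations from their mean.
    denominator = sum(w * (o - mean_obs) ** 2
                      for o, w in zip(observed, weights))
    return 1.0 - numerator / denominator


# Hypothetical discharge series (m3/s), for illustration only.
obs = [0.2, 0.3, 1.5, 0.8, 0.3]
mod = [0.2, 0.4, 1.2, 0.9, 0.3]
conventional = weighted_nse(obs, mod, [1.0] * len(obs))
```

Because the weights appear in both the numerator and the denominator, multiplying all of them by the same constant leaves the coefficient unchanged; only the relative weighting between flow classes matters.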
The calibration period was from May 9, 2009 to June 7, 2009 and the
validation period was from June 9, 2009 to July 8, 2009. The IRF method
has three parameters to calibrate: roof connection scaling factor (R),
sump pump connection scaling factor (S), and leaky lateral scaling
factor (L). The RTK method has nine parameters to calibrate: R1, R2, R3,
T1, T2, T3, K1, K2, and K3. In the RTK method, R is the ratio of I&I
discharge volume to rainfall volume: R1 represents a fast inflow
element, while R2 and R3 represent slower infiltration elements. T is
the time to peak of each hydrograph (typically expressed in hours), and
K is the ratio of the time of recession to the time to peak.
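The roles of R, T, and K can be illustrated with a minimal sketch of a single triangular unit hydrograph. It assumes the usual RTK geometry: a triangle that rises linearly to its peak at time T, recedes linearly over a duration K·T, and encloses an area equal to R. The function name and the example parameter values are illustrative and are not the calibrated values of this study.

```python
def rtk_unit_hydrograph(r, t_peak, k, dt):
    """Ordinates of one triangular RTK unit hydrograph, sampled every dt hours.

    The ordinates are per unit rainfall depth (1/h), so the area under the
    triangle equals r, the fraction of rainfall that becomes RDII.
    """
    t_base = t_peak * (1.0 + k)      # time to peak plus recession time K*T
    q_peak = 2.0 * r / t_base        # triangle height that gives area r
    n_steps = int(round(t_base / dt)) + 1
    ordinates = []
    for i in range(n_steps):
        t = i * dt
        if t <= t_peak:
            q = q_peak * t / t_peak                          # rising limb
        elif t < t_base:
            q = q_peak * (t_base - t) / (t_base - t_peak)    # falling limb
        else:
            q = 0.0
        ordinates.append(q)
    return ordinates


# Hypothetical fast-response hydrograph: R = 0.05, T = 2 h, K = 3.
uh = rtk_unit_hydrograph(r=0.05, t_peak=2.0, k=3.0, dt=0.5)
```

In the RTK method the total RDII response is the sum of three such triangles, one per (R, T, K) triplet, which is why the nine parameters can trade off against one another.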
For the GA optimization, the population size was set to 100 and the
maximum number of generations was set to 300 for both modeling
approaches. The probability of crossover was set to 0.95 for both the
IRF and RTK calibrations, and the probability of mutation was set to
0.06.
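To show how these four settings enter a GA run, the following is a minimal sketch of a real-coded GA. Only the population size, number of generations, and crossover and mutation probabilities come from the text. The operators (binary tournament selection, arithmetic crossover, uniform-reset mutation, one elite individual) and the toy objective are assumptions made for illustration; the GA implementation used in this study is not specified here.

```python
import random


def run_ga(fitness, bounds, pop_size=100, generations=300,
           p_crossover=0.95, p_mutation=0.06, seed=0):
    """Maximize `fitness` within box `bounds` using a simple real-coded GA."""
    rng = random.Random(seed)

    def evaluate(population):
        return [(fitness(ind), ind) for ind in population]

    def tournament(scored):
        # Assumed operator: the fitter of two random individuals is a parent.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    scored = evaluate([[rng.uniform(lo, hi) for lo, hi in bounds]
                       for _ in range(pop_size)])
    best_score, best = max(scored, key=lambda s: s[0])
    for _ in range(generations):
        children = [best[:]]                      # keep the best so far
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < p_crossover:
                a = rng.random()                  # arithmetic crossover
                child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            for g, (lo, hi) in enumerate(bounds):
                if rng.random() < p_mutation:     # per-gene uniform reset
                    child[g] = rng.uniform(lo, hi)
            children.append(child)
        scored = evaluate(children)
        best_score, best = max(scored, key=lambda s: s[0])
    return best, best_score


# Toy objective with a known optimum at (1, 2, 3); hypothetical, not RDII data.
target = [1.0, 2.0, 3.0]


def toy_fitness(x):
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best, score = run_ga(toy_fitness, [(0.0, 5.0)] * 3)
```

In a calibration setting, `fitness` would run the RDII model for a candidate parameter set and return its Nash-Sutcliffe coefficient, with three bounds for the IRF method or nine for the RTK method.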
The calibrated parameter solutions for the IRF and RTK methods are
presented in Table 1. The Nash-Sutcliffe model efficiency coefficient of
the IRF solution was 0.534 in the calibration period and 0.560 in the
validation period. The modified Nash-Sutcliffe coefficients for the IRF
solution were 0.892 for the calibration period and 0.866 for the
validation period when the weighting factors were set as W1 = 3 for
Q > 90th percentile, W2 = 2 for 80th < Q < 90th percentile, and W3 = 1
for Q < 80th percentile. Assigning larger weighting factors to high
flows improved the model fit significantly. The Nash-Sutcliffe
coefficient of the best RTK solution was 0.848 in the calibration period
and 0.795 in the validation period.
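The percentile-based weighting can be sketched as follows. The weights 3, 2, and 1 and the 80th and 90th percentile thresholds come from the text. The function name, the interpolation rule used for the percentiles, and the handling of flows exactly equal to a threshold (assigned to the lower class) are assumptions.

```python
import statistics


def percentile_weights(observed, high=3.0, mid=2.0, low=1.0):
    """Assign a weight to each observed flow from its percentile class."""
    # Nine decile cut points; indices 7 and 8 are the 80th and 90th percentiles.
    deciles = statistics.quantiles(observed, n=10, method="inclusive")
    p80, p90 = deciles[7], deciles[8]
    weights = []
    for q in observed:
        if q > p90:
            weights.append(high)   # peak flows
        elif q > p80:
            weights.append(mid)    # medium flows
        else:
            weights.append(low)    # low flows
    return weights


# Hypothetical flow record: the integers 1 to 101.
flows = [float(v) for v in range(1, 102)]
weights = percentile_weights(flows)
```

The resulting list has one weight per time step and can be used directly as the weighting term in Equation 15.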
Although the model fit improved with the modified Nash-Sutcliffe
method, the model efficiency of the RTK method was higher, since the RTK
method has three times as many adjustable parameters (nine instead of
three). However, from the calibration period to the validation period,
the model efficiency increased for the IRF solution (0.534 to 0.560)
while it decreased for the RTK solution (0.848 to 0.795). This may point
to a weakness of the RTK method: it is not consistent and may not be
very robust.
The optimal IRF scaling factors found by the GA are R = 3,359 for roof
connections, S = 22,653 for sump pump connections, and L = 19,985 for
leaky laterals. These values can be interpreted in terms of the RDII
volume contribution of each RDII source (Table 1). The contributing flow
volume of each RDII source is obtained by multiplying the per-unit-area
flow volume of its IRF by the corresponding IRF weighting coefficient
(Table 2). The contributing RDII volumes from the roof, sump pump, and
lateral sources are then 9,710 m³, 22,653 m³, and 32,543 m³,
respectively, which are 15%, 35%, and 50% of the total estimated RDII
flow volume. This simple calculation shows that the IRF result can be
interpreted as the RDII volume contribution of the different RDII
sources, which identifies the most problematic RDII contributor in the
system by volume. These values need to be interpreted with caution, as
the IRF model application in this study is only one realization of a
real system and each sewershed is unique in terms of the factors that
contribute to RDII. Nevertheless, the result can still provide insight
into the RDII behavior of the system by giving the solutions physical
meaning.
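The percentage split quoted above follows directly from the three reported volumes, as the short check below shows. The variable names are illustrative; the volumes are the values given in the text.

```python
# Contributing RDII volumes reported in the text (m3).
volumes_m3 = {"roof": 9710.0, "sump pump": 22653.0, "lateral": 32543.0}

total_m3 = sum(volumes_m3.values())

# Share of the total estimated RDII volume attributed to each source (%).
shares_pct = {source: 100.0 * volume / total_m3
              for source, volume in volumes_m3.items()}

for source, share in shares_pct.items():
    print(f"{source}: {share:.1f}% of {total_m3:.0f} m3")
```

Rounded to whole percentages, the shares are 15%, 35%, and 50%, so leaky laterals account for about half of the estimated RDII volume.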
The IRF approach tends to be more robust because its three parameters
scale three IRFs that represent physically based processes. Each IRF
shape is defined independently using physics-based models, and the
weighting parameters reflect the contribution from each of the three
IRFs. The IRF solution is unique no matter how randomly the initial
population is selected. In contrast, the RTK method gives a different
solution every time the model is run. As an example, 30 sets of three
RTK hydrograph solutions show widely variable results, as presented in
Figure 5. Within the user-specified range for each hydrograph, the
solution can be vastly different for each run. The Nash-Sutcliffe
coefficient was 0.848 for the best case and 0.681 for the worst case.
Depending on the user-specified ranges of each parameter, the results
can differ vastly and the performance is not guaranteed.
The RTK method has many locally optimal solutions, which indicates that
the nine coefficients are not independent. The starting points or
constraints of some parameters therefore cause the other parameters to
adjust until a local optimum is reached that performs similarly well on
the calibration data. Box plots of the nine RTK parameters from the 30
model runs are presented in Figure 6. Greater variability is observed in
the RTK parameters for the second and third triangular hydrographs,
especially the third. This is because the model adjusts these parameters
according to the constraints imposed by the earlier parameters. In
effect, different local RTK solutions can result in the same model fit:
a change in one hydrograph alters the other two hydrographs simply to
achieve the best fit. This points to a problem with the RTK method,
namely that physical processes are not reflected in the modeling.
Figure 7 shows the prediction of the monitored flow hydrograph using the
IRF solution and the best case of the RTK solutions during the
calibration period (Figure 7(a)) and the validation period (Figure
7(b)). On June 24, both methods predict a flow peak that is not observed
in the monitored flow record. The peak may have occurred over such a
short time period that the flow monitor failed to capture it. Overall,
the RTK method tends to follow the monitored hydrograph well, especially
on the falling limbs of peaks, while the IRF method tends to
underestimate the flow on the falling limbs.
The volume and the peak flow values for the estimated DWF, observed
sewer flow, IRF model result, and RTK model result are summarized in
Table 3. A flow rate of 0.3 m³/s was selected to define the beginning
and the end of each storm. The observed sewer flow, IRF results, and RTK
results are compared to the estimated DWF using the following equation.
\(\text{Compare to DWF}=\frac{\text{Observed sewer}}{\text{Estimated DWF}}\times 100\) (16)
The observed sewer flow is three to four times the DWF in volume and
three to six times the DWF in peak flow during the storms. Considering
that the monitoring location is on a sanitary-only sewer, a great deal
of RDII exists in the area.
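The storm delineation and the comparison in Equation 16 can be sketched as below. The 0.3 m³/s threshold comes from the text. The function names, the strict "greater than" test, the rectangular volume integration, the 5-minute time step, and the flow values are assumptions for illustration only.

```python
def storm_events(flows, threshold=0.3):
    """Return (start, end) index pairs of runs where flow exceeds threshold."""
    events, start = [], None
    for i, q in enumerate(flows):
        if q > threshold and start is None:
            start = i                        # storm begins
        elif q <= threshold and start is not None:
            events.append((start, i - 1))    # storm ended at previous step
            start = None
    if start is not None:                    # record ends during a storm
        events.append((start, len(flows) - 1))
    return events


def event_volume(flows, start, end, dt_seconds):
    """Flow volume (m3) of one event by rectangular integration."""
    return sum(flows[start:end + 1]) * dt_seconds


def percent_of_dwf(value, dwf_value):
    """Equation 16: a flow quantity expressed as a percentage of the DWF."""
    return value / dwf_value * 100.0


# Hypothetical 5-minute flow record (m3/s).
flows = [0.1, 0.2, 0.5, 0.9, 0.4, 0.2, 0.1, 0.35, 0.1]
events = storm_events(flows)
first_volume = event_volume(flows, *events[0], dt_seconds=300)
```

The same event windows would be applied to the estimated DWF, the IRF result, and the RTK result so that all volumes and peaks refer to the same storm durations.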
The IRF result and RTK result are compared to the observed sewer flow
using the following equation.
\(\text{Compare to observed RDII}=\frac{\text{Predicted RDII}-\text{Observed RDII}}{\text{Observed RDII}}\times 100\) (17)
Both models underestimated the flow volume: the IRF method
underestimated flow volume by 9% to 28% and the RTK method by 4% to 26%
compared to the monitored volume. In terms of flow peaks, the IRF method
overestimated the peak flow rate for the May 13, May 27, and June 11
storms by 19%, 25%, and 9%, respectively, and underestimated the peak
flow rate for the May 15 and June 16 storms by 15% and 8%, respectively.
The RTK method consistently overestimated the peak flow rate, by 1% to
16%.
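The sign convention of Equation 17 can be made explicit with a small sketch: a negative result is an underestimate and a positive result is an overestimate. The function name and the example numbers are illustrative and are not taken from Table 3.

```python
def percent_difference(predicted, observed):
    """Equation 17: relative difference of a prediction from the observation (%)."""
    return (predicted - observed) / observed * 100.0


# Hypothetical storm volumes (m3): the prediction is 9% below the observation.
volume_error = percent_difference(predicted=9100.0, observed=10000.0)

# Hypothetical peak flows (m3/s): the prediction is 19% above the observation.
peak_error = percent_difference(predicted=1.19, observed=1.00)
```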
Residual plots of the IRF and the best RTK solutions for the calibration
period and the validation period are presented in Figure 8. A residual
is the difference between the observed value of the dependent variable
and the predicted value. Each data point has one residual, defined by
the following equation.
Residual = Observed value - Predicted value (18)
Residuals are plotted against the observed value on the x axis.
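A minimal sketch of Equation 18 follows, together with one simple way to summarize the sign of the residuals at high flows. The function names, the high-flow cutoff, and the flow values are assumptions for illustration; with this sign convention a positive residual means the model underestimates the observation.

```python
def residuals(observed, predicted):
    """Equation 18: observed value minus predicted value, per data point."""
    return [o - p for o, p in zip(observed, predicted)]


def positive_share(observed, predicted, min_flow):
    """Fraction of points with observed flow >= min_flow that are underestimated."""
    selected = [o - p for o, p in zip(observed, predicted) if o >= min_flow]
    if not selected:
        return 0.0
    return sum(1 for r in selected if r > 0) / len(selected)


# Hypothetical observed and predicted flows (m3/s).
obs = [0.2, 0.3, 1.6, 1.2, 0.4]
pred = [0.25, 0.3, 1.3, 1.25, 0.35]
res = residuals(obs, pred)
share_high = positive_share(obs, pred, min_flow=1.0)
```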
There are clusters of points at low flow rates, which represent the
tails of the hydrographs. In Figure 8(a), the IRF method underestimates
the peaks, as most of the residuals are on the positive side. These
points are from the storms on May 15, 2009 and May 27, 2009. This trend
is also observed in the validation period, where the outliers are from
the storms on June 11, 2009 and June 16, 2009 (Figure 8(b)). In the
validation period, the RTK method also underestimated peaks, as most of
the high-flow points are on the positive side. This means that the best
RTK solution for the calibration period loses efficiency in the
validation period. This explains the decrease in the Nash-Sutcliffe
coefficient of the RTK method in the validation period, as presented in
Table 1, and supports the view that the RTK method is more of a
curve-fitting method with limited physical meaning.