Identification of buffered data in time series preprocessing:
application to surface river temperature
Abstract
With the growing number of sensor technologies, the production of
numerous types of data allows finer observation of our environment.
Among these data, time series are a valuable heritage because of the time
invested in recording them and the information they contain. However, the
analysis of time series produced by a monitoring network generally
requires preprocessing steps to separate meaningful data from data
affected by sensor malfunctions or particular measurement conditions.
In this context, outliers have been well studied and several methods
exist to identify them. In this paper,
we propose a complementary method to identify buffered data. Buffered
data are characterized by a lower amplitude than the rest of the time
series and can arise naturally (from groundwater influence, for example)
or from measurement defects (a sensor covered by sediment movements).
The need to identify buffered signals arose from the use of data
drawn from several databases with different levels of qualification.
Buffered signals are not necessarily filtered out by conventional
preprocessing methods and can bias the analysis when they are unrelated
to the phenomenon under study. The identification method proposed in this study
relies on a normalized diurnal range index. It was developed on surface
river temperature time series recorded in metropolitan France to cover a
wide variety of regional climates and measurement environments. The
method highlights buffered data within a time series.
Furthermore, it can isolate occasional or regular periods of buffered
signal, whether naturally caused or not. The study then uses
preprocessed time series to analyze the distribution of regular buffered
data according to the season of occurrence and a climate typology.
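
The abstract names a normalized diurnal range index without defining it.
As a rough illustration only, the sketch below assumes one plausible form:
each day's temperature range (maximum minus minimum) divided by the median
daily range of the whole series, with days below a cutoff flagged as
buffered. The function names, the choice of the median as reference
amplitude, and the 0.5 threshold are all hypothetical and are not taken
from the study.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def daily_ranges(records):
    """Return {date: max - min} for an iterable of (datetime, temperature)."""
    by_day = defaultdict(list)
    for stamp, temp in records:
        by_day[stamp.date()].append(temp)
    return {day: max(vals) - min(vals) for day, vals in sorted(by_day.items())}


def normalized_diurnal_range(ranges):
    """Divide each daily range by the median daily range of the series.

    Assumption: the median serves as the reference amplitude; the study
    may normalize differently.
    """
    reference = median(ranges.values())
    if reference == 0:
        raise ValueError("reference amplitude is zero; index undefined")
    return {day: r / reference for day, r in ranges.items()}


def flag_buffered(index, threshold=0.5):
    """Return the days whose index falls below a hypothetical threshold."""
    return [day for day, value in index.items() if value < threshold]


# Made-up readings: two ordinary days and one day with a damped cycle.
records = [
    (datetime(2020, 7, 1, 6), 10.0), (datetime(2020, 7, 1, 15), 14.0),
    (datetime(2020, 7, 2, 6), 10.0), (datetime(2020, 7, 2, 15), 14.0),
    (datetime(2020, 7, 3, 6), 12.0), (datetime(2020, 7, 3, 15), 13.0),
]
index = normalized_diurnal_range(daily_ranges(records))
print(flag_buffered(index))  # [datetime.date(2020, 7, 3)]
```

In this toy series the third day has a range of 1 °C against a median of
4 °C, giving an index of 0.25, so it alone is flagged. A real application
would need a reference amplitude that accounts for season and climate,
which this sketch ignores.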