Abstract
Saltwater disposal (SWD) has been linked to the recent increase in
earthquakes in various regions of the United States. In some cases,
strong temporal and spatial associations have provided unequivocal
evidence to the scientific community that wastewater injection is one
of the dominant causal factors in the onset of seismicity. In addition,
numerous physical models have suggested that the increase in pore
pressure from wastewater injection is capable of inducing fault slip,
providing further physical evidence. Another growing body of literature
seeks to rigorously establish causality through statistical analysis.
These studies propose statistical frameworks with parametric regression
models to evaluate whether the observed earthquakes occurred more often
than would be expected by chance, and test the statistical significance
of the observed earthquake occurrences to arrive at causal interpretations.
We propose causal inference frameworks based on the potential outcomes
perspective to define explicitly, in mathematical terms, what we mean by
a causal effect and to state the assumptions needed to ensure
consistency between models for model comparison. In particular, we
consider two common difficulties in raster-based spatial statistical
analysis. The first is spatial correlation, described by Tobler’s first
law of geography, whereby near things are more related than distant
things. The second is interference, a causal inference term for settings
in which treatments applied to some spatially indexed units affect the
outcomes at other spatially indexed units, mostly because of complex
physical processes. The study region, the Fort Worth Basin of North
Central Texas, is discretized into non-overlapping grid blocks. The first
proposed workflow adopts a cross-sectional study design on the aggregated
earthquake catalog and injection data, in which two statistical methods
are employed to test the significance of the causal effect of the
presence or absence of saltwater disposal on the number of earthquakes
and to estimate the magnitude of the average causal effect.
The second proposed workflow incorporates the temporal domain, which
is of greater scientific interest. Finally, the analysis is repeated for
different grid configurations to directly assess the sensitivity of
the statistical results to the choice of grid.
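
As a rough illustration of the gridded, cross-sectional setup described above, the following minimal sketch bins earthquakes and disposal wells into non-overlapping square blocks and contrasts mean earthquake counts between blocks with and without a well. It is not the estimator used in this work: the function names, the square study extent, and the toy coordinates are all assumptions made for illustration, and the naive difference in means ignores both spatial correlation and interference, so it should not be read as a causal estimate.

```python
from collections import Counter

def cell_index(x, y, cell_size):
    """Map a point to the (column, row) of the grid block that contains it."""
    return (int(x // cell_size), int(y // cell_size))

def difference_in_means(quakes, wells, extent, cell_size):
    """Mean earthquake count in blocks with a well minus blocks without one.

    Hypothetical helper: assumes a square region [0, extent) x [0, extent)
    and that every point lies strictly inside it.
    """
    n_cells = int(extent // cell_size)
    counts = Counter(cell_index(x, y, cell_size) for x, y in quakes)
    treated = {cell_index(x, y, cell_size) for x, y in wells}
    with_well, without_well = [], []
    for i in range(n_cells):
        for j in range(n_cells):
            group = with_well if (i, j) in treated else without_well
            group.append(counts.get((i, j), 0))
    if not with_well or not without_well:
        return None  # contrast undefined when one group is empty
    return sum(with_well) / len(with_well) - sum(without_well) / len(without_well)

# Made-up toy coordinates, for illustration only (not real catalog data).
quakes = [(0.2, 0.3), (1.5, 1.0), (3.0, 3.0)]
wells = [(0.5, 0.5)]

# Repeat the same contrast for two grid configurations.
for cell_size in (2.0, 1.0):
    estimate = difference_in_means(quakes, wells, extent=4.0, cell_size=cell_size)
    print(f"cell size {cell_size}: difference in means = {estimate:.3f}")
```

On this toy input the contrast is about 1.667 for the 2 x 2 grid and about 0.867 for the 4 x 4 grid, which shows how an estimate can shift with the discretization alone and why repeating the analysis across grid configurations is informative.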