Stochastically generated streamflow time series are used for various water management and hazard estimation applications. They provide realizations of plausible but as yet unobserved streamflow time series with the same temporal and distributional characteristics as the observed data. However, the representation of non-stationarities and spatial dependence among sites remains a challenge in stochastic modeling. We investigate whether the use of frequency-domain instead of time-domain models allows for the joint simulation of realistic, continuous streamflow time series at daily resolution and spatial extremes at multiple sites. To do so, we propose the stochastic simulation approach called Phase Randomization Simulation using wavelets (

Stochastic simulation of continuous streamflow time series using an empirical, wavelet-based, spatio-temporal model in combination with the parametric kappa distribution.

Generation of stochastic time series at multiple sites showing temporal short- and long-range dependence, non-stationarities, and spatial dependence in extreme events.

Implementation of

Stochastic models are used to generate long time series or large event sets that cover the full variability of a phenomenon. In hydrology, we use stochastically generated time series or event sets to refine water management plans, to better estimate potential reservoir inflows, or to develop suitable adaptation strategies for droughts and floods. If the focus is on such extreme events, event-based instead of continuous simulation approaches are often employed

A variety of continuous stochastic modeling approaches exist (corresponding to discrete-time models in the stochastic literature), which differ in their ability to represent distributional and/or temporal characteristics in the data. Here, we focus on direct modeling approaches, which simulate streamflow with a stochastic model, as opposed to indirect approaches, which use a hydrological model to transform stochastically generated precipitation into streamflow. The most commonly used stochastic simulation approaches belong to the two classes of parametric and nonparametric models. Parametric models include autoregressive moving average (ARMA) models and their modifications

In contrast to most time-domain models, frequency-domain models allow for the simulation of surrogate data with the same Fourier spectra as the raw data
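The idea behind Fourier-based surrogates can be illustrated with a minimal Python sketch. It is not the implementation used in this study: it keeps the Fourier amplitudes of a series, replaces the phases with uniform random draws, and inverts the transform. The function names (`dft`, `idft`, `phase_randomize`) and the naive O(n²) transform are assumptions chosen to keep the example self-contained.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft(); returns complex values."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def phase_randomize(x, rng):
    """Return a surrogate series with the Fourier amplitudes of x and random phases."""
    n = len(x)
    X = dft(x)
    Y = list(X)  # component 0 (mean) and, for even n, the Nyquist term stay unchanged
    for k in range(1, (n - 1) // 2 + 1):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        Y[k] = abs(X[k]) * cmath.exp(1j * phi)
        Y[n - k] = Y[k].conjugate()  # Hermitian symmetry keeps the output real
    return [v.real for v in idft(Y)]
```

A surrogate of a series `x` would be obtained with `phase_randomize(x, random.Random(1))`; its amplitude spectrum matches that of `x`, while its temporal arrangement differs.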

In contrast to the Fourier transform, the wavelet transform allows for the representation of non-stationarities in time series

An alternative to approaches that modify only certain signal components is to randomize the wavelet coefficients of all components. Such approaches typically perform a discrete wavelet decomposition using real (as opposed to complex) wavelet functions, randomize the real-valued wavelet coefficients (i.e., amplitudes), and then invert the transform to produce a new realization of a time series

We investigate whether such a wavelet-based phase randomization approach allows for a realistic representation of spatial dependence in both continuous streamflow time series and spatial extremes. To do so, we propose a continuous wavelet-based approach for the stochastic generation of streamflow time series, hereafter referred to as

We implement

Wavelet decomposition transforms a one-dimensional time series to a two-dimensional time–frequency space

The wavelet function used for the transform should reflect the features present in the time series. Because of its smooth features, the Morlet wavelet has often been used in hydrological applications
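As an illustration of how a complex wavelet yields both an amplitude and a phase at each time step, the following Python sketch evaluates a continuous wavelet transform at a single scale by direct summation. It is a simplified sketch under stated assumptions: the normalization, the default `omega0 = 6.0`, and the function names `morlet` and `cwt_at_scale` are illustrative choices and not taken from the implementation used here.

```python
import cmath
import math


def morlet(eta, omega0=6.0):
    """Complex Morlet wavelet: a plane wave modulated by a Gaussian envelope."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * eta) * math.exp(-0.5 * eta * eta)


def cwt_at_scale(x, scale, dt=1.0):
    """Wavelet coefficients of x at one scale, one complex value per time step."""
    n = len(x)
    norm = math.sqrt(dt / scale)
    coefficients = []
    for b in range(n):  # b is the time index at which the wavelet is centred
        acc = 0j
        for t in range(n):
            acc += x[t] * morlet((t - b) * dt / scale).conjugate()
        coefficients.append(norm * acc)
    return coefficients


# Amplitude and phase at one time step follow from the complex coefficient.
series = [math.sin(2 * math.pi * t / 16) for t in range(128)]
w = cwt_at_scale(series, scale=15.5)
amplitude, phase = abs(w[64]), cmath.phase(w[64])
```

For a sinusoid with a period of 16 time steps, the amplitude is large at a scale close to that period and close to zero at much smaller scales, which is how the transform localizes periodic components.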

We develop and apply the stochastic simulation approach

The 671 catchments in the United States cover a wide range of discharge regimes minimally influenced by human activity

For illustration and validation purposes, we select three regions which are distinct in terms of their hydrological regimes and their flood behavior. Flood similarity regions were determined by

Location of 671 stations in the dataset and of four catchments chosen per example region: (1) Pacific Northwest (red; 590, 608, 661, and 668), (2) Texas (light green; 431, 451, 464, and 474), and (3) Mid-Atlantic (purple; 43, 104, 117, and 249).

The stochastic simulation procedure

The kappa distribution was found to be suitable for fitting observed streamflow data in US catchments
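For readers unfamiliar with the four-parameter kappa distribution, its quantile function in Hosking's parameterization is x(F) = ξ + (α/k)[1 − ((1 − F^h)/h)^k], with location ξ, scale α, and shape parameters k and h. The Python sketch below evaluates this function and uses it for inverse-transform sampling. It is illustrative only: the function names are hypothetical, the limiting cases k = 0 and h = 0 are not handled, and parameter fitting is not shown.

```python
import random


def kappa_quantile(f, xi, alpha, k, h):
    """Quantile x(F) of the four-parameter kappa distribution for k != 0 and h != 0."""
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    if k == 0 or h == 0:
        raise ValueError("limiting cases k = 0 or h = 0 are not handled in this sketch")
    return xi + (alpha / k) * (1.0 - ((1.0 - f ** h) / h) ** k)


def sample_kappa(n, xi, alpha, k, h, rng):
    """Draw n values by inverse-transform sampling from the kappa distribution."""
    eps = 1e-12  # keep the uniform draws strictly inside (0, 1)
    return [kappa_quantile(rng.uniform(eps, 1.0 - eps), xi, alpha, k, h)
            for _ in range(n)]
```

With k = 1 and h = 1 the quantile function reduces to ξ + αF, i.e., a uniform distribution on [ξ, ξ + α], which provides a simple check of the formula.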

Illustration of stochastic simulation approach

Step 1 is run only once per iteration to maintain spatial dependencies in the data, while Steps 2–5 are run for each station separately.
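Why a single random draw shared by all stations maintains spatial dependence can be illustrated with a Fourier-domain toy analogue. The Python sketch below is not the wavelet procedure of this study, and it assumes that Step 1 is the draw of the random phases: the same phase shifts are added to the Fourier phases of every station, which leaves the cross-spectrum, and hence the lag-zero correlation between stations, unchanged. All function names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short series."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft(); returns complex values."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def draw_phases(n, rng):
    """One set of random phase shifts, drawn once and reused for all stations."""
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]


def shift_phases(x, phi):
    """Add the phase shifts phi[k] to the Fourier phases of x, keeping amplitudes."""
    n = len(x)
    X = dft(x)
    Y = list(X)  # mean and, for even n, the Nyquist component stay unchanged
    for k in range(1, (n - 1) // 2 + 1):
        Y[k] = X[k] * cmath.exp(1j * phi[k])
        Y[n - k] = Y[k].conjugate()  # Hermitian symmetry keeps the output real
    return [v.real for v in idft(Y)]


def correlation(a, b):
    """Pearson correlation between two series of equal length."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)
```

Applying `shift_phases` to two stations with the same `phi` reproduces their observed correlation, whereas drawing a separate `phi` per station generally does not.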

We run the stochastic simulation algorithm for the 671 catchments in the dataset

The evaluation at individual sites encompasses a comparison of observed and simulated distributional and temporal discharge characteristics. The distributional characteristics considered are the mean annual hydrograph showing variation of flow with season, 3 years of daily data illustrating the overall behavior of the series, the seasonal distributions (winter: December–February, spring: March–May, summer: June–August, fall: September–November), and monthly mean, maximum, and minimum values. The temporal characteristics considered include the autocorrelation (acf) and partial autocorrelation (pacf) functions measuring the strength of temporal dependence for different time lags, the power spectrum indicating how power varies with frequency and showing high values at those frequencies that correspond to strong periodic components
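As a minimal illustration of one of the temporal characteristics, the sample autocorrelation function can be computed as in the Python sketch below. The estimator, which normalizes by the lag-zero sum of squares, and the function name `acf` are assumptions for illustration and not necessarily the routine used for the figures.

```python
def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (lag 0 equals 1)."""
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be between 0 and len(x) - 1")
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + lag] for t in range(n - lag)) / c0
            for lag in range(max_lag + 1)]
```

Comparing `acf(observed, 30)` with `acf(simulated, 30)` lag by lag indicates whether the strength of temporal dependence is reproduced, where `observed` and `simulated` stand for two daily discharge series.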

The evaluation at multiple sites comprises both an assessment of how the general spatial dependence structure in the data is reproduced and an assessment of how the spatial dependence in high extremes is captured. The assessment of the general dependence structure encompasses a comparison of observed and simulated discharge time series for the catchments in the three example regions, a comparison of pairwise observed and simulated cross-correlations for the example stations in the Pacific Northwest region, and a comparison of variograms of the observed and simulated series across all stations

To assess how spatial dependencies in extremes are reproduced, we first compare observed and simulated times of occurrences of flood events for the catchments in the three example regions. We then compare observed to simulated F-madograms for flood events across all stations. The F-madogram is a measure of spatial dependence taking values between 0 and 1 that compares the ordering of extreme events between two time series of extreme events
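A rank-based empirical estimate of the F-madogram for one pair of stations can be sketched as follows. This Python example is a simplified sketch: it assumes the common definition as half the mean absolute difference between the empirical distribution function values of the two series, estimates these values as rank/(n + 1), and assumes no ties. The function names are hypothetical.

```python
def empirical_cdf_values(z):
    """Empirical non-exceedance probabilities rank / (n + 1), assuming no ties."""
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    f = [0.0] * n
    for rank, i in enumerate(order, start=1):
        f[i] = rank / (n + 1)
    return f


def f_madogram(z1, z2):
    """Half the mean absolute difference between the two empirical CDF series."""
    if len(z1) != len(z2):
        raise ValueError("both series must have the same length")
    f1 = empirical_cdf_values(z1)
    f2 = empirical_cdf_values(z2)
    return 0.5 * sum(abs(a - b) for a, b in zip(f1, f2)) / len(z1)
```

Two series of extremes with identical ordering give a value of 0, and the value increases as their orderings diverge.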

Both the distributional and temporal dependence characteristics of the time series at individual sites are well modeled, as shown by comparing observed and stochastically simulated time series for the two stations on the Nehalem and Navidad rivers (Figs.

Comparison of observed (black) and simulated (orange) distributional discharge characteristics for (i) the station Nehalem River near Foss, OR (USGS 14301000, id 661) in the Pacific Northwest and (ii) the station Navidad River near Hallettsville, TX (USGS 08164300, id 464):

Comparison of observed (black) and simulated (orange) temporal discharge characteristics for (i) the station Nehalem River near Foss, OR (USGS 14301000, id 661) in the Pacific Northwest and (ii) the station Navidad River near Hallettsville, TX (USGS 08164300, id 464):

Both high and low extremes are realistically modeled as illustrated by the boxplots depicting the distributions of the above and below threshold events of the four catchments in the Pacific Northwest (Fig.

Comparison of observed (grey)

Comparison of 3 years of multi-site observations

The stochastic simulation approach

Comparison of observed (black) and simulated (orange) cross-correlation functions (ccfs) for the daily discharge values for pairs of stations in the set of four catchments (i–iv) in the Pacific Northwest.

Comparison of observed variogram (black) with 100 variograms derived from the 100 simulation runs (orange).

Spatial dependencies are maintained not only for the bulk of the distribution, by which we mean the part of the distribution excluding extremes or outliers, but also for extreme values as illustrated by the peak-over-threshold (POT) values for the different stations in the three illustrated regions (Fig.

Observed (

The F-madograms shown in Fig.

Observed (black) vs. simulated (orange) F-madograms, a measure of the strength of spatial dependence, plotted against Euclidean distance. The lower the value, the higher the dependence between a pair of stations.

Observed vs. simulated tail dependence coefficient

Similar to the Fourier transform based simulation approach

The application of the approach is not limited to observed streamflow time series. It can also be applied to modeled time series and, if combined with a suitable distribution, to other variables such as precipitation. The use of streamflow time series generated with a hydrological model extends the application of

Our results show that the continuous, wavelet-based stochastic simulation approach

Comparison of observed (black) and simulated (orange) cross-correlation functions (ccfs) for the daily discharge values for pairs of stations in the set of four catchments (i–iv) in the Pacific Northwest. Twenty simulations were generated for each site individually, neglecting spatial dependence.

Comparison of observed (black) and simulated (orange) variograms for 20 simulation runs where the phases were randomized for each station individually.

The wavelet-based stochastic simulation procedure for multiple sites,

MIB developed, set up, evaluated, and implemented the stochastic simulation approach in R package

The authors declare that they have no conflict of interest.

We thank Balaji Rajagopalan and Christopher Torrence for valuable discussions, which helped to shape the simulation approach. We also thank Sandhya Patidar and an anonymous reviewer for their valuable comments.

This work was supported by the Swiss National Science Foundation via a PostDoc.Mobility grant (grant no. P400P2_183844, granted to Manuela I. Brunner). Support for Eric Gilleland was provided by the Regional Climate Uncertainty Program (RCUP), an NSF-supported program at NCAR.

This paper was edited by Patricia Saco and reviewed by Sandhya Patidar and one anonymous referee.