StocProc_KLE¶

class stocproc.StocProc_KLE(r_tau, t_max, tol=0.01, ng_fac=4, meth=’fourpoint’, diff_method=’full’, dm_random_samples=10000, seed=None, align_eig_vec=False, scale=1)[source]¶

A class to simulate stochastic processes using Karhunen-Loève expansion (KLE) method. The idea is that any stochastic process can be expressed in terms of the KLE

\[Z(t) = \sum_i \sqrt{\lambda_i} Y_i u_i(t)\]

where \(Y_i\) and independent complex valued Gaussian random variables with variance one (\(\langle Y_i Y_j \rangle = \delta_{ij}\)) and \(\lambda_i\), \(u_i(t)\) are eigenvalues / eigenfunctions of the following homogeneous Fredholm equation

\[\int_0^{t_\mathrm{max}} \mathrm{d}s R(t-s) u_i(s) = \lambda_i u_i(t)\]

for a given positive integral kernel \(R(\tau)\). It turns out that the auto correlation of the stocastic processes \(\langle Z(t)Z^\ast(s) \rangle = R(t-s)\) is given by that kernel.

For the numeric implementation the integral equation will be discretized (see stocproc.method_kle.solve_hom_fredholm() for details) which leads to a regular matrix eigenvalue problem. The accuracy of the generated process in terms of its auto correlation function depends on the quality of the eigenvalues and eigenfunction and thus of the number of discritization points. Further for a given threshold there is only a finite number of eigenvalues above that threshold, provided that the number of discritization points is large enough.

Now the property of representing the integral kernel in terms of the eigenfunction

\[R(t-s) = \sum_i \lambda_i u_i(t) u_i^\ast(s)\]

is used to find the number of discritization points and the number of used eigenfunctions such that the sum represents the kernel up to a given tolerance (see stocproc.method_kle.auto_ng() for details).

Parameters:

r_tau – the idesired auto correlation function of a single parameter tau
t_max – specifies the time interval [0, t_max] for which the processes in generated
tol – maximal deviation of the auto correlation function of the sampled processes from the given auto correlation r_tau.
ngfac – specifies the fine grid to use for the spline interpolation, the intermediate points are calculated using integral interpolation
meth – the method for calculation integration weights and times, a callable or one of the following strings ‘midpoint’ (‘midp’), ‘trapezoidal’ (‘trapz’), ‘simpson’ (‘simp’), ‘fourpoint’ (‘fp’), ‘gauss_legendre’ (‘gl’), ‘tanh_sinh’ (‘ts’)
diff_method – either ‘full’ or ‘random’, determines the points where the above success criterion is evaluated, ‘full’: full grid in between the fine grid, such that the spline interpolation error is expected to be maximal ‘random’: pick a fixed number of random times t and s within the interval [0, t_max]
dm_random_samples – the number of random times used for diff_method ‘random’
seed – if not None seed the random number generator on init of this class with seed
align_eig_vec – assures that \(re(u_i(0)) \leq 0\) and \(im(u_i(0)) = 0\) for all i

Note

To circumvent the time consuming initializing the StocProc class can be saved and loaded using the standard python pickle module. The get_key() method may be used identify the Process class by its parameters (r_tau, t_max, tol).

See also

There are various convenient functions to calculate the integration weights and times to approximate the integral over the interval [0, t_max] using ng points.

get_mid_point_weights_times(), get_trapezoidal_weights_times(), get_simpson_weights_times(), get_four_point_weights_times(), get_gauss_legendre_weights_times(), get_tanh_sinh_weights_times()

Note

It has been noticed that the performance of the various weights depends on the auto correlation function. As default one should use the ‘simpson weights’. ‘four point’, ‘gauss legendre’ and ‘tanh sinh’ might perform better for auto correlation function that decay slowly. Their advantage becomes evident for a large numbers of grid points only. So if one cares about relative differences below 1e-4 the more sophisticated weights are suitable.

stocproc.method_kle.get_mid_point_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via mid point rule.

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N
Returns:	location of the grid points, corresponding weights

Because this function is intended to provide the weights to be used in solve_hom_fredholm() it stretches the homogeneous weights over the grid points starting from 0 up to t_max, so the term min_point is somewhat miss leading. This is possible because we want to simulate stationary stochastic processes which allows \(\alpha(t_i+\Delta/2 - (s_j+\Delta/2)) = \alpha(t_i-s_j)\).

The N grid points are located at

\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]

and the corresponding weights are

\[w_i = \Delta t = \frac{t_\mathrm{max}}{N} \qquad i = 0,1, ... N - 1\]

stocproc.method_kle.get_trapezoidal_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via trapezoidal rule.

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N
Returns:	location of the grid points, corresponding weights

The N grid points are located at

\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]

and the corresponding weights are

\[w_0 = w_{N-1} = \Delta t /2 \qquad w_i = \Delta t \quad 0 < i < N - 1 \qquad \Delta t = \frac{t_\mathrm{max}}{N-1}\]

stocproc.method_kle.get_simpson_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via simpson rule.

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N (needs to be odd)
Returns:	location of the grid points, corresponding weights

The N grid points are located at

\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]

and the corresponding weights are

\[w_0 = w_{N-1} = 1/3 \Delta t \qquad w_{2i} = 2/3 \Delta t \quad 0 < i < (N - 1)/2\]

\[\qquad w_{2i+1} = 4/3 \Delta t \quad 0 \leq i < (N - 1)/2 \qquad \Delta t = \frac{t_\mathrm{max}}{N-1}\]

stocproc.method_kle.get_four_point_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via four-point Newton-Cotes rule.

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N (needs to be (4k+1) where k is an integer greater 0)
Returns:	location of the grid points, corresponding weights

The N grid points are located at

\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]

and the corresponding weights are

\[w_0 = w_{N-1} = 28/90 \Delta t\]

\[w_{4i+1} = w_{4i+3} = 128/90 \Delta t \quad 0 \leq i < (N - 1)/4\]

\[w_{4i+2} = 48/90 \Delta t \quad 0 \leq i < (N - 1)/4\]

\[w_{4i} = 56/90 \Delta t \quad 0 < i < (N - 1)/4\]

\[\Delta t = \frac{t_\mathrm{max}}{N-1}\]

stocproc.method_kle.get_gauss_legendre_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via Gauss integration by expanding the function in terms of Legendre Polynomials.

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N
Returns:	location of the grid points, corresponding weights

stocproc.method_kle.get_tanh_sinh_weights_times(t_max, num_grid_points)[source]¶

Returns the weights and grid points for numeric integration via Tanh-Sinh integration. The idea is to transform the integral over a finite interval \(x \in [-1, 1]\) via the variable transformation

\[x = \tanh(\pi/2 \sinh(t))\]

to a integral over the entire real axis \(t \in [-\infty,\infty]\) but where the new transformed integrand decay rapidly such that a simply midpoint rule performs very well.

inspired by ‘Tanh-Sinh High-Precision Quadrature - David H. Bailey’

Parameters:	t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\) num_grid_points – number of grid points N (needs to be odd)
Returns:	location of the grid points, corresponding weights

For a fixed small parameter h the location of the grid points read

\[x_i = \tanh(\pi/2 \sinh(ih)\]

with corresponding weights

\[w_i = \frac{\pi/2 \cosh(ih)}{\cosh^2(\pi/2 \sinh(ih))}\]

where i can be any integer. For a given number of grid points N, h is chosen such that \(w_{(N-1)/2} < 10^{-14}\) which implies odd N. With that particular h \(x_i\) and \(w_i\) are calculated for \(-(N-1)/2 \leq i \leq (N-1)/2\). Afterwards the \(x_i\) are linearly scaled such that \(x_{-(N-1)/2} = 0\) and \(x_{(N-1)/2} = t_\mathrm{max}\).

stocproc.method_kle.auto_ng(corr, t_max, ngfac=2, meth=<function get_mid_point_weights_times>, tol=0.001, diff_method=’full’, dm_random_samples=10000, ret_eigvals=False, relative_difference=False)[source]¶

increase the number of gridpoints until the desired accuracy is met

This function increases the number of grid points of the discrete Fredholm equation exponentially until a given accuracy is met. The accuracy is determined from the deviation of the approximated auto correlation of the Karhunen-Loève expansion from the given reference auto correlation.

\[\Delta(n) = \max_{t,s \in [0,t_\mathrm{max}]}\left( \Big | \alpha(t-s) - \sum_{i=1}^n \lambda_i u_i(t) u_i^\ast(s) \Big | \right )\]

Parameters:

corr – the auto correlation function
t_max – specifies the interval [0, t_max] for which the stochastic process can be evaluated
ngfac – specifies the fine grid to use for the spline interpolation, the intermediate points are calculated using integral interpolation
meth – the method for calculation integration weights and times, a callable or one of the following strings ‘midpoint’ (‘midp’), ‘trapezoidal’ (‘trapz’), ‘simpson’ (‘simp’), ‘fourpoint’ (‘fp’), ‘gauss_legendre’ (‘gl’), ‘tanh_sinh’ (‘ts’)
tol – defines the success criterion max(abs(corr_exact - corr_reconstr)) < tol
diff_method – either ‘full’ or ‘random’, determines the points where the above success criterion is evaluated, ‘full’: full grid in between the fine grid, such that the spline interpolation error is expected to be maximal ‘random’: pick a fixed number of random times t and s within the interval [0, t_max]
dm_random_samples – the number of random times used for diff_method ‘random’
ret_eigvals – if True, return also the eigen values
relative_difference – if True, use relative difference instead of absolute

Returns:

an array containing the necessary eigenfunctions of the Karhunen-Loève expansion for sampling the stochastic processes (shape=(num_eigen_functions, num_grid_points)

The procedure works as follows:

Solve the discrete Fredholm equation on a grid with ng points. This gives ng eigenvalues/vectors where each ng-dimensional vector approximates the continuous eigenfunction. (\(t, u_i(t) \leftrightarrow t_k, u_{ik}\) where the \(t_k\) depend on the integration weights method) For performance reasons, especially when the auto correlation function evaluates slowly it is advisable to use a method which uses equally distributed times :math;`t_k`.
Approximate the eigenfunction on a finer, equidistant grid (\(ng_\mathrm{fine} = ng_\mathrm{fac}(ng-1)+1\)) using

\[u_i(t) = \frac{1}{\lambda_i} \int_0^{t_\mathrm{max}} \mathrm{d}s \; \alpha(t-s) u_i(s) \approx \frac{1}{\lambda_i} \sum_k w_k \alpha(t-s_k) u_{ik}\]

According to the Numerical Recipes [1] this interpolation should perform better that simple spline interpolation. However it turns that this is not the case in general (e.g. for exponential auto correlation functions the spline interpolation performs better). For that reason it might be usefull to set ngfac to 1 which will skip the integral interpolation
Use the eigenfunction on the fine grid to setup a cubic spline interpolation.
Use the spline interpolation to estimate the deviation \(\Delta(n)\). When using diff_method = ‘full’ the maximization is performed over all \(t'_i, s'_j\) where \(t'_i = (t_i + t_{i+1})/2\) and \(s'_i = (s_i + s_{i+1})/2\) with \(i,j = 0, \, ...\, , ng_\mathrm{fine}-2\). It is expected that the interpolation error is maximal when beeing in between the reference points.
Now calculate the deviation \(\Delta(n)\) for sequential n starting at n=0. Stop if \(\Delta(n) < tol\). If the deviation does not drop below tol for all \(0 \leq n < ng-1\) increase ng as follows \(ng = 2*ng-1\) and start over at 1). (This update scheme for ng asured that ng is odd which is needed for the ‘simpson’ and ‘fourpoint’ integration weights)

Note

The scaling of the error of the various integration methods does not correspond to the scaling of the number of eigenfunctions to use in order to reconstruct the auto correlation function within a given tolerance. Surprisingly it turns out that in general the most trivial mid-point method performs quite well. If other methods suite bettern needs to be check in every case.

[1] Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 2007. Numerical Recipes 3rd Edition: The Art of Scientific Computing, Auflage: 3. ed. Cambridge University Press, Cambridge, UK ; New York. (pp. 990)

Parameters:	r – correlation matrix \(R(t_j-s_i)\) w – integrations weights \(w_i\) (they have to correspond to the discrete time \(t_i\))
Returns:	eigenvalues, eigenvectos (eigenvectos are stored in the normal numpy fashion, ordered in decreasing order)