StocProc_KLE¶
-
class
stocproc.
StocProc_KLE
(r_tau, t_max, tol=0.01, ng_fac=4, meth=’fourpoint’, diff_method=’full’, dm_random_samples=10000, seed=None, align_eig_vec=False, scale=1)[source]¶ A class to simulate stochastic processes using Karhunen-Loève expansion (KLE) method. The idea is that any stochastic process can be expressed in terms of the KLE
\[Z(t) = \sum_i \sqrt{\lambda_i} Y_i u_i(t)\]where \(Y_i\) and independent complex valued Gaussian random variables with variance one (\(\langle Y_i Y_j \rangle = \delta_{ij}\)) and \(\lambda_i\), \(u_i(t)\) are eigenvalues / eigenfunctions of the following homogeneous Fredholm equation
\[\int_0^{t_\mathrm{max}} \mathrm{d}s R(t-s) u_i(s) = \lambda_i u_i(t)\]for a given positive integral kernel \(R(\tau)\). It turns out that the auto correlation of the stocastic processes \(\langle Z(t)Z^\ast(s) \rangle = R(t-s)\) is given by that kernel.
For the numeric implementation the integral equation will be discretized (see
stocproc.method_kle.solve_hom_fredholm()
for details) which leads to a regular matrix eigenvalue problem. The accuracy of the generated process in terms of its auto correlation function depends on the quality of the eigenvalues and eigenfunction and thus of the number of discritization points. Further for a given threshold there is only a finite number of eigenvalues above that threshold, provided that the number of discritization points is large enough.Now the property of representing the integral kernel in terms of the eigenfunction
\[R(t-s) = \sum_i \lambda_i u_i(t) u_i^\ast(s)\]is used to find the number of discritization points and the number of used eigenfunctions such that the sum represents the kernel up to a given tolerance (see
stocproc.method_kle.auto_ng()
for details).Parameters: - r_tau – the idesired auto correlation function of a single parameter tau
- t_max – specifies the time interval [0, t_max] for which the processes in generated
- tol – maximal deviation of the auto correlation function of the sampled processes from the given auto correlation r_tau.
- ngfac – specifies the fine grid to use for the spline interpolation, the intermediate points are calculated using integral interpolation
- meth – the method for calculation integration weights and times, a callable or one of the following strings ‘midpoint’ (‘midp’), ‘trapezoidal’ (‘trapz’), ‘simpson’ (‘simp’), ‘fourpoint’ (‘fp’), ‘gauss_legendre’ (‘gl’), ‘tanh_sinh’ (‘ts’)
- diff_method – either ‘full’ or ‘random’, determines the points where the above success criterion is evaluated, ‘full’: full grid in between the fine grid, such that the spline interpolation error is expected to be maximal ‘random’: pick a fixed number of random times t and s within the interval [0, t_max]
- dm_random_samples – the number of random times used for diff_method ‘random’
- seed – if not None seed the random number generator on init of this class with seed
- align_eig_vec – assures that \(re(u_i(0)) \leq 0\) and \(im(u_i(0)) = 0\) for all i
Note
To circumvent the time consuming initializing the StocProc class can be saved and loaded using the standard python pickle module. The
get_key()
method may be used identify the Process class by its parameters (r_tau, t_max, tol).See also
Details on how to solve the homogeneous Fredholm equation:
stocproc.method_kle.solve_hom_fredholm()
Details on the error estimation and further clarification of the parameters ng_fac, meth, diff_method, dm_random_samples can be found at
stocproc.method_kle.auto_ng()
.-
__call__
(t=None)¶ evaluates the stochastic process via spline interpolation of the discrete process \(z_k\)
Parameters: t – time to evaluate the stochastic process at, float of array of floats, if t is None return the discrete process \(z_k\) which corresponds to the times \(t_k\) given by the integration weights method Returns: a single complex value or a complex array of the shape of t that holds the values of stochastic process at times t
-
get_num_y
()[source]¶ The number of independent random variables Y is given by the number of used eigenfunction to approximate the auto correlation kernel.
-
get_time
()¶ Returns the time \(t_k\) corresponding to the values \(z_k\)
These times are determined by the integration weights method.
-
get_z
()¶ Returns the discrete process \(z_k\).
-
new_process
(y=None, seed=None)¶ generate a new process by evaluating
calc_z()
with new random variables \(Y_i\)Parameters: - y – independent normal distributed complex valued random variables with \(\sigma_{ij}^2 = \langle y_i y_j^\ast \rangle = 2 \delta_{ij}\)
- seed – if not None set seed to seed before generating samples
When y is given use these random numbers as input for
calc_z()
otherwise generate a new set of random numbers.
-
set_scale
(scale)¶
-
stocproc.method_kle.
solve_hom_fredholm
(r, w)[source]¶ Solves the discrete homogeneous Fredholm equation of the second kind
\[\int_0^{t_\mathrm{max}} \mathrm{d}s R(t-s) u(s) = \lambda u(t)\]Quadrature approximation of the integral gives a discrete representation which leads to a regular eigenvalue problem.
\[\sum_i w_i R(t_j-s_i) u(s_i) = \lambda u(t_j) \equiv \mathrm{diag(w_i)} \cdot R \cdot u = \lambda u \]Note: If \(t_i = s_i \forall i\) the matrix \(R(t_j-s_i)\) is a hermitian matrix. In order to preserve hermiticity for arbitrary \(w_i\) one defines the diagonal matrix \(D = \mathrm{diag(\sqrt{w_i})}\) with leads to the equivalent expression:
\[D \cdot R \cdot D \cdot D \cdot u = \lambda D \cdot u \equiv \tilde R \tilde u = \lambda \tilde u\]where \(\tilde R\) is hermitian and \(u = D^{-1}\tilde u\).
Parameters: - r – correlation matrix \(R(t_j-s_i)\)
- w – integrations weights \(w_i\) (they have to correspond to the discrete time \(t_i\))
Returns: eigenvalues, eigenvectos (eigenvectos are stored in the normal numpy fashion, ordered in decreasing order)
See also
There are various convenient functions to calculate the integration weights and times to approximate the integral over the interval [0, t_max] using ng points.
get_mid_point_weights_times()
,get_trapezoidal_weights_times()
,get_simpson_weights_times()
,get_four_point_weights_times()
,get_gauss_legendre_weights_times()
,get_tanh_sinh_weights_times()
Note
It has been noticed that the performance of the various weights depends on the auto correlation function. As default one should use the ‘simpson weights’. ‘four point’, ‘gauss legendre’ and ‘tanh sinh’ might perform better for auto correlation function that decay slowly. Their advantage becomes evident for a large numbers of grid points only. So if one cares about relative differences below 1e-4 the more sophisticated weights are suitable.
-
stocproc.method_kle.
get_mid_point_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via mid point rule.
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N
Returns: location of the grid points, corresponding weights
Because this function is intended to provide the weights to be used in
solve_hom_fredholm()
it stretches the homogeneous weights over the grid points starting from 0 up to t_max, so the term min_point is somewhat miss leading. This is possible because we want to simulate stationary stochastic processes which allows \(\alpha(t_i+\Delta/2 - (s_j+\Delta/2)) = \alpha(t_i-s_j)\).The N grid points are located at
\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]and the corresponding weights are
\[w_i = \Delta t = \frac{t_\mathrm{max}}{N} \qquad i = 0,1, ... N - 1\]
-
stocproc.method_kle.
get_trapezoidal_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via trapezoidal rule.
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N
Returns: location of the grid points, corresponding weights
The N grid points are located at
\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]and the corresponding weights are
\[w_0 = w_{N-1} = \Delta t /2 \qquad w_i = \Delta t \quad 0 < i < N - 1 \qquad \Delta t = \frac{t_\mathrm{max}}{N-1}\]
-
stocproc.method_kle.
get_simpson_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via simpson rule.
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N (needs to be odd)
Returns: location of the grid points, corresponding weights
The N grid points are located at
\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]and the corresponding weights are
\[w_0 = w_{N-1} = 1/3 \Delta t \qquad w_{2i} = 2/3 \Delta t \quad 0 < i < (N - 1)/2\]\[\qquad w_{2i+1} = 4/3 \Delta t \quad 0 \leq i < (N - 1)/2 \qquad \Delta t = \frac{t_\mathrm{max}}{N-1}\]
-
stocproc.method_kle.
get_four_point_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via four-point Newton-Cotes rule.
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N (needs to be (4k+1) where k is an integer greater 0)
Returns: location of the grid points, corresponding weights
The N grid points are located at
\[t_i = i \frac{t_\mathrm{max}}{N-1} \qquad i = 0,1, ... N - 1\]and the corresponding weights are
\[w_0 = w_{N-1} = 28/90 \Delta t\]\[w_{4i+1} = w_{4i+3} = 128/90 \Delta t \quad 0 \leq i < (N - 1)/4\]\[w_{4i+2} = 48/90 \Delta t \quad 0 \leq i < (N - 1)/4\]\[w_{4i} = 56/90 \Delta t \quad 0 < i < (N - 1)/4\]\[\Delta t = \frac{t_\mathrm{max}}{N-1}\]
-
stocproc.method_kle.
get_gauss_legendre_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via Gauss integration by expanding the function in terms of Legendre Polynomials.
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N
Returns: location of the grid points, corresponding weights
-
stocproc.method_kle.
get_tanh_sinh_weights_times
(t_max, num_grid_points)[source]¶ Returns the weights and grid points for numeric integration via Tanh-Sinh integration. The idea is to transform the integral over a finite interval \(x \in [-1, 1]\) via the variable transformation
\[x = \tanh(\pi/2 \sinh(t))\]to a integral over the entire real axis \(t \in [-\infty,\infty]\) but where the new transformed integrand decay rapidly such that a simply midpoint rule performs very well.
inspired by ‘Tanh-Sinh High-Precision Quadrature - David H. Bailey’
Parameters: - t_max – end of the interval for the time grid \([0,t_\mathrm{max}]\)
- num_grid_points – number of grid points N (needs to be odd)
Returns: location of the grid points, corresponding weights
For a fixed small parameter h the location of the grid points read
\[x_i = \tanh(\pi/2 \sinh(ih)\]with corresponding weights
\[w_i = \frac{\pi/2 \cosh(ih)}{\cosh^2(\pi/2 \sinh(ih))}\]where i can be any integer. For a given number of grid points N, h is chosen such that \(w_{(N-1)/2} < 10^{-14}\) which implies odd N. With that particular h \(x_i\) and \(w_i\) are calculated for \(-(N-1)/2 \leq i \leq (N-1)/2\). Afterwards the \(x_i\) are linearly scaled such that \(x_{-(N-1)/2} = 0\) and \(x_{(N-1)/2} = t_\mathrm{max}\).
-
stocproc.method_kle.
auto_ng
(corr, t_max, ngfac=2, meth=<function get_mid_point_weights_times>, tol=0.001, diff_method=’full’, dm_random_samples=10000, ret_eigvals=False, relative_difference=False)[source]¶ increase the number of gridpoints until the desired accuracy is met
This function increases the number of grid points of the discrete Fredholm equation exponentially until a given accuracy is met. The accuracy is determined from the deviation of the approximated auto correlation of the Karhunen-Loève expansion from the given reference auto correlation.
\[\Delta(n) = \max_{t,s \in [0,t_\mathrm{max}]}\left( \Big | \alpha(t-s) - \sum_{i=1}^n \lambda_i u_i(t) u_i^\ast(s) \Big | \right )\]Parameters: - corr – the auto correlation function
- t_max – specifies the interval [0, t_max] for which the stochastic process can be evaluated
- ngfac – specifies the fine grid to use for the spline interpolation, the intermediate points are calculated using integral interpolation
- meth – the method for calculation integration weights and times, a callable or one of the following strings ‘midpoint’ (‘midp’), ‘trapezoidal’ (‘trapz’), ‘simpson’ (‘simp’), ‘fourpoint’ (‘fp’), ‘gauss_legendre’ (‘gl’), ‘tanh_sinh’ (‘ts’)
- tol – defines the success criterion max(abs(corr_exact - corr_reconstr)) < tol
- diff_method – either ‘full’ or ‘random’, determines the points where the above success criterion is evaluated, ‘full’: full grid in between the fine grid, such that the spline interpolation error is expected to be maximal ‘random’: pick a fixed number of random times t and s within the interval [0, t_max]
- dm_random_samples – the number of random times used for diff_method ‘random’
- ret_eigvals – if True, return also the eigen values
- relative_difference – if True, use relative difference instead of absolute
Returns: an array containing the necessary eigenfunctions of the Karhunen-Loève expansion for sampling the stochastic processes (shape=(num_eigen_functions, num_grid_points)
- The procedure works as follows:
Solve the discrete Fredholm equation on a grid with ng points. This gives ng eigenvalues/vectors where each ng-dimensional vector approximates the continuous eigenfunction. (\(t, u_i(t) \leftrightarrow t_k, u_{ik}\) where the \(t_k\) depend on the integration weights method) For performance reasons, especially when the auto correlation function evaluates slowly it is advisable to use a method which uses equally distributed times :math;`t_k`.
Approximate the eigenfunction on a finer, equidistant grid (\(ng_\mathrm{fine} = ng_\mathrm{fac}(ng-1)+1\)) using
\[u_i(t) = \frac{1}{\lambda_i} \int_0^{t_\mathrm{max}} \mathrm{d}s \; \alpha(t-s) u_i(s) \approx \frac{1}{\lambda_i} \sum_k w_k \alpha(t-s_k) u_{ik}\]According to the Numerical Recipes [1] this interpolation should perform better that simple spline interpolation. However it turns that this is not the case in general (e.g. for exponential auto correlation functions the spline interpolation performs better). For that reason it might be usefull to set ngfac to 1 which will skip the integral interpolation
Use the eigenfunction on the fine grid to setup a cubic spline interpolation.
Use the spline interpolation to estimate the deviation \(\Delta(n)\). When using diff_method = ‘full’ the maximization is performed over all \(t'_i, s'_j\) where \(t'_i = (t_i + t_{i+1})/2\) and \(s'_i = (s_i + s_{i+1})/2\) with \(i,j = 0, \, ...\, , ng_\mathrm{fine}-2\). It is expected that the interpolation error is maximal when beeing in between the reference points.
Now calculate the deviation \(\Delta(n)\) for sequential n starting at n=0. Stop if \(\Delta(n) < tol\). If the deviation does not drop below tol for all \(0 \leq n < ng-1\) increase ng as follows \(ng = 2*ng-1\) and start over at 1). (This update scheme for ng asured that ng is odd which is needed for the ‘simpson’ and ‘fourpoint’ integration weights)
Note
The scaling of the error of the various integration methods does not correspond to the scaling of the number of eigenfunctions to use in order to reconstruct the auto correlation function within a given tolerance. Surprisingly it turns out that in general the most trivial mid-point method performs quite well. If other methods suite bettern needs to be check in every case.
[1] Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 2007. Numerical Recipes 3rd Edition: The Art of Scientific Computing, Auflage: 3. ed. Cambridge University Press, Cambridge, UK ; New York. (pp. 990)