Let's say I simulate an immigration-death process:

$P(X(t + \delta t) = x+ 1 | X(t) = x) = \lambda \delta t$

$P(X(t + \delta t) = x-1 | X(t) =x) = v x \delta t$

using a Gillespie simulation. I pick a waiting time using

$\tau = \frac{1}{\lambda + v x(t)} \ln(1/r_1)$ where $r_1 \sim U(0,1)$

then select a reaction using

$$x(t + \tau) = \begin{cases} x(t) + 1, \text{ if } r_2 < \lambda / (\lambda + v x(t)) \\ x(t) - 1, \text{ if } r_2 \geq \lambda / (\lambda + v x(t)) \end{cases} $$

where $r_2 \sim U(0,1) $.
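A minimal sketch of this scheme in Python (the function and parameter names, and the use of `numpy`, are my own choices for illustration, not part of the algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

def gillespie_immigration_death(lam, v, x0, t_max):
    """Gillespie simulation of the immigration-death process up to time t_max.

    lam = immigration rate (lambda), v = per-capita death rate, x0 = initial count.
    Returns the sequence of states x_i and the waiting time tau_i spent in each.
    """
    t, x = 0.0, x0
    states, waits = [], []
    while t < t_max:
        total_rate = lam + v * x
        # Waiting time: tau = ln(1/r1) / (lambda + v*x), r1 ~ U(0,1)
        tau = np.log(1.0 / rng.uniform()) / total_rate
        states.append(x)
        waits.append(tau)
        # Reaction selection: immigration with probability lambda / (lambda + v*x)
        if rng.uniform() < lam / total_rate:
            x += 1
        else:
            x -= 1
        t += tau
    return np.array(states), np.array(waits)
```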

If I run this for long enough it will eventually fluctuate around a steady state. Let's say I want the variance around that steady state. In this simple case I could calculate it analytically, but that quickly becomes intractable as the system becomes more complex.

I can estimate the mean as a time-weighted average: sum the value of $x$ multiplied by the length of time spent at that value, then divide by the total length of time the system is simulated for (after a burn-in to reach "steady state").
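For example, continuing the sketch above (the rates, burn-in cut-off and simulation length are arbitrary illustrative values):

```python
states, waits = gillespie_immigration_death(lam=10.0, v=0.5, x0=0, t_max=10_000.0)

# Discard everything before an (assumed) burn-in time, then take the
# time-weighted mean: sum_i tau_i * x_i / sum_i tau_i
burn_in = 100.0
keep = np.cumsum(waits) > burn_in
x_i, tau_i = states[keep], waits[keep]
x_bar = np.average(x_i, weights=tau_i)
```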

But what about the variance? I can do a biased estimate of the variance with

$$ \hat\sigma^2 = \frac{\sum_i \tau_i (x_i - \bar x)^2}{\sum_i \tau_i} $$

where $x_i$ is the state visited at step $i$ and $\tau_i$ is the length of time spent in that state.
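With the arrays from the sketch above, that estimator is:

```python
# Biased, time-weighted variance: sum_i tau_i * (x_i - x_bar)^2 / sum_i tau_i
sigma2_hat = np.sum(tau_i * (x_i - x_bar) ** 2) / np.sum(tau_i)
# equivalently: np.average((x_i - x_bar) ** 2, weights=tau_i)
```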

Is there an unbiased estimate? The answer here suggests it depends on whether the weights are probability weights or frequency weights. The weights $\tau_i$ seem to be playing the role of frequency weights: they are not normalized to 1 and they describe how long $x_i$ was observed. But they are not unitless. So if we tried to unbias the estimate by making the denominator $\sum_i \tau_i - 1$, the answer would be very different depending on whether $\tau$ was measured in hours, minutes or seconds.
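To make the unit-dependence concrete (a quick numerical check of the argument above, reusing `x_i`, `tau_i` and `x_bar` from the sketch; the 3600 is just an hours-to-seconds rescaling):

```python
def freq_weight_corrected(x, tau, x_bar):
    """Variance with the naive frequency-weight correction sum(tau) - 1."""
    return np.sum(tau * (x - x_bar) ** 2) / (np.sum(tau) - 1.0)

v_hours   = freq_weight_corrected(x_i, tau_i, x_bar)           # tau in hours
v_seconds = freq_weight_corrected(x_i, 3600.0 * tau_i, x_bar)  # same data in seconds
# v_hours != v_seconds: the "- 1" correction changes with the time unit,
# whereas the ratio-form estimator above is invariant to rescaling tau.
```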

Does it even matter?

  • How do you define the "variance around a steady state" when the process is serially correlated? – whuber Apr 20 '22 at 19:50
  • Thanks, that's an interesting point I hadn't considered. I guess what I'm trying to measure is: if I sample at any random point, how far am I likely to be from the mean (I'd like to know how noisy the system is)? I suppose what I'm really after is an estimate of the variance of X(t). For simple systems like the immigration-death process, it's possible to solve this and show that the limiting distribution is $X(t) \sim \text{Poisson}(\lambda/v)$. But this is not possible for more complex systems. It is possible to simulate, and there is a quantity that is the average squared distance from the mean for that run. – Ian Sudbery Apr 21 '22 at 08:12
  • Is it the case that this mean squared distance from the mean is not a variance because the process is serially correlated? Or that it is not a good estimate of the mean squared distance from the mean in other runs of the simulation? Or that variance over time is not a good estimate of the variance at a particular time t, even if the limiting distribution of X(t) is independent of time? One can imagine running the simulation to a time t 1000 times and calculating the variance (sketched below), but if each run takes a minute, that's awfully slow. – Ian Sudbery Apr 21 '22 at 08:16
  • You could also imagine taking samples at time points sufficiently far apart that you could assume they were not correlated. Less wasteful, I guess, but more complex, and still somewhat wasteful. – Ian Sudbery Apr 21 '22 at 08:17
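A sketch of the replicate-ensemble approach discussed in the comments (the number of replicates, sample time and rates are arbitrary; `gillespie_immigration_death` is the helper sketched above):

```python
def variance_at_time_t(lam, v, x0, t, n_reps=1000):
    """Estimate Var[X(t)] from independent replicate runs (slow but uncorrelated)."""
    samples = []
    for _ in range(n_reps):
        states, _waits = gillespie_immigration_death(lam, v, x0, t_max=t)
        samples.append(states[-1])  # state occupied when the clock passes t
    return np.var(samples, ddof=1)

var_xt = variance_at_time_t(lam=10.0, v=0.5, x0=0, t=100.0)
```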

0 Answers