
Suppose we have a collection of bivariate random vectors ${\bf{X}}_i = (X_{1i} \ X_{2i})^T$, $1 \leq i \leq n$, each observed at a value $t_i$ of a continuous variable $t$, such that we can assume

\begin{equation*} {\bf{X}}_i \sim N({\bf{\mu}}(t_{i}), {\bf{\Sigma}}(t_{i})) \end{equation*}

for some unknown 2-dimensional vector ${\bf{\mu}}(t_{i})$ and a 2x2 covariance matrix ${\bf{\Sigma}}(t_{i})$, which must be estimated.

How can we estimate the form of the covariance matrix ${\bf{\Sigma}}(t)$ using independent observations $(X_{1i} \ X_{2i})^T$, $1 \leq i \leq n$, measured at $t_1, \ldots, t_n$?

One approach I have considered is data binning, i.e. dividing the range of values of $t$ into $m$ intervals and calculating $\bf{\Sigma}$ for data with $t$ in each interval $[a_j,b_j], \ 1 \leq j \leq m, \ 1 < m < n$. This is undesirable given the specific relationship I am dealing with but it could work.
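To make the binning idea concrete, here is a minimal sketch in Python. The function name `binned_covariance`, the equal-width bins, and the synthetic data are my own assumptions for illustration, not something taken from the paper cited below. It also treats $\mu(t)$ as constant within each bin, which is exactly the kind of implicit assumption binning makes.

```python
import numpy as np

def binned_covariance(t, X, m):
    """Sample covariance of X within each of m equal-width bins of t.

    t : (n,) array of time points; X : (n, 2) array of observations.
    Returns (centres, covs); covs[j] is NaN if bin j has fewer than 2 points.
    """
    t = np.asarray(t, dtype=float)
    X = np.asarray(X, dtype=float)
    edges = np.linspace(t.min(), t.max(), m + 1)
    # Bin index 0..m-1 for each t; the maximum of t is placed in the last bin.
    idx = np.clip(np.digitize(t, edges) - 1, 0, m - 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    covs = np.full((m, 2, 2), np.nan)
    for j in range(m):
        Xj = X[idx == j]
        if len(Xj) >= 2:
            # np.cov subtracts the within-bin mean (mu assumed constant per bin).
            covs[j] = np.cov(Xj, rowvar=False)
    return centres, covs

# Synthetic example (assumed): zero mean, standard deviation growing as 1 + t.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 10.0, size=2000)
X = rng.standard_normal((2000, 2)) * (1.0 + t)[:, None]
centres, covs = binned_covariance(t, X, m=5)
```

Each `covs[j]` is then a piecewise-constant estimate of $\Sigma(t)$ on bin $j$, so the result depends heavily on the choice of $m$.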

The specific problem I am looking at is TLE (two-line element set) covariance. The TLE is a set of numbers sent out by a satellite that can be used to approximate a satellite's position and velocity at a time at, before, or after the TLE is generated. Therefore, given a TLE and a time $t$ (measured from the TLE epoch, i.e. TLE generation time), the expected position of the satellite in 3D Euclidean space depends on $t$, as does the error ellipse (which expands with increasing $t$). The position uncertainty (i.e., information about the error ellipse) is not public, but attempts have been made to determine this from public data in papers like this one:

H. Yurong, L. Zhi and H. Lei, "Covariance propagation of two-line element data," 2016 Chinese Control and Decision Conference (CCDC), Yinchuan, China, 2016, pp. 3836-3841, doi: 10.1109/CCDC.2016.7531654.

This paper specifically uses data binning and mentions using "the quadratic polynomial fitting error standard deviation of propagation time function in least squares sense" to estimate the error evolution function (i.e. how the error estimate of the estimated position varies with time). I'm interested in ways to generate a function $\Sigma(t)$ to capture this evolution with time and investigate the behaviour of error ellipses at different times. I have stated the problem in 2 dimensions and with a Gaussian PDF as I think this would be a good starting point for understanding the theory.
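For the 2D Gaussian starting point, the error ellipse at a given time follows from the eigendecomposition of $\Sigma(t)$: the semi-axes are $k\sqrt{\lambda_1}$ and $k\sqrt{\lambda_2}$ along the eigenvectors. A minimal sketch, where the function name `error_ellipse` and the scale factor `k` are my own illustrative choices:

```python
import numpy as np

def error_ellipse(Sigma, k=1.0):
    """Semi-axes and orientation of the k-sigma ellipse of a 2x2 covariance.

    Returns (semi_major, semi_minor, angle), with angle in radians giving the
    direction of the major axis (defined only up to a shift of pi).
    """
    Sigma = np.asarray(Sigma, dtype=float)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    # Clip guards against tiny negative eigenvalues from rounding.
    semi_minor, semi_major = k * np.sqrt(np.clip(eigvals, 0.0, None))
    major_dir = eigvecs[:, 1]
    angle = np.arctan2(major_dir[1], major_dir[0])
    return semi_major, semi_minor, angle

# Example with an assumed covariance matrix.
a, b, theta = error_ellipse(np.array([[2.0, 1.0], [1.0, 2.0]]))
```

Applying this to estimates of $\Sigma(t)$ at several values of $t$ would show how the size and orientation of the ellipse change with propagation time.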

Tim C
  • I'm not sure that I understood your question. The means of random variables should not affect their covariances. So I would say that the covariance of X_1 and X_2 is simply Sigma and does not depend on t. – Pohoua Jul 11 '23 at 20:12
  • @Pohoua - you are right. Edited – Tim C Jul 11 '23 at 20:21
  • Your setup is unclear. Do you have independent observations $(X_{1i},X_{2i})$, $i=1,\ldots,n$, measured at time points $t_1,\ldots, t_n$? And is $(X_{1i},X_{2i})$ distributed as bivariate normal with mean given by the 2-dimensional vector $\mu(t_i)$ and variance given by the 2x2 matrix $\Sigma(t_i)$? – Rachel Altman Jul 11 '23 at 21:08
  • @RachelAltman yes, sorry for being unclear. I have edited the question. – Tim C Jul 12 '23 at 08:34
  • As things stand you only (at most) have a single observation $(X_{1t}, X_{2t})$ at time point $t$. Obviously this doesn't allow you to estimate a covariance matrix. You will need a model that tells you how to learn about time point $t$ from observations at other time points. Binning implicitly assumes that the covariance matrix is the same in some interval. This is doable but not necessarily the best thing in the given situation. My point is just that you need to make some decision of this kind. I'd try to use information about the meaning of the data to get an idea. – Christian Hennig Jul 12 '23 at 09:18
  • I don't think by the way that in such a setup one would refer to the X as "explanatory variables"; the title is rather misleading. What about something like "estimating covariance varying over time"? – Christian Hennig Jul 12 '23 at 09:20
  • @ChristianHennig yes, that I suppose is the core of the question. Estimating $\mu(t)$ is straightforward enough e.g. using regression, but it is not obvious to me how we can estimate $\Sigma(t)$. – Tim C Jul 12 '23 at 12:18
  • @ChristianHennig note that $t$ is the explanatory variable in question here. I have edited the title to make this clearer. – Tim C Jul 12 '23 at 12:27
  • recent related question: https://stats.stackexchange.com/q/621126/237561 – Ute Jul 12 '23 at 13:08
  • Your notation is still problematic. You actually have a collection of bivariate random variables. Try starting with "Suppose we have a bivariate time series, $\{ (X_{1t},X_{2t}) \}$, $t=1,\ldots,n$...". Now that you've told us that you have a time series, can you actually reasonably assume that the $ (X_{1t},X_{2t})$'s are independent? – Rachel Altman Jul 12 '23 at 14:48
  • Instead of binning you could use a kernel estimator, or just a moving average. Then the result ($\Sigma(t)$ as a function of $t$) will look smoother. – Ute Jul 12 '23 at 17:23
  • @RachelAltman sorry my notation is problematic. I have found some time to elaborate on the question. The most recent edit hopefully makes things clearer. – Tim C Jul 14 '23 at 18:16
  • This sounds remarkably like the three-dimensional problem discussed at https://stats.stackexchange.com/questions/34396 where a rolling PCA is used. It is tempting to model the first differences, which -- provided they are small -- can be decomposed into infinitesimal rotation matrices and differences of the log variances. It would be well to understand how your particular data might evolve over time, because there are at least nine parameters to track in the 3D case. A generic EDA solution will work only if the changes over time are very slow. – whuber Jul 14 '23 at 18:42
  • You're still starting your post with the idea of TWO random variables. But you actually have $n$ bivariate random vectors! The notation $X_1$ and $X_2$ doesn't make sense in this setting; you should use $X_{1i}$ and $X_{2i}$ throughout. – Rachel Altman Jul 14 '23 at 18:53
  • @RachelAltman I've completely overlooked this, you are right. Edited. – Tim C Jul 15 '23 at 06:36
  • @Ute thanks for the link and the suggestion. Will do some thinking and update. – Tim C Jul 15 '23 at 06:36
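
The kernel estimator suggested in the comments can be sketched as a locally weighted mean and covariance, which replaces the hard bin edges with smooth weights. This is only an illustration under assumptions: the Gaussian kernel, the bandwidth `h`, the function name `kernel_covariance`, and the synthetic data are all hypothetical choices, and the observations are taken to be independent.

```python
import numpy as np

def kernel_covariance(t0, t, X, h):
    """Gaussian-kernel weighted estimates of mu(t0) and Sigma(t0).

    t : (n,) time points; X : (n, 2) observations; h : bandwidth (assumed).
    No small-sample bias correction is applied.
    """
    t = np.asarray(t, dtype=float)
    X = np.asarray(X, dtype=float)
    w = np.exp(-0.5 * ((t - t0) / h) ** 2)
    w /= w.sum()                       # weights sum to 1
    mu = w @ X                         # locally weighted mean, shape (2,)
    R = X - mu                         # residuals about the local mean
    Sigma = (R * w[:, None]).T @ R     # locally weighted covariance, (2, 2)
    return mu, Sigma

# Synthetic example (assumed): zero mean, standard deviation growing as 1 + t.
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 10.0, size=2000)
X = rng.standard_normal((2000, 2)) * (1.0 + t)[:, None]
mu_early, Sigma_early = kernel_covariance(1.0, t, X, h=0.5)
mu_late, Sigma_late = kernel_covariance(9.0, t, X, h=0.5)
```

Evaluating `kernel_covariance` on a grid of `t0` values gives a smooth estimate of $\Sigma(t)$; the bandwidth `h` plays the role that the bin width plays in binning, and a local-constant mean like this can bias $\Sigma$ upward where $\mu(t)$ changes quickly relative to `h`.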

0 Answers