Why does a Gaussian process have the marginalisation/consistency property?

Question

According to the book GPML, " ... A Gaussian process is defined as a collection of random variables. Thus, the definition automatically implies a consistency requirement, ...". Why does this definition automatically imply the consistency requirement? And is this requirement the same as the marginalisation property?

Marginalization property is just taking the univariate marginal distribution of one component of the Gaussian multivariate distribution. See for example http://fourier.eng.hmc.edu/e161/lectures/gaussianprocess/node7.html theorem 4 part a. — mugen, Apr 01 '17 at 14:10

score 4 · Answer 1 · answered Mar 30 '22 at 18:29

A Gaussian process $\{X(t)\colon t \in \mathbb T\}$ is not defined as just a collection of Gaussian random variables; there is also the requirement that

for every $n \geq 1$, every finite collection $\{X(t_1), X(t_2), \cdots, X(t_n)\colon t_1, t_2, \cdots, t_n \in \mathbb T\}$ of $n$ random variables from the process enjoys a multivariate Gaussian (also called jointly Gaussian) distribution.

The more facile definition of a Gaussian process used by the OP restricts $n$ to be just $1$. Now, if $\{X(t_1), X(t_2), \cdots, X(t_n)\colon t_1, t_2, \cdots, t_n \in \mathbb T\}$ are jointly Gaussian, then any subset of two or more of these variables is also jointly Gaussian, and of course, each of the random variables is individually (that is, marginally) Gaussian. Furthermore, these marginal distributions are consistent: the distribution of $X(t_1)$ as obtained via marginalization from the joint distribution of $\{X(t_1), X(t_2)\}$ cannot be different from the distribution of $X(t_1)$ as obtained via marginalization from the joint distribution of $\{X(t_1), X(t_3)\}$, because both of these bivariate distributions are themselves obtained by marginalization of the jointly Gaussian trivariate distribution of $\{X(t_1), X(t_2), X(t_3)\}$.

Thus, the consistency requirement is baked into the correct definition of a Gaussian process.
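The argument can be checked numerically. What follows is only a minimal sketch, not part of the original answer: the squared-exponential kernel, the zero mean, the three time points, and the helper names `rbf_kernel`, `joint_law` and `marginal` are all assumptions made for illustration. It builds the joint Gaussian laws of $\{X(t_1), X(t_2)\}$, $\{X(t_1), X(t_3)\}$ and $\{X(t_1), X(t_2), X(t_3)\}$ and compares the marginals they induce.

```python
import numpy as np

def rbf_kernel(s, t, length_scale=1.0):
    """Squared-exponential covariance k(s, t); an illustrative choice only."""
    s = np.asarray(s, dtype=float)[:, None]
    t = np.asarray(t, dtype=float)[None, :]
    return np.exp(-0.5 * ((s - t) / length_scale) ** 2)

def joint_law(ts):
    """Mean vector and covariance matrix of (X(t) for t in ts); zero mean assumed."""
    ts = np.asarray(ts, dtype=float)
    return np.zeros(len(ts)), rbf_kernel(ts, ts)

def marginal(mean, cov, keep):
    """Marginalise a Gaussian by keeping only the listed coordinates."""
    keep = np.asarray(keep)
    return mean[keep], cov[np.ix_(keep, keep)]

t1, t2, t3 = 0.0, 0.7, 2.0  # hypothetical index points

# Law of X(t1) obtained from three different joint distributions.
from_12 = marginal(*joint_law([t1, t2]), keep=[0])
from_13 = marginal(*joint_law([t1, t3]), keep=[0])
from_123 = marginal(*joint_law([t1, t2, t3]), keep=[0])

# Law of (X(t1), X(t3)) obtained directly and from the trivariate joint.
pair_direct = joint_law([t1, t3])
pair_from_123 = marginal(*joint_law([t1, t2, t3]), keep=[0, 2])

consistent = (
    np.allclose(from_12[1], from_13[1])
    and np.allclose(from_12[1], from_123[1])
    and np.allclose(pair_direct[1], pair_from_123[1])
)
print(consistent)
```

The check succeeds because marginalising a Gaussian amounts to selecting a sub-vector of the mean and a sub-block of the covariance, and each entry of that block depends only on the two index points involved. This illustrates the consistency property for one kernel; it does not prove it.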

sicheng mao · Answer 2 · 2022-03-31T06:50:49.407

It is actually a good question, which shows a subtlety in the definition of a general (not necessarily Gaussian) stochastic process. I hope this answer is not too late for you.

GPML says that a stochastic process is defined as a collection of random variables with a law. Since these random variables are themselves mappings from a probability space to a measurable space, there is already a probability measure on the probability space on which the stochastic process is defined. Therefore the law of the stochastic process is already implied by the collection of random variables. This is guaranteed by the Kolmogorov extension theorem.

This theorem has multiple names:

1. Kolmogorov extension theorem: focusing on the fact that the law of the stochastic process is (naturally) extended from the laws of finite collections of its random variables.

2. Kolmogorov existence theorem: focusing on the fact that the stochastic process exists, in the sense that it really is something "random" that is equipped with a law (not just a plain collection of random variables).

3. Kolmogorov consistency theorem: focusing on the fact that if we assume the stochastic process exists, then its law should be consistent with the laws of its components (the random variables).

Applying the theorem to this particular question: when defining a Gaussian process, we define its law through the law of every finite-dimensional subset of the collection, for which it suffices to specify the covariance function (the mean function is not essential for the law, since it only translates the distribution).
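As a rough sketch of what "specifying the law through finite-dimensional subsets" looks like in practice, the snippet below builds the law of $(X(t_1), \ldots, X(t_n))$ from a mean function and a covariance function, and shows that changing the mean function only shifts a sample. The exponential kernel, the linear mean, the index points, and the names `finite_dim_law`, `sample` and `exp_cov` are hypothetical choices, not taken from GPML or from this answer.

```python
import numpy as np

def finite_dim_law(ts, mean_fn, cov_fn):
    """Mean vector and covariance matrix of (X(t) for t in ts)."""
    ts = np.asarray(ts, dtype=float)
    mean = mean_fn(ts)
    cov = cov_fn(ts[:, None], ts[None, :])  # broadcast to an n-by-n matrix
    return mean, cov

def sample(mean, cov, z, jitter=1e-10):
    """Map standard normal draws z to a draw from N(mean, cov) via Cholesky."""
    chol = np.linalg.cholesky(cov + jitter * np.eye(len(mean)))
    return mean + chol @ z

def exp_cov(s, t):
    """Exponential covariance exp(-|s - t|); an illustrative choice only."""
    return np.exp(-np.abs(s - t))

ts = [0.0, 0.5, 1.5, 3.0]  # hypothetical index points
rng = np.random.default_rng(0)
z = rng.standard_normal(len(ts))  # the same underlying noise for both draws

zero_mean, cov = finite_dim_law(ts, lambda t: np.zeros_like(t), exp_cov)
lin_mean, _ = finite_dim_law(ts, lambda t: 2.0 * t + 1.0, exp_cov)

x_centered = sample(zero_mean, cov, z)
x_shifted = sample(lin_mean, cov, z)

# The two draws differ exactly by the mean vector: the mean is a translation.
print(np.allclose(x_shifted - x_centered, lin_mean))
```

Drawing both samples from the same noise vector `z` is a deliberate design choice: it makes the translation visible without any Monte Carlo averaging.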

So

Can I know why this definition automatically defines the consistency requirement?

It is not automatic: the Kolmogorov extension theorem is behind it (the third aspect in the list above).

Which is also the marginalisation property?

In the Wikipedia page on the Kolmogorov extension theorem, consistency condition (2) says:

for all measurable sets $F_{i} \subseteq \mathbb{R}^{n}$ and all $m \in \mathbb{N}$, $$ \nu_{t_{1} \ldots t_{k}}\left(F_{1} \times \cdots \times F_{k}\right)=\nu_{t_{1} \ldots t_{k}, t_{k+1}, \ldots, t_{k+m}}(F_{1} \times \cdots \times F_{k} \times \underbrace{\mathbb{R}^{n} \times \cdots \times \mathbb{R}^{n}}_{m}) $$

Basically, it means that when I measure a set, it makes no difference which measure I use, a joint one or a marginal one, as long as the set is measurable (that is, it belongs to the measurable space on which the measure is defined).

I strongly recommend reading the Wikipedia page for more insight.
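For a Gaussian process, this condition can be written out explicitly. The following is a sketch under assumed notation that does not appear in the answer itself: $\mu(\cdot)$ for the mean function, $k(\cdot,\cdot)$ for the covariance function, $\mathbf{X}_a = (X(t_1), \ldots, X(t_k))$ and $\mathbf{X}_b = (X(t_{k+1}), \ldots, X(t_{k+m}))$. Suppose the joint law is

$$ \begin{pmatrix} \mathbf{X}_a \\ \mathbf{X}_b \end{pmatrix} \sim \mathcal{N}\left( \begin{pmatrix} \boldsymbol{\mu}_a \\ \boldsymbol{\mu}_b \end{pmatrix}, \begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix} \right). $$

Integrating out $\mathbf{X}_b$, which corresponds to putting the whole space in the last $m$ slots of the condition above, gives

$$ \mathbf{X}_a \sim \mathcal{N}\left(\boldsymbol{\mu}_a, \Sigma_{aa}\right), \qquad (\boldsymbol{\mu}_a)_i = \mu(t_i), \qquad (\Sigma_{aa})_{ij} = k(t_i, t_j), \qquad 1 \leq i, j \leq k. $$

Neither $\boldsymbol{\mu}_a$ nor $\Sigma_{aa}$ involves $t_{k+1}, \ldots, t_{k+m}$, so the marginal obtained from the larger joint law coincides with the law $\nu_{t_1 \ldots t_k}$ specified directly from $\mu$ and $k$.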

That's all.

@whuber thank you for the reminder; I have corrected it.

CrossValidated does support inline $\TeX,$ as in this very sentence. Use $$$ as delimiters. Help is available when you are in the editing dialog. — whuber, Mar 30 '22 at 16:48
