During my studies, I stumbled upon the following exercise:

We have the following joint probability distribution: $$p(x,y) = p(x) p(y|x)$$ $$p(x) = \mathcal{N}(0,1), p(y \mid x) = \frac{1}{2} \delta(y -x) + \frac{1}{2} \delta(y+x)$$ where $\delta(\cdot)$ is the Dirac delta function. The exercise then asks us to find the principal components of $p(x,y)$. It is hinted that this is equivalent to finding the parameters $\theta \in [0, 2 \pi [$ that maximize the variance of the projected data $z(\theta) = x \cos(\theta) + y \sin(\theta)$, since in linear component analysis for two-dimensional probability distributions the set of possible directions in $\mathbb{R}^2$ is given by: $\{ \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix}, 0 \leq \theta \leq 2\pi \}$.

Usually I would write down the Lagrangian and then derive the maximum, but I don't know what the Lagrangian would look like in this case. How would I go about solving this?

MLStudent

2 Answers

Since $E[z(\theta)]=0$, we have $$\operatorname{var}(z(\theta))=E[z(\theta)^2]=E[\cos(\theta)^2x^2+\sin(\theta)^2y^2+2\cos\theta\sin\theta xy]$$ Here $E[x^2]=1$ because $x \sim \mathcal{N}(0,1)$, and $E[y^2]=E[x^2]=1$ because $y=\pm x$. So the expression reduces to: $$\operatorname{var}(z(\theta))=1+\sin(2\theta) E[xy]$$ We also have $E[xy]=E[E[xy|x]]=E[xE[y|x]]=0$, since $E[y|x]=\tfrac{1}{2}x-\tfrac{1}{2}x=0$, which means $\operatorname{var}(z(\theta))=1$ for every $\theta$. This is an interesting dataset: the samples lie on the lines $y=x$ and $y=-x$, but we can't find these axes using this method. More importantly, $x$ and $y$ turn out to be uncorrelated.
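To connect this with the Lagrangian mentioned in the question, here is a sketch in notation that does not appear above and is assumed only for illustration: $w = (\cos\theta, \sin\theta)^\top$ is the unit direction, $\Sigma$ is the covariance matrix of $(x, y)$, and $\lambda$ is the multiplier for the constraint $w^\top w = 1$.

$$\mathcal{L}(w,\lambda) = w^\top \Sigma w - \lambda\,(w^\top w - 1), \qquad \nabla_w \mathcal{L} = 0 \iff \Sigma w = \lambda w$$

$$\Sigma = \begin{pmatrix} E[x^2] & E[xy] \\ E[xy] & E[y^2] \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

With $\Sigma$ equal to the identity, every unit vector $w$ satisfies $\Sigma w = \lambda w$ with $\lambda = 1$, which agrees with $\operatorname{var}(z(\theta)) = 1$ for all $\theta$.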

gunes

4-fold rotational symmetry

The distribution looks like

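set.seed(1)                      # arbitrary seed, only so the plot is reproducible
n = 10^3
x = rnorm(n,0,1)                 # x ~ N(0,1)
y = x*(rbinom(n,1,0.5)*2-1)      # y = x or y = -x, each with probability 1/2

# semi-transparent points so that overlapping samples remain visible
plot(x,y, pch=21, col=rgb(0,0,0,0.1),bg=rgb(0,0,0,0.1))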

(Figure: scatter plot produced by the code above.)

It has a 4-fold rotational symmetry, which means that a quarter rotation leaves the distribution unchanged. That is, $p(x,y) = p(-y,x)$.

For any bivariate distribution with a 4-fold rotational symmetry, there are no unique principal components.

Motivation

For any rotation by an angle $\theta$, the sum of the variances of the transformed coordinates is invariant.

$$ \text{VAR}(X_\theta) + \text{VAR}(Y_\theta) = \text{VAR}(X) + \text{VAR}(Y) \tag{1}$$
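A short justification of equation 1, sketched under the assumption that the rotated coordinates are $X_\theta = X\cos\theta + Y\sin\theta$ and $Y_\theta = -X\sin\theta + Y\cos\theta$ (a definition that is not spelled out in this answer). A rotation preserves squared length, both for each sample point and for the mean vector:

$$X_\theta^2 + Y_\theta^2 = X^2 + Y^2, \qquad E[X_\theta]^2 + E[Y_\theta]^2 = E[X]^2 + E[Y]^2$$

Taking expectations of the first identity and subtracting the second gives

$$\text{VAR}(X_\theta) + \text{VAR}(Y_\theta) = E[X^2 + Y^2] - \left(E[X]^2 + E[Y]^2\right) = \text{VAR}(X) + \text{VAR}(Y)$$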

Because of the symmetry we have, for any integer $k$, that $$\text{VAR}(X_{\theta + 0.5 k \pi }) = \text{VAR}(X_{\theta}) \\ \text{VAR}(Y_{\theta + 0.5 k \pi }) = \text{VAR}(Y_{\theta}) \tag{2}$$

Because the axes are 90 degrees rotated from each other we also have

$$\text{VAR}(X_{\theta+ 0.5 \pi}) = \text{VAR}(Y_{\theta}) \\ \text{VAR}(Y_{\theta-0.5 \pi}) = \text{VAR}(X_{\theta}) \tag{3}$$

Combining equations 2 and 3, it follows that for any rotation $\theta$

$$\text{VAR}(X_{\theta}) = \text{VAR}(Y_{\theta})$$

and because, by equation 1, the variances must sum to a constant, after any rotation the variance of each coordinate equals half the total variance.

The consequence is that we cannot choose any unique direction for the components.
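This conclusion can be checked numerically. The following is a minimal sketch, not part of the original argument: the seed, the sample size `n`, the angle grid `theta`, and the name `proj_var` are arbitrary choices made for illustration. It simulates the distribution from the question and estimates the variance of the projection $x\cos\theta + y\sin\theta$ on a grid of angles.

```r
# Simulate the distribution from the question:
# x ~ N(0, 1), and y = +x or -x with equal probability.
set.seed(1)                      # arbitrary seed, only for reproducibility
n <- 1e5                         # illustrative sample size
x <- rnorm(n)
y <- x * (2 * rbinom(n, 1, 0.5) - 1)

# Sample variance of the projection x*cos(theta) + y*sin(theta)
# for a grid of directions theta in [0, 2*pi].
theta <- seq(0, 2 * pi, length.out = 25)
proj_var <- sapply(theta, function(t) var(x * cos(t) + y * sin(t)))

# If no direction is preferred, the smallest and largest values
# should both be close to 1, up to sampling noise.
print(round(range(proj_var), 2))

# The two eigenvalues of the sample covariance matrix should be nearly equal.
print(eigen(cov(cbind(x, y)))$values)
```

If the projected variance is flat in $\theta$ and the two eigenvalues are close to each other, any direction reported by a PCA routine on such a sample is determined by sampling noise rather than by the distribution.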

  • Be careful, because there may be a flaw in this logic. Everything hinges on the exact definition of "symmetrical." The fact that the variances in the two coordinates are equal does not imply the principal components have equal variances! What, then, are you claiming is symmetrical and what is the symmetry group? – whuber Feb 27 '23 at 15:33
  • @whuber if any two orthogonally transformed coordinates have equal variance, then the same will be true for the two principal components, which are orthogonal as well. – Sextus Empiricus Feb 27 '23 at 16:38
  • @whuber I get your comment when you are saying that I am not explaining the idea correctly/accurately, but the idea is correct right, or not? – Sextus Empiricus Feb 27 '23 at 16:40
  • As you know, I love demonstrations by symmetry or other insightful means that avoid calculations. But that assertion in your first comment appears incorrect, at least as I interpret it. As a counterexample I'm thinking of a Bivariate Normal distribution with equal unit variances and non-zero correlation coefficient (as illustrated in my post at https://stats.stackexchange.com/a/71303/919). Specifically, the log density is $-(x^2+y^2-2\rho xy)/(2(1-\rho^2))+C.$ The unique PCs are parallel to $(1,1)$ and $(1,-1)$ but they have different variances. – whuber Feb 27 '23 at 20:19
  • @whuber ah, ok, but the 90° rotational symmetry makes this different right? Do I just need to explain that better? I was thinking that any PC 2 must be 90° rotated to PC 1 and should, due to the rotational symmetry, have the same variance. The distribution is the same after a 90° rotation. – Sextus Empiricus Feb 27 '23 at 21:26
  • Well, that's what I was wondering: what has rotational symmetry here? Not the distribution $p$! That has a $D_4$ symmetry group only. – whuber Feb 27 '23 at 21:54
  • @whuber any orthogonal transformation will have the axes at a 90° angle, and the variance of the coordinates along those axes will be the same because the distribution is invariant under a 90° rotation. – Sextus Empiricus Feb 28 '23 at 06:48
  • I hate to keep repeating, but -- orthogonal transformation of what? Your conclusion about the variance seems to contradict the behavior of bivariate Normal distributions, where the variances along the coordinates do change under orthogonal transformations. – whuber Feb 28 '23 at 13:54
  • @whuber, I don't believe that that is the case if the bivariate normal distribution has a 4-fold rotational symmetry. – Sextus Empiricus Feb 28 '23 at 14:02
  • That is correct: but it needs proof (which admittedly is simple). The result does not directly follow from some vague assertion of "rotational symmetry." – whuber Feb 28 '23 at 14:10
  • @whuber so you agree that the idea is correct? And it is only not sufficiently explained/proven? In the case of the latter, what is exactly vague? The idea that the distribution $p(x,y)$ has a 90 degrees rotational symmetry, or that this symmetry implies that the variance of the coordinates along two orthogonal coordinate axes are gonna be the same? – Sextus Empiricus Feb 28 '23 at 14:14
  • I agree that the idea is correct but just needs a clearer explanation. (And +1 for supplying it!) – whuber Feb 28 '23 at 14:16