I need help with my line of reasoning and with how to conclude it, because I'm unsure about my conclusion. A bivariate normal distribution whose two random variables have correlation coefficient 1 does not have a pdf. But suppose we try to sketch the distribution by finding the probabilities of different intervals on which Y=aX+b, where a and b are real constants (because X and Y are linearly related). If we sketch this, we get a straight line in 3D: some probability, say p (with p in [0,1]), over the intervals where X and Y are linearly related, and probability 0 everywhere else. My question is: can I conclude that a bivariate normal distribution with correlation coefficient 1 can still be drawn as a function, despite not having a pdf?
Your ideas about what is meant by a pdf, whether univariate or bivariate, are very much incorrect, and unless you understand what is wrong and correct them, any answers to your question will not make much sense to you. – Dilip Sarwate Aug 20 '16 at 16:19
Oh right! Probability is measured in intervals for a continuous RV! My bad. Will correct it right away. Thank you. – Sweta95 Aug 20 '16 at 16:21
2 Answers
For a bivariate normal PDF, you can visualize the shape in terms of the covariance ellipse. If $x$ and $y$ are uncorrelated and have equal variances, this will be a circle (centered on the 2D mean of the PDF). As you increase the correlation, the ellipse becomes more anisotropic, converging in the limit to a line segment of "width" 0, oriented at +45 degrees in this equal-variance case. (The principal axes of the ellipse correspond to the principal components.)
The resulting infinitely high linear ridge could be considered a PDF or not, depending on the definitions used.
For example, as noted on Wikipedia, the conditional PDF of $X_1$ given that $X_2=x_2$ will be a normal distribution
$X_1\mid X_2=x_2 \ \sim\ \mathcal{N}\left(\mu_1+\frac{\sigma_1}{\sigma_2}\rho( x_2 - \mu_2),\, (1-\rho^2)\sigma_1^2\right)$,
where $\rho$ is the correlation coefficient. So in the perfectly correlated limit, where $\rho$ goes to 1, this conditional normal distribution has zero variance. Is this a "PDF"? The limit of a Gaussian as its variance goes to zero is in fact one standard way of defining the Dirac delta function. As to whether or not this is a "valid PDF", opinions vary.
Of course in practical terms, the marginal PDFs for $x_1$ and $x_2$ will each just be univariate normal.
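As a rough numerical illustration of the conditional formula above, the following sketch evaluates the conditional mean and variance for correlations approaching 1. The function name `conditional_params` and all parameter values are assumptions made for this example only; it is not a general-purpose implementation.

```python
def conditional_params(x2, mu1, mu2, sigma1, sigma2, rho):
    """Mean and variance of X1 | X2 = x2 for a bivariate normal."""
    mean = mu1 + (sigma1 / sigma2) * rho * (x2 - mu2)
    var = (1.0 - rho**2) * sigma1**2
    return mean, var


# Hypothetical parameter values, chosen only for illustration.
mu1, mu2 = 0.0, 0.0
sigma1, sigma2 = 1.0, 2.0
x2 = 1.0

for rho in (0.0, 0.9, 0.99, 0.999, 1.0):
    mean, var = conditional_params(x2, mu1, mu2, sigma1, sigma2, rho)
    # Determinant of the 2x2 covariance matrix: sigma1^2 * sigma2^2 * (1 - rho^2).
    det = (sigma1 * sigma2) ** 2 * (1.0 - rho**2)
    print(f"rho={rho}: cond. mean={mean:.4f}, cond. var={var:.6f}, det={det:.6f}")
```

The conditional variance and the determinant of the covariance matrix shrink to zero together as $\rho$ approaches 1, while the marginal variances $\sigma_1^2$ and $\sigma_2^2$ do not depend on $\rho$ at all.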
In linear algebra terms, the covariance matrix is rank-deficient, and hence singular, corresponding to the zero principal variance (singular value). This means it has a zero determinant, making the normalization factor for the Gaussian infinite (1/0). However, the pseudo-inverse of the covariance can be used to define the Mahalanobis distance in the exp() argument. So the "likelihood" part could be formally evaluated, I think. – GeoMatt22 Aug 20 '16 at 17:44
Thank you very much. That also answered my unasked questions regarding how the correlation affects the graph of a bivariate normal distribution. – Sweta95 Aug 20 '16 at 17:55
The "resulting infinitely high linear ridge" is an incomplete description of the distribution. You also need to supply a density along that line for it to have any meaning. – whuber Aug 21 '16 at 15:14
@whuber My comment was as far as I got along those lines mathematically. I think the pseudo-inverse of the covariance matrix should just be $\Sigma^+=\Sigma/\sigma^4$, where $\Sigma$ is the covariance matrix and $\sigma^2$ is the 1-norm of the diagonal of $\Sigma$, i.e. its trace (= the non-zero singular value). – GeoMatt22 Aug 21 '16 at 16:18
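The pseudo-inverse relation suggested in the comments can be checked numerically. The sketch below assumes NumPy, uses hypothetical standard deviations `s1` and `s2`, and takes $\sigma^2$ to be the trace of $\Sigma$, which for a rank-1 covariance equals its only non-zero eigenvalue. It is a sanity check under those assumptions, not a proof.

```python
import numpy as np

# Hypothetical standard deviations, chosen only for illustration.
s1, s2 = 1.0, 2.0
rho = 1.0  # perfectly correlated case

# Singular (rank-1) covariance matrix.
Sigma = np.array([[s1**2, rho * s1 * s2],
                  [rho * s1 * s2, s2**2]])

# For a rank-1 covariance, the only non-zero eigenvalue equals the trace.
lam = np.trace(Sigma)

pinv_formula = Sigma / lam**2                    # Sigma / sigma^4 from the comment
pinv_numpy = np.linalg.pinv(Sigma, rcond=1e-10)  # explicit cutoff for the zero singular value

print("formula matches np.linalg.pinv:", np.allclose(pinv_formula, pinv_numpy))

# Squared Mahalanobis-type distance for a point ON the support line.
# With zero mean, points on the line are t * (s1, s2).
t = 1.0
x = t * np.array([s1, s2])
d2 = x @ pinv_numpy @ x
print("squared distance along the line:", d2)
```

Note that the pseudo-inverse only measures distance along the support line. Points off that line have probability zero, which this quadratic form alone does not express.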
Consider two variables defined as $u_1=m+n$ and $u_2=m+b+n$, where $m$ and $b$ are fixed constants and $n$ is normally distributed, $n\sim \mathcal{N}(0,\sigma^2)$. Then the joint distribution factorizes as $p(u_1,u_2|m,b)=p(u_2|u_1,b)p(u_1|m)$, and $u_2$ and $u_1$ are perfectly correlated since $u_2=u_1+b$. We can then write the joint distribution as $p(u_1,u_2|m,b)=\mathcal{N}(u_1;m,\sigma^2)\delta(u_2-(u_1+b))$, where $\delta(\cdot)$ denotes the Dirac delta function.
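A small simulation can illustrate this construction. The sketch below assumes NumPy; the values of `m`, `b`, and `sigma`, the seed, and the sample size are all hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical constants, chosen only for illustration.
m, b, sigma = 1.0, 3.0, 2.0

# One shared noise draw per sample drives both variables.
n = rng.normal(0.0, sigma, size=10_000)
u1 = m + n
u2 = m + b + n

# Every sample lies on the line u2 = u1 + b (up to floating-point rounding).
on_line = np.allclose(u2, u1 + b)

# The sample correlation is 1 up to rounding error.
corr = np.corrcoef(u1, u2)[0, 1]

print("all samples on the line:", on_line)
print("sample correlation:", corr)
print("sample mean and std of u1:", u1.mean(), u1.std())
```

All of the probability mass sits on the line $u_2=u_1+b$, and the spread of the samples along that line follows the univariate normal factor $\mathcal{N}(u_1;m,\sigma^2)$.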