
Here is the situation. (This is not a homework problem.)

I am writing a program that does Cool And Interesting Things starting with a correlation matrix among 3 variables: call them $X$, $Y$, and $Z$. I want the user to be able to specify the correlation matrix using any combination of 3 simple or partial correlations. That is to say, the user supplies the correlation between each pair of variables, but each of those 3 correlations may be either simple or partial.

For example, one possibility would be that the user supplies the simple correlation between $Y$ and $X$, the simple correlation between $X$ and $Z$, and the partial correlation between $Y$ and $Z$ (controlling for $X$). The program should deduce the simple correlation matrix (i.e., convert the one partial correlation to a simple correlation) and then proceed from there.

The program should be able to handle any possible combination of inputs (as long as it ultimately specifies a valid simple correlation matrix). Up to permutations of the variables, there are 4 possible types of input, namely:

  • 3 simple correlations
  • 2 simple correlations, 1 partial correlation
  • 1 simple correlation, 2 partial correlations
  • 3 partial correlations

I have only 2 of the 4 cases worked out. In the first case, if the user just supplies the 3 simple correlations, then there is no problem to solve. In the last case, where the user supplies 3 partial correlations, I can obtain the simple correlations by reversing the procedure described HERE. But I am having a hard time working out the 2 more interesting cases. I wonder if anyone can help point me in the right direction. Thanks!

NOTE: I have cross-posted this question on the TalkStats forum, where I am an active member, HERE. Please check for answers there before duplicating another's effort.

Jake Westfall
  • 1
    I am curious about the assertion following "obviously": there are non-trivial relationships the three correlations have to satisfy, so it is quite possible for a user to supply three simple correlations that correspond to no random variables at all. It would seem that there is a problem to solve even in this case: you should at least report that the inputs are invalid! – whuber Oct 29 '14 at 23:13
  • Thanks for feedback @whuber, you are absolutely right, although I thought I covered this with my parenthetical note "(as long as it ultimately specifies a valid simple correlation matrix)." Currently the program verifies that the simple correlation matrix has a non-negative determinant. – Jake Westfall Oct 29 '14 at 23:17
  • Thanks for pointing out that parenthetical remark, Jake: I did indeed overlook it. Allow me to point out something I hope is equally obvious: up to permutations--which are simple to handle--there are only four types of input, of which you have solved two, leaving only two problems rather than six. – whuber Oct 29 '14 at 23:18
  • 1
    From the well-known formula it follows that, in the case of two simple correlations and one partial correlation, the simple correlation corresponding to the latter is easily computed. As for the case of "two partial and one simple correlations", I hesitate to say right now. – ttnphns Oct 30 '14 at 08:48
  • @whuber Yes, the question is perhaps more clear if I write the 4 types of input rather than all permutations. Will edit. – Jake Westfall Oct 30 '14 at 15:59
  • @ttnphns Thanks...I am embarrassed I overlooked the solution for that case. – Jake Westfall Oct 30 '14 at 16:00

1 Answer


I have now solved each of the cases that I described, so I thought I'd post the solutions here for posterity.

Case 1: Three simple correlations

All there is to do in this case is verify that the 3 simple correlations form a valid correlation matrix. Provided each correlation lies in $[-1, 1]$, this amounts to verifying that the correlation matrix has a non-negative determinant.
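
As a minimal sketch of that check (the function name `is_valid_corr` and the tolerance `tol` are my own illustrative choices, not part of the program described in the question), the 3x3 determinant can be written out explicitly as $1 + 2r_{xy}r_{xz}r_{yz} - r_{xy}^2 - r_{xz}^2 - r_{yz}^2$:

```python
def is_valid_corr(r_xy, r_xz, r_yz, tol=1e-12):
    """Return True if the three simple correlations form a valid 3x3 correlation matrix.

    Hypothetical helper: checks that each correlation is in [-1, 1] and that
    the determinant of the implied matrix is non-negative (up to `tol`).
    """
    # Each pairwise correlation must itself be a legal correlation.
    if any(abs(r) > 1 for r in (r_xy, r_xz, r_yz)):
        return False
    # Determinant of [[1, r_xy, r_xz], [r_xy, 1, r_yz], [r_xz, r_yz, 1]].
    det = 1 + 2 * r_xy * r_xz * r_yz - r_xy**2 - r_xz**2 - r_yz**2
    return det >= -tol


print(is_valid_corr(0.5, 0.5, 0.5))    # a consistent triple
print(is_valid_corr(0.9, 0.9, -0.9))   # an inconsistent triple
```

The small tolerance is there only so that boundary cases (determinant exactly zero) are not rejected because of floating-point rounding.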

Case 2: Two simple correlations, One partial correlation

In this case (as pointed out by @ttnphns in a comment) we can compute the one missing simple correlation by taking the well-known formula for writing a partial correlation coefficient in terms of the simple correlations, $$ r_{ab.c}=\frac{r_{ab}-r_{ac}r_{bc}}{\sqrt{1-r^2_{ac}}\sqrt{1-r^2_{bc}}}, $$ and solving it for the simple correlation term $r_{ab}$, which yields $$ r_{ab}=r_{ab.c}\sqrt{1-r^2_{ac}}\sqrt{1-r^2_{bc}}+r_{ac}r_{bc}. $$
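
A quick sketch of these two formulas, with a round-trip check that one undoes the other. The function names `partial_from_simple` and `simple_from_partial` are hypothetical, chosen just for this illustration:

```python
import math


def partial_from_simple(r_ab, r_ac, r_bc):
    """Partial correlation r_ab.c from the three simple correlations."""
    return (r_ab - r_ac * r_bc) / (math.sqrt(1 - r_ac**2) * math.sqrt(1 - r_bc**2))


def simple_from_partial(r_ab_c, r_ac, r_bc):
    """Simple correlation r_ab from the partial r_ab.c and the simple r_ac, r_bc."""
    return r_ab_c * math.sqrt(1 - r_ac**2) * math.sqrt(1 - r_bc**2) + r_ac * r_bc


# Round trip: start from simple correlations, go to the partial, and come back.
r_ab, r_ac, r_bc = 0.5, 0.3, 0.4
r_ab_c = partial_from_simple(r_ab, r_ac, r_bc)
recovered = simple_from_partial(r_ab_c, r_ac, r_bc)
print(abs(recovered - r_ab) < 1e-12)
```

Note that `partial_from_simple` divides by zero if $r_{ac}$ or $r_{bc}$ is exactly $\pm 1$; the sketch does not guard against that.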

Case 3: Three partial correlations

As explained in the link I posted in my question, to go from a simple correlation matrix to a partial correlation matrix, we invert the simple correlation matrix, divide each off-diagonal element by the square root of the product of the two corresponding diagonal elements (as if we were converting a covariance matrix to a correlation matrix), and multiply each off-diagonal by $-1$. So to reverse this process, we take the partial correlation matrix, multiply the off-diagonals by $-1$, take the matrix inverse, and then rescale each off-diagonal by the corresponding diagonals as we did before. Working through these matrix computations by hand (actually I used this Wolfram Alpha widget), we can see that this leads to the following equation for writing a simple correlation in terms of a triplet of partial correlations: $$ r_{ab}=\frac{r_{ab.c}+r_{ac.b}r_{bc.a}}{\sqrt{(1-r^2_{ac.b})(1-r^2_{bc.a})}}. $$
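
Here is a rough sketch of the matrix procedure in plain Python, checked against the closed-form equation. All of the helper names (`invert_3x3`, `to_unit_diagonal`, `negate_offdiag`, `simple_to_partial`, `partial_to_simple`) are hypothetical and only for illustration; with NumPy available, the hand-written inverse could be replaced by a library call.

```python
import math


def invert_3x3(m):
    """Invert a 3x3 matrix (list of lists) using cofactors."""
    (a, b, c), (d, e, f), (g, h, i) = m
    cof = [
        [e * i - f * h, -(d * i - f * g), d * h - e * g],
        [-(b * i - c * h), a * i - c * g, -(a * h - b * g)],
        [b * f - c * e, -(a * f - c * d), a * e - b * d],
    ]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    if det == 0:
        raise ValueError("matrix is singular")
    # The inverse is the transposed cofactor matrix divided by the determinant.
    return [[cof[col][row] / det for col in range(3)] for row in range(3)]


def to_unit_diagonal(m):
    """Rescale so the diagonal is 1 (like covariance -> correlation)."""
    return [[m[r][c] / math.sqrt(m[r][r] * m[c][c]) for c in range(3)] for r in range(3)]


def negate_offdiag(m):
    """Multiply every off-diagonal element by -1."""
    return [[m[r][c] if r == c else -m[r][c] for c in range(3)] for r in range(3)]


def simple_to_partial(R):
    """Invert, rescale to unit diagonal, then flip off-diagonal signs."""
    return negate_offdiag(to_unit_diagonal(invert_3x3(R)))


def partial_to_simple(P):
    """Reverse procedure: flip off-diagonal signs, invert, rescale."""
    return to_unit_diagonal(invert_3x3(negate_offdiag(P)))


R = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
P = simple_to_partial(R)
R_back = partial_to_simple(P)

# Closed-form r_ab from the three partials, for comparison with R_back[0][1].
p_ab, p_ac, p_bc = P[0][1], P[0][2], P[1][2]
r_ab_closed = (p_ab + p_ac * p_bc) / math.sqrt((1 - p_ac**2) * (1 - p_bc**2))
print(abs(R_back[0][1] - r_ab_closed) < 1e-12)
```

The example matrix `R` is arbitrary; any valid, non-singular correlation matrix should round-trip in the same way.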

Case 4: One simple correlation, Two partial correlations

For this case we can get the one missing partial correlation by taking the formula introduced for Case 3 and solving it for the $r_{ab.c}$ term, which yields $$ r_{ab.c}=r_{ab}\sqrt{(1-r^2_{ac.b})(1-r^2_{bc.a})}-r_{ac.b}r_{bc.a}. $$ Once we have the missing partial correlation, Case 4 reduces to Case 3, which we can solve as described just above.

Jake Westfall