If I have two variables X and Y, and I already have their correlations within subsets of the data that are mutually exclusive and exhaustive, can I compute the overall correlation directly from these?

Intuitively, it seems we should be able to just take the weighted average of the subset correlations, but I’m not sure this is sound. I suspect we would need extra assumptions, for example that the means and standard deviations are the same across the subsets?

  • The answer is a definite no, even making those extra assumptions: you can find discussions under the heading of "Simpson's Paradox," among other things. – whuber Dec 28 '20 at 20:32
  • The thread at https://stats.stackexchange.com/a/51927/919 shows what it takes to combine two covariances. To combine correlations, you have to convert the correlations into covariances, combine the covariances, and compute the resulting correlation. There is no algebraic simplification available. – whuber Mar 27 '21 at 15:53
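The procedure in the last comment (correlations to covariances, combine the covariances, convert back to a correlation) can be sketched roughly as below. This is only an illustration, not code from the linked thread: the function name `pooled_correlation` and its arguments are made up here, and it assumes you also know each subset's size, means, and sample standard deviations (ddof=1), since the correlations alone are not enough.

```python
import numpy as np

def pooled_correlation(n, mean_x, mean_y, sd_x, sd_y, r):
    """Overall Pearson correlation from per-subset summaries.

    Hypothetical helper: every argument is a sequence with one entry per
    subset; sd_x and sd_y are sample standard deviations (ddof=1), and
    each subset needs at least two observations.
    """
    n, mean_x, mean_y, sd_x, sd_y, r = (
        np.asarray(a, dtype=float) for a in (n, mean_x, mean_y, sd_x, sd_y, r)
    )
    total = n.sum()
    grand_x = (n * mean_x).sum() / total
    grand_y = (n * mean_y).sum() / total

    # Step 1: correlation -> covariance within each subset.
    cov = r * sd_x * sd_y

    # Step 2: combine sums of squares and cross-products. Each total is a
    # within-subset part plus a between-subset part (subset means vs. grand mean).
    sxy = ((n - 1) * cov + n * (mean_x - grand_x) * (mean_y - grand_y)).sum()
    sxx = ((n - 1) * sd_x**2 + n * (mean_x - grand_x) ** 2).sum()
    syy = ((n - 1) * sd_y**2 + n * (mean_y - grand_y) ** 2).sum()

    # Step 3: back to a correlation.
    return sxy / np.sqrt(sxx * syy)

# Simulated data: two subsets with the same negative slope, but the second
# subset is shifted upward in both X and Y (a Simpson's-paradox-style setup).
rng = np.random.default_rng(0)
x1 = rng.normal(0, 1, 50)
y1 = -0.5 * x1 + rng.normal(0, 1, 50)
x2 = rng.normal(5, 1, 80)
y2 = 5 - 0.5 * (x2 - 5) + rng.normal(0, 1, 80)

groups = [(x1, y1), (x2, y2)]
combined = pooled_correlation(
    n=[len(x) for x, _ in groups],
    mean_x=[x.mean() for x, _ in groups],
    mean_y=[y.mean() for _, y in groups],
    sd_x=[x.std(ddof=1) for x, _ in groups],
    sd_y=[y.std(ddof=1) for _, y in groups],
    r=[np.corrcoef(x, y)[0, 1] for x, y in groups],
)
direct = np.corrcoef(np.concatenate([x1, x2]), np.concatenate([y1, y2]))[0, 1]
print(combined, direct)  # the two values should agree up to rounding error
```

The between-subset terms in step 2 are what a weighted average of the correlations leaves out. They vanish only when every subset has the same means, and even then the result is a weighted average of covariances, not of correlations, unless the standard deviations also match.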
