Canonical correlation analysis (CCA) is used as a measure of dependence between random vectors $\mathbf{X}$ and $\mathbf{Y}$: it finds vectors $\mathbf{a}$ and $\mathbf{b}$ such that $\operatorname{corr}(\mathbf{a}^T \mathbf{X}, \mathbf{b}^T \mathbf{Y})$ is maximized.
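To make the maximization concrete, here is a minimal sketch of how the sample canonical correlations could be computed with NumPy. It assumes both data matrices have full column rank and uses the QR-plus-SVD route; the function name `canonical_correlations` and the simulated data are illustrative assumptions, not something taken from the references.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between X (n x p) and Y (n x q).

    Assumes X and Y have full column rank after centering.
    """
    Xc = X - X.mean(axis=0)          # center each column
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)         # orthonormal basis for span of X
    Qy, _ = np.linalg.qr(Yc)         # orthonormal basis for span of Y
    # Singular values of Qx^T Qy are the cosines of the principal angles,
    # i.e. the canonical correlations, in decreasing order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)      # guard against rounding slightly above 1

# Hypothetical data: Y depends linearly on X plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
Y = X @ np.array([[1.0, 0.5], [0.0, 1.0]]) + rng.normal(size=(500, 2))

rho = canonical_correlations(X, Y)
print(rho)  # two canonical correlations, largest first
```

The first value is the maximized $\operatorname{corr}(\mathbf{a}^T \mathbf{X}, \mathbf{b}^T \mathbf{Y})$; it is never smaller in absolute value than any pairwise sample correlation between a component of $\mathbf{X}$ and a component of $\mathbf{Y}$.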

Another measure of multivariate dependence is multivariate concordance, an extension of bivariate concordance which computes the probability that $\mathbf{X}$ and $\mathbf{Y}$ both increase and/or decrease simultaneously.
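As a rough illustration of the idea, the sketch below estimates one possible version of that probability: the fraction of observation pairs in which every component of $\mathbf{X}$ and $\mathbf{Y}$ moves in the same direction. This particular definition and the name `concordance_probability` are assumptions made for illustration; the estimators in the references listed below may be defined and normalized differently.

```python
import numpy as np

def concordance_probability(X, Y):
    """Fraction of observation pairs that are concordant in every coordinate.

    X is (n x p), Y is (n x q). A pair (i, j) counts as concordant when all
    p + q component differences have the same strict sign. This is an
    illustrative definition, not necessarily the one used in the literature.
    """
    Z = np.hstack([X, Y])                    # stack all components side by side
    i, j = np.triu_indices(Z.shape[0], k=1)  # all unordered pairs i < j
    diff = Z[i] - Z[j]
    all_up = np.all(diff > 0, axis=1)        # every component increases
    all_down = np.all(diff < 0, axis=1)      # every component decreases
    return float(np.mean(all_up | all_down))

# Hypothetical example: Y is a monotone increasing function of X.
t = np.arange(10.0)
p_same = concordance_probability(t[:, None], 2.0 * t[:, None] + 1.0)
p_opposite = concordance_probability(t[:, None], -t[:, None])
print(p_same, p_opposite)
```

Unlike CCA, nothing is optimized here: the statistic depends only on the ordering of the observations, so it is unchanged by strictly increasing transformations of the individual components.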

My question concerns the validity of using CCA to measure the dependence between $\mathbf{X}$ and $\mathbf{Y}$. In CCA, we are modifying the data to maximize the correlation, whereas in multivariate concordance, we are not modifying the data but rather measuring how the two vectors progress together. Given that we are modifying the data in CCA, is it still a valid approach to measuring dependence between random vectors? Are there references/examples where CCA is valid and where it is not?

Some References for Multivariate Concordance:

  1. The most referenced paper on Multivariate Concordance, by Harry Joe, but unfortunately behind a paywall
  2. Paper 2 -- Open Access -- provides theory and estimators
  3. Paper 3 -- Open Access -- provides theory
  4. Paper 4 -- Open Access -- provides theory
Kiran K.
  • "In CCA, we are modifying the data to maximize the correlation." We do not "distort" the data in CCA in any way. Because the method is a linear decomposition, the correlation is multidimensional. With 2 sets of 2 variables each, 2 canonical correlations are produced; the 1st is maximized, and the 2nd is correspondingly smaller. – ttnphns Sep 30 '16 at 23:20
  • ...It is true that a canonical correlation is between (linear) constructs and not between the manifest variables as they are, but from the standpoint of the multivariate paradigm, correlated variables can be replaced by informationally equivalent ones. Consider, for example, PCA (PCA, regression and CCA are compared here). It is a rotation of the data cloud in space: nothing is lost about the multivariate shape of the cloud when we replace the original variables by the components, provided that we use all the principal components extracted. – ttnphns Sep 30 '16 at 23:20
  • ...Nor do we add anything: the components do not leave the variables' space. The same holds in general for CCA too. – ttnphns Sep 30 '16 at 23:20
  • BTW, it would be great if you added links to sources or papers about the multivariate concordance measure. – ttnphns Sep 30 '16 at 23:25
  • @ttnphns I agree, "distort" was a bad word to use in this case, but is it the case that $\mathbf{a}^T \mathbf{X}$ and $\mathbf{b}^T \mathbf{Y}$ are "informationally equivalent"? Intuitively, they don't seem informationally equivalent if we are maximizing something, but I would love further thoughts on this concept. Stated another way, if $\mathbf{a}$ and $\mathbf{b}$ maximize the canonical correlation, then what theorem and/or concept says that $\mathbf{a}^T \mathbf{X}$ and $\mathbf{b}^T \mathbf{Y}$ are informationally equivalent, versus $\mathbf{a'}^T \mathbf{X}$ and $\mathbf{b'}^T \mathbf{Y}$? – Kiran K. Oct 01 '16 at 12:34
  • Joe's paper is not behind a paywall; I can read it from the link you sent without issues. – rep_ho Oct 01 '16 at 13:18
