We know that Pearson correlation measures the linear correlation (proportionality) between two variables, while Spearman rank correlation measures how close the relationship is to monotonic (as one variable grows, the other consistently grows or consistently declines).

Both of them would fail terribly to detect dependence if, for example, $Y = X^2$, because there is no linear correlation and the function is not monotonic.

Is there any coefficient that measures the degree of dependency? Is it implemented in Python?

mkt
  • Minor point but perhaps worth being aware that $Y = X^2$ is only uncorrelated with $X$ if $X$ is 'centred on 0' in a specific sense. Typically $X$ and $X^2$ are correlated. – Glen_b Jul 30 '23 at 05:04
  • You’ve tagged this with [tag:regression-coefficients]. What connection do you see to regression? – Dave Jul 30 '23 at 07:25
  • Linear relationship $y = a + bx$ and proportionality $y = bx$ are not synonymous. I guess you intended to imply that changes in prediction $\delta y$ are proportional to changes $\delta x$ because the gradient is constant. – Nick Cox Jul 30 '23 at 14:18

1 Answer

Mutual information captures the entire dependence structure between two variables: linear, nonlinear, monotonic, whatever. Normalized mutual information might be easier to interpret because it sits on a standardized scale, so mutual information could be thought of as analogous to covariance and normalized mutual information as analogous to Pearson correlation (or squared Pearson correlation). There appears to be an implementation of each of these in the Python package sklearn: `mutual_info_score` and `normalized_mutual_info_score` in `sklearn.metrics` work on discrete labels, and `mutual_info_regression` in `sklearn.feature_selection` estimates mutual information for continuous variables.
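To make this concrete for the $Y = X^2$ case from the question, here is a minimal sketch rather than a definitive recipe. It assumes $X$ is drawn uniformly from $[-1, 1]$ (so it is symmetric about 0), and it uses `mutual_info_regression`, which is a nearest-neighbour estimate and therefore only approximates the true mutual information. The sample size, seed, and variable names are my own choices for illustration.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)

# Assumption for illustration: X is symmetric about 0, so X and X^2 are uncorrelated.
x = rng.uniform(-1.0, 1.0, size=2000)
y = x ** 2
noise = rng.uniform(-1.0, 1.0, size=2000)  # drawn independently of x, for comparison

# Both classical coefficients should be close to zero here.
pearson_r = pearsonr(x, y)[0]
spearman_r = spearmanr(x, y)[0]

# mutual_info_regression expects a 2-D feature matrix and returns one
# estimate (in nats) per column.
mi_dependent = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)[0]
mi_independent = mutual_info_regression(x.reshape(-1, 1), noise, random_state=0)[0]

print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
print(f"MI(X, X^2):   {mi_dependent:.3f} nats")
print(f"MI(X, noise): {mi_independent:.3f} nats")
```

The estimate for the pair $(X, X^2)$ should come out clearly positive, while the estimate for the independent pair should sit near zero. The raw value is in nats and has no upper bound, which is one reason a normalized version can be easier to read.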

An issue to keep in mind is that two variables can have zero mutual information (zero dependence) yet have considerable, perhaps even perfect, conditional mutual information once you condition on a third variable. This is analogous to how two variables can be uncorrelated yet have considerable correlation conditioned on a third variable. In the example I give here, $X_1$ and $X_2$ are uncorrelated, but there is very high correlation once you condition on color.
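A small discrete illustration of the same effect, which is not the color example referred to above but a hypothetical XOR construction of my own: $X_1$ and $X_2$ are independent fair bits, and the third variable is $Z = X_1 \oplus X_2$. The conditional mutual information is computed by hand as a weighted average over the values of $Z$, using `mutual_info_score` from `sklearn.metrics` within each group.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

# All four (x1, x2) combinations occur equally often, so x1 and x2 are
# exactly independent in this sample.
x1 = np.tile([0, 0, 1, 1], 250)
x2 = np.tile([0, 1, 0, 1], 250)
z = x1 ^ x2  # hypothetical third variable: XOR of the two bits

# Unconditional mutual information (in nats).
mi_marginal = mutual_info_score(x1, x2)

# I(X1; X2 | Z) = sum over z of P(Z = z) * I(X1; X2 | Z = z)
mi_conditional = sum(
    np.mean(z == value) * mutual_info_score(x1[z == value], x2[z == value])
    for value in (0, 1)
)

print(f"I(X1; X2)     = {mi_marginal:.3f} nats")
print(f"I(X1; X2 | Z) = {mi_conditional:.3f} nats")
```

Here the unconditional mutual information is zero, while the conditional value equals $\ln 2 \approx 0.693$ nats, the maximum possible for a fair bit: once $Z$ is known, $X_1$ determines $X_2$ completely.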

Dave
  • Stronger claims have been made for Chatterjee's correlation. Google xicor so long as you also specify Chatterjee. – Nick Cox Jul 30 '23 at 14:20
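For readers who want to try the coefficient mentioned in the comment above without hunting for a package, the no-ties version of Chatterjee's $\xi_n$ is short enough to sketch directly. Sort the pairs by $x$, let $r_i$ be the rank of the $i$-th $y$ in that order, and compute $\xi_n = 1 - 3\sum_{i=1}^{n-1} |r_{i+1} - r_i| / (n^2 - 1)$. The function name `chatterjee_xi` is hypothetical, and the sketch assumes continuous data with no ties in either variable.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's xi coefficient, assuming no ties in x or y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Reorder y by increasing x.
    order = np.argsort(x, kind="stable")
    # Ranks (1..n) of the reordered y values.
    ranks = np.argsort(np.argsort(y[order])) + 1
    # Sum of absolute differences between consecutive ranks.
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n ** 2 - 1)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)

xi_quadratic = chatterjee_xi(x, x ** 2)
xi_independent = chatterjee_xi(x, rng.uniform(-1.0, 1.0, size=2000))

print(f"xi(X, X^2):   {xi_quadratic:.3f}")
print(f"xi(X, noise): {xi_independent:.3f}")
```

Note that the coefficient is not symmetric: `chatterjee_xi(x, y)` measures how well $y$ is determined by $x$, so swapping the arguments for $Y = X^2$ gives a different, smaller value.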