
I was reading the following book

Han J, Pei J, Kamber M. Data mining: concepts and techniques. Elsevier; 2011 Jun 9. (Third Edition)

On page 96, the first line of the last paragraph says:

If the resulting value is equal to $0$, then $A$ and $B$ are independent and there is no correlation between them.

where the resulting value above corresponds to the following formula (correlation coefficient)

$$ r_{A,B}=\frac{\sum_{i=1}^n (a_i - \overline{A}) (b_i - \overline{B})}{n\sigma_A\sigma_B}. \tag{3.3} $$

However, the last paragraph of the next page says:

If $A$ and $B$ are independent (i.e., they do not have correlation), then ... $Cov(A,B) = \ldots = 0$.

Up to here, everything looks fine. However, the correlation and the covariance are related by

$$ r_{A,B} = \frac{Cov(A,B)}{\sigma_A\sigma_B} \tag{3.5} $$

and, as far as I remember, zero covariance between two random variables does not necessarily mean that they are independent. Yet the book says that if $r_{A,B} = 0$, then $A$ and $B$ are independent. Am I right that the book is wrong, or is something else going on here?

2 Answers


Zero correlation does not imply independence. Either:

  1. There is a typo/mistake and the book is wrong or
  2. The book made additional assumptions earlier, for example that the joint distribution of A and B is bivariate normal. Under such additional conditions, zero correlation does imply independence.
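
To see why the bivariate normal case in item 2 is special, here is a short sketch of the standard argument. The symbols $\mu_A$, $\mu_B$ (population means) and $\rho$ (population correlation) are notation assumed here for illustration, not taken from the book. The bivariate normal density is

$$ f(a,b) = \frac{1}{2\pi\sigma_A\sigma_B\sqrt{1-\rho^2}} \exp\left( -\frac{1}{2(1-\rho^2)} \left[ \frac{(a-\mu_A)^2}{\sigma_A^2} - \frac{2\rho(a-\mu_A)(b-\mu_B)}{\sigma_A\sigma_B} + \frac{(b-\mu_B)^2}{\sigma_B^2} \right] \right). $$

Setting $\rho = 0$ removes the cross term, and the density factors into the product of the two marginal densities:

$$ f(a,b) = \frac{1}{\sqrt{2\pi}\,\sigma_A} \exp\left( -\frac{(a-\mu_A)^2}{2\sigma_A^2} \right) \cdot \frac{1}{\sqrt{2\pi}\,\sigma_B} \exp\left( -\frac{(b-\mu_B)^2}{2\sigma_B^2} \right) = f_A(a)\, f_B(b), $$

which is exactly the definition of independence. Without joint normality, this factorization does not follow from $\rho = 0$.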
Matthew Gunn

Your book is wrong. Zero correlation is not a sufficient condition for independence: variables that are not independent can still have a Pearson correlation of zero.

Independent variables do have both covariance and correlation equal to zero, provided their variances are non-zero. So there is no contradiction here; only the converse fails.
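
A small counterexample may make this concrete. The sketch below uses a hypothetical pair chosen for illustration (not from the book): $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$. It uses exact fractions, so the covariance is exactly zero rather than approximately zero, while the joint probability clearly does not factor.

```python
from fractions import Fraction

# Hypothetical example: X is uniform on {-1, 0, 1} and Y = X**2.
xs = [Fraction(-1), Fraction(0), Fraction(1)]
p = Fraction(1, 3)  # probability of each value of X


def expect(f):
    """Exact expectation of f(X) under the uniform distribution on xs."""
    return sum(p * f(x) for x in xs)


mean_x = expect(lambda x: x)        # E[X]
mean_y = expect(lambda x: x**2)     # E[Y], since Y = X**2

# Covariance E[(X - E[X]) (Y - E[Y])]; the correlation is this value
# divided by the (non-zero) standard deviations, so it is zero too.
cov_xy = expect(lambda x: (x - mean_x) * (x**2 - mean_y))

# Independence would require P(X=0, Y=0) == P(X=0) * P(Y=0).
p_x0 = sum(p for x in xs if x == 0)
p_y0 = sum(p for x in xs if x**2 == 0)
p_x0_y0 = sum(p for x in xs if x == 0 and x**2 == 0)

print("Cov(X, Y)      =", cov_xy)
print("P(X=0, Y=0)    =", p_x0_y0)
print("P(X=0) P(Y=0)  =", p_x0 * p_y0)
```

Here $Y$ is completely determined by $X$, which is as far from independent as possible, yet the covariance (and hence the correlation) is zero because the dependence is not linear.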

Aksakal