The book "Practical Statistics for Data Scientists" uses the second formula (image below) to calculate the correlation coefficient, whereas most of the blogs and YouTube tutorials I found while researching use only Pearson's formula.
So I'm asking: is there any difference between them, or are there cases where one should be preferred over the other?
Pearson's formula:

the_yaz2000
-
They are the same, assuming that the second uses $s_x^2=\frac1{n-1}\sum (x_i-\bar x)^2$ and similarly for $s_y^2$ – Henry Jul 08 '22 at 22:01
-
See https://stats.stackexchange.com/questions/70969, which gives 15 formulas or procedures to find $r.$ You are comparing the first two formulas (modulo the comment by @Henry). The first formula is used only in textbooks because it is subject to (much) more floating point cancellation error than the second one. The first one isn't terribly insightful, either: it's just a jumble of arithmetic. The second begins to reveal some of the underlying concepts by exposing $r$ as a mean product of standardized residuals. – whuber Jul 08 '22 at 22:13
-
Thanks for the answers. I appreciate it. – the_yaz2000 Jul 08 '22 at 22:16
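
To make the two comments concrete, here is a minimal Python sketch. The images from the question are not available here, so both forms are assumptions: the "first" formula is taken to be the raw-sums version, $r=\frac{n\sum x_iy_i-\sum x_i\sum y_i}{\sqrt{(n\sum x_i^2-(\sum x_i)^2)(n\sum y_i^2-(\sum y_i)^2)}}$, and the "second" to be $r=\frac{\sum (x_i-\bar x)(y_i-\bar y)}{(n-1)s_xs_y}$ with the $n-1$ definition of $s_x$ and $s_y$ from the first comment. The function names `r_sums` and `r_standardized` and the sample data are made up for illustration.

```python
import math


def r_sums(x, y):
    """Raw-sums form (assumed to be the 'first' formula in the question)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den_sq = (n * sxx - sx * sx) * (n * syy - sy * sy)
    # Cancellation can push den_sq to zero or below; return NaN instead of raising.
    return num / math.sqrt(den_sq) if den_sq > 0 else float("nan")


def r_standardized(x, y):
    """Centered form with sample standard deviations (assumed 'second' formula)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Sample standard deviations with the n - 1 denominator.
    s_x = math.sqrt(sum((a - mx) ** 2 for a in x) / (n - 1))
    s_y = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / ((n - 1) * s_x * s_y)


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# On well-scaled data the two forms agree to rounding error.
print(r_sums(x, y), r_standardized(x, y))

# Adding a large constant leaves the true r unchanged, but the raw-sums
# form now subtracts two nearly equal huge numbers.
x_big = [v + 1e9 for v in x]
y_big = [v + 1e9 for v in y]
print(r_sums(x_big, y_big), r_standardized(x_big, y_big))
```

On the small data both functions should print the same value, which matches the first comment. On the shifted data the centered form is unaffected, while the raw-sums form can return something far from the true value (or NaN), which is the floating point cancellation the second comment describes.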
