I'm trying to calculate correlation using a formula in Statistics 4th Edition by Freedman:
r = average of (x in standard units) * (y in standard units)
If I try this out ...
x = 1:7
y = c(6,7,5,4,3,1,2)
x.z = scale(x)
y.z = scale(y)
prod = x.z * y.z
mean(prod)
[1] -0.7959184
However, if I use the built-in cor function, I get a different answer:
cor(x, y)
[1] -0.9285714
Looking through the worked examples in the book, the standard values for x and y seem to be rounded to the nearest 0.5, so I rounded my values and got the expected answer:
x.z.round = round(x.z/0.5)*0.5
y.z.round = round(y.z/0.5)*0.5
prod.round = x.z.round * y.z.round
mean(prod.round)
[1] -0.9285714
Why do the x and y scaled values seemingly need to be rounded to the nearest 0.5?
cor does not implement the correlation coefficient as defined in your reference textbook. It's important to consult its documentation (type ?cor) and compare its definition to the one your book is using. – whuber Dec 20 '18 at 14:32
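One way to see the mismatch the comment points at is to compute the standard units by hand. The sketch below assumes the book's SD divides by n (the population form), whereas scale() relies on sd(), which divides by n - 1. The helper pop_sd and the variable names are made up for illustration and are not part of base R.

```r
x <- 1:7
y <- c(6, 7, 5, 4, 3, 1, 2)
n <- length(x)

# Hypothetical helper: SD that divides by n (assumed to be the book's convention)
pop_sd <- function(v) sqrt(mean((v - mean(v))^2))

# Standard units based on the n-divisor SD
x_su <- (x - mean(x)) / pop_sd(x)
y_su <- (y - mean(y)) / pop_sd(y)

# Average of the products, as in the book's formula
r_book <- mean(x_su * y_su)

# scale() standardizes with sd(), which divides by n - 1,
# so the average of the products is shrunk by a factor of (n - 1) / n
r_scale_mean <- mean(scale(x) * scale(y))

# Dividing the sum of products by n - 1 instead of n undoes the shrinkage
r_scale_sum <- sum(scale(x) * scale(y)) / (n - 1)

print(c(r_book, r_scale_mean, r_scale_sum, cor(x, y)))
```

Under that assumption, r_book and r_scale_sum agree with cor(x, y), while r_scale_mean equals cor(x, y) * (n - 1) / n, which is the -0.7959184 from the question. For this particular data the n-divisor SD of both x and y is exactly 2, so the standard units are exact multiples of 0.5; rounding the scale() output to the nearest 0.5 happens to recover them, which would explain why the rounded version matched.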