2

I am using R and I want to scale some data. The code looks like this:

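data <- read.table(file_name, header = TRUE)
rates <- scale(data[8])    # standardize column 8; scale() returns a one-column matrix
rates_mean <- mean(rates)  # renamed so the result does not mask base::mean
rates_sd <- sd(rates)      # renamed so the result does not mask stats::sd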

My understanding is that this scale function should scale the data so the mean is 0 and the standard deviation is 1. The standard deviation seems correct but the mean is not 0. What causes this? And what is the solution to making the mean 0? Or am I interpreting something wrong?
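
For readers without the original file, the situation can be sketched with simulated data. This is only an illustration: the column name `rate`, the seed, and the distribution parameters are assumptions and do not come from the original data set.

```r
# Simulated stand-in for data[8]; the column name `rate` and its values are hypothetical.
set.seed(42)
data <- data.frame(rate = rnorm(200, mean = 5, sd = 2))

# scale() centers the column and divides by its standard deviation,
# returning a one-column matrix.
rates <- scale(data["rate"])

rates_mean <- mean(rates)  # typically a tiny number near machine precision, not necessarily exactly 0
rates_sd   <- sd(rates)    # 1 up to rounding error

print(rates_mean)
print(rates_sd)
```

The printed mean is usually a very small number in scientific notation rather than a literal 0, which is the behaviour described above.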

  • 7
    Floating point arithmetic isn't exact. https://stats.stackexchange.com/a/525766/22311 The print out says that the mean is less than $2 \times 10^{-16}$ units away from zero. How much closer to zero do you need it to be? – Sycorax Jan 26 '23 at 20:56
  • What is the solution for this? It doesn't make sense to me that everywhere it says that scale makes the mean 0, yet this function does not do that. – slipperypete Jan 26 '23 at 21:03
  • 7
    Floating point arithmetic isn't exact, and that inexactness can't be fixed. The statement "scale makes the mean zero" means "scale makes the mean a value that is close to 0, relative to the machine precision of floating point representation." You can learn about floating point arithmetic from its technical standard, IEEE 754. – Sycorax Jan 26 '23 at 21:05
  • 7
    Have you ever computed $1/3$ on a calculator and then multiplied by $3$ to obtain $0.99999999$? According to the laws of arithmetic, the difference between this answer and the original value of $1$ is zero. The calculator thereby "proves" that $0 = 1 - 0.99999999 = 10^{-8}.$ The computer is just a big calculator and is subject to the same breakdown of mathematical laws. It is important to understand such things so you can use a computer wisely and well. – whuber Jan 26 '23 at 21:11
  • You should never be reporting your results straight from the R terminal anyway. If you were producing values to include in a paper or table, you'd need to use formatC (see ?formatC) to specify such things as significant figures, rounding rules, etc. – AdamO Jan 26 '23 at 21:14
  • 2
    What do you get if you use round(mean(x), 10) instead? – utobi Jan 26 '23 at 21:21
  • 5
    1. See What every programmer should know about floating-point arithmetic: https://floating-point-gui.de/ 2. For R specifically, see the Note section in ?Comparison. Also see the references in ?Arithmetic. – Glen_b Jan 26 '23 at 21:22
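
The points raised in the comments can be sketched in a few lines of R. This is a minimal illustration, not a definitive recipe: the simulated vector `x`, the seed, and the choice of 10 and 4 digits are assumptions made for the example. It shows that exact equality is fragile for floating point values, that `all.equal` compares within a tolerance (as the Note in ?Comparison recommends), and that `round` or `formatC` can be used when reporting.

```r
# Hypothetical data standing in for the scaled column.
set.seed(1)
x <- rnorm(100, mean = 10, sd = 3)
z <- as.numeric(scale(x))
m <- mean(z)

# Exact comparison is fragile: this classic case is FALSE in double precision.
exact_equal <- (0.1 + 0.2 == 0.3)

# Tolerance-based comparison treats values within about 1.5e-8 as equal.
approx_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))

# The same idea applied to the mean of the scaled data.
mean_is_zero <- isTRUE(all.equal(m, 0))
mean_is_tiny <- abs(m) < sqrt(.Machine$double.eps)

# For reporting, round or format the value instead of printing it raw.
m_rounded   <- round(m, 10)
m_formatted <- formatC(m, format = "f", digits = 4)

print(c(exact_equal = exact_equal, approx_equal = approx_equal,
        mean_is_zero = mean_is_zero, mean_is_tiny = mean_is_tiny))
print(m_rounded)
print(m_formatted)
```

Note that the formatted string may carry a leading minus sign when the tiny mean happens to be negative; rounding first and adding 0 is one way to avoid that if it matters for a table.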