
Consider the measurements $x_1, x_2, x_3$.

Each is the average value of a sample $x_{s1}$ of the physical quantity $x$, measured at nodal points $i = 1, \ldots, N_1$.

$x_1 = \frac{1}{N_1} \sum_{i=1}^{N_1}x_{s1}[i]$

Let the "error" in $x_1$ be the standard deviation $\sigma_1$ of the quantity $x_1$.

$x_1 \pm \sigma_1 \quad \sigma_1^2 = \frac{1}{N_1-1}\sum_{i=1}^{N_1}(x_1 -x_{s1}[i])^2$
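As a minimal sketch of these two formulas (the sample values here are made up for illustration):

```python
import math

# Hypothetical sample x_s1 of N_1 = 5 repeated measurements of x at one nodal point
x_s1 = [2.1, 1.9, 2.0, 2.2, 1.8]
N1 = len(x_s1)

# Sample mean: x_1 = (1/N_1) * sum of x_s1[i]
x1 = sum(x_s1) / N1

# Sample standard deviation with the N_1 - 1 (Bessel) denominator
sigma1 = math.sqrt(sum((x1 - v) ** 2 for v in x_s1) / (N1 - 1))

print(f"x1 = {x1:.3f} +/- {sigma1:.3f}")
```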

Let us now compute the average of the measurements:

$x_{avg} = (x_1 + x_2 + x_3)/3$

However, what is the error $\sigma_{avg}$ such that reporting $x_{avg} \pm \sigma_{avg}$ makes sense?

I know that when adding or subtracting numbers, the associated errors add in quadrature:

$\delta x_{avg}^2 = \delta x_1^2 +\delta x_2^2 +\delta x_3^2$

However, when $N$ becomes large (i.e., many more measurements than 3), the number $\delta x_{avg}$ grows so large that it is no longer a reasonable estimate of the error in the quantity $x_{avg}$.

Glen_b
gsandhu
  • By confusing sums and means, you have used excellent reasoning to derive the wrong conclusion: although "$\delta x_\text{avg}$" indeed grows large, its square root compared to the sum of the $x_i$ grows small. – whuber Jul 07 '16 at 21:03
  • I'm not sure what you mean, but I mean that for the term: $x_{avg} \pm \delta x_{avg}$, as N gets bigger, $x_{avg}$ should ideally remain the same, but $\delta x_{avg}$ will get only bigger and bigger. – gsandhu Jul 07 '16 at 21:30
  • But that's not correct: Your formula, as you state in the question, is for "adding/subtracting" numbers, not for averaging them! It's the right formula, but by confusing the sum with the average you draw the wrong conclusion. – whuber Jul 07 '16 at 21:55

1 Answer


Let $X_{avg}=\frac{\sum_{i=1}^N X_i}{N}$. To find the variance, we have:

\begin{eqnarray*} Var[X_{avg}] &=& Var\left[\frac{\sum_{i=1}^NX_i}{N}\right] \\ &=& \frac{1}{N^2}Var\left[\sum_{i=1}^NX_i\right] \\ &=& \frac{1}{N^2}\left[\sum_{i=1}^N\sum_{j=1}^NCov[X_i,X_j]\right] \\ &=& \frac{1}{N^2}\left[\sum_{i=1}^NVar[X_i]+ 2\sum_{i=1}^N\sum_{j=i+1}^NCov[X_i,X_j]\right] \end{eqnarray*}

Note that the last line follows because $Cov[X_i,X_i]=Var[X_i]$ and the covariance is symmetric, $Cov[X_i,X_j]=Cov[X_j,X_i]$, so each off-diagonal pair is counted twice. Since the standard deviation is simply the square root of the variance, we have:

\begin{eqnarray*} SD[X_{avg}] &=& \sqrt{Var[X_{avg}]} \\ &=& \sqrt{\frac{1}{N^2}\left[\sum_{i=1}^NVar[X_i]+ 2\sum_{i=1}^N\sum_{j=i+1}^NCov[X_i,X_j]\right]} \end{eqnarray*}

Note $SD[X_{avg}]$ is the standard deviation of $X_{avg}$.

If the $X_i$ are independently distributed, then $Cov[X_i,X_j]=0$ for all $i\neq j$ and all that is left are the variance terms. That is,

\begin{eqnarray*} \frac{1}{N^2}\left[\sum_{i=1}^NVar[X_i]+ 2\sum_{i=1}^N\sum_{j=i+1}^NCov[X_i,X_j]\right] &=& \frac{1}{N^2}\left[\sum_{i=1}^NVar[X_i]\right] \\ \Rightarrow SD[X_{avg}] &=& \sqrt{Var[X_{avg}]} \\ &=& \sqrt{\frac{1}{N^2}\left[\sum_{i=1}^NVar[X_i]\right]} \\ \end{eqnarray*}
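A small numerical sketch of the independent case, with assumed per-measurement standard deviations, makes the point in the question concrete: the quadrature sum of the errors is the error of the *sum*, and dividing the summed variance by $N^2$ gives the (smaller) error of the *average*:

```python
import math

# Assumed standard deviations sigma_i of three independent measurements
sigmas = [0.15, 0.20, 0.10]
N = len(sigmas)

# Error of the SUM x_1 + x_2 + x_3: quadrature sum of the sigmas
sd_sum = math.sqrt(sum(s ** 2 for s in sigmas))

# Error of the AVERAGE: summed variance divided by N^2, then square root
sd_avg = math.sqrt(sum(s ** 2 for s in sigmas) / N ** 2)

print(sd_sum, sd_avg)  # sd_avg equals sd_sum / N, so it does not blow up with N
```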

If you are interested in creating a confidence interval, I suggest using the form $x_{avg}\pm t^*_{df}SD[x_{avg}]$, where $t^*_{df}$ is the critical $t$-score associated with your preferred level of confidence and with $df$ degrees of freedom. $df$ should be $N-1$, where you have $N$ observations. $t^*_{df}$ should be easily calculated with most statistical programs or online references.
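As a sketch of that interval, using SciPy's `t.ppf` for the critical $t$-score (the values of $x_{avg}$ and $SD[x_{avg}]$ here are assumed for illustration):

```python
from scipy import stats

# Assumed values for illustration: average of N = 3 measurements and its SD
x_avg = 2.0
sd_avg = 0.09
N = 3
df = N - 1  # degrees of freedom

# Critical t-score for a two-sided 95% confidence interval
t_star = stats.t.ppf(0.975, df)

lo, hi = x_avg - t_star * sd_avg, x_avg + t_star * sd_avg
print(f"95% CI: {lo:.3f} to {hi:.3f}")
```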

Matt Brems
  • So, basically the error would be the variance one would get from simple addition/subtraction error propagation considerations but divided by the total number of measurements. – gsandhu Jul 07 '16 at 21:25
  • @gsandhu I'm not quite sure what you mean by "simple addition/subtraction error propagation considerations." – Matt Brems Jul 08 '16 at 03:04