
I am trying to verify the equality $$\rm MSE(\langle I\rangle)=Var(\langle I \rangle)+Bias(\langle I \rangle)^2$$ but obviously I got something wrong, as the two sides are not equal in my calculation.

So here is the example. I use Monte Carlo integration to estimate this integral:

$$I = \int_0^1 5x^4~\mathrm dx$$

The value of this integral is 1. Assuming the samples are drawn from a uniform distribution on $[0,1]$, my estimator is:

$$\langle I\rangle = \frac1N\sum _{i=1}^N 5x_i^4$$

And the variance of the estimator can be analytically computed as:$$ \textrm{Var}(\langle I\rangle )= \frac1N\int _0^1(5x^4-1) ^2~\mathrm dx=\frac{16}{9N}$$

So Var here is the variance of the estimator, is that right? Up to each iteration, I calculate it as:

$$ \textrm{Var}(\langle I\rangle )= (\mathrm E[\langle I\rangle ^2] - \mathrm E[\langle I\rangle]^2) $$

and in the code:

float Ie = sum / (i + 1); //estimator
float avg2 = sum2 / (i + 1);
float var = avg2 - (Ie * Ie);
var /= i + 1;

and the bias is simply:

float bias2 = (trueValue - Ie);
bias2 *= bias2;

Here is the full code:

#include <cstdio>
#include <cstdlib>

const int nsamp = 100;

// uniform random number in [0, 1)
float rand1() { return (float)rand() / (float)RAND_MAX; }

int main()
{
    float trueValue = 1;
    float data[nsamp];
    float sum = 0;
    float sum2 = 0;
    float mse = 0;
    float sqErr = 0;

    for (int i = 0; i < nsamp; i++)
    {
        float x = rand1();
        data[i] = 5 * x*x*x*x;
        sum += data[i];
        sum2 += data[i] * data[i];

        float Ie = sum / (i + 1); // estimator
        float avg2 = sum2 / (i + 1);
        float var = avg2 - (Ie * Ie);
        var /= i + 1;

        float bias2 = (trueValue - Ie);
        bias2 *= bias2;

        sqErr += (trueValue - Ie) * (trueValue - Ie);
        mse = sqErr / (i + 1);

        printf("\nI=%f Var=%f Bias2=%f MSE=%f", Ie, var, bias2, mse);
    }
}

And the output, where MSE doesn't equal Var + Bias2:

I=0.000000 Var=0.000000 Bias2=1.000000 MSE=1.000000
I=0.252220 Var=0.031807 Bias2=0.559176 MSE=0.779588
I=0.170473 Var=0.018592 Bias2=0.688114 MSE=0.749097
I=0.662600 Var=0.192099 Bias2=0.113839 MSE=0.590282
I=0.647206 Var=0.123133 Bias2=0.124464 MSE=0.497118
I=0.583528 Var=0.088888 Bias2=0.173449 MSE=0.443174
I=0.510921 Var=0.069824 Bias2=0.239198 MSE=0.414034
.
.
.
I=0.984586 Var=0.001662 Bias2=0.000238 MSE=0.005417
I=0.983600 Var=0.001659 Bias2=0.000269 MSE=0.005411
I=0.982616 Var=0.001657 Bias2=0.000302 MSE=0.005406
I=0.985189 Var=0.001660 Bias2=0.000219 MSE=0.005401
I=0.984248 Var=0.001658 Bias2=0.000248 MSE=0.005396
I=0.983362 Var=0.001655 Bias2=0.000277 MSE=0.005391
  • How did you pull out a $\frac{1}{N}$ out of $(E[I])^2 = 1^2 = 1$? – lightxbulb Aug 27 '22 at 15:28
  • For the variance term? The variance of an estimator is the variance of the function divided by N, is that right? Here is a snapshot from "Advanced Global Illumination", page 56 https://ibb.co/Gt79yC3 – ali Aug 27 '22 at 16:00
  • There was a mistake in my variance formula above, so I replaced $E[I]^2$ with $E[\langle I\rangle]^2$. However, the code was correct even then and didn't use 1 to calculate the mean, so the results are unchanged and still not as expected :/ – ali Aug 27 '22 at 16:11
  • $Var[\langle I \rangle] =Var\left[\frac{1}{N}\sum_{k=1}^N 5x_k^4\right] = \frac{1}{N}Var[5x^4]$ is true provided the correlation is zero, yes. But this is not what you wrote; you wrote $Var[\langle I\rangle] = \frac{1}{N}(E[\langle I\rangle^2] - E[\langle I \rangle]^2)$, which is not true. You have to drop the $\frac{1}{N}$ in the latter. – lightxbulb Aug 27 '22 at 17:15
  • aaah yes! that is right I will change it. But still it doesn't change the results I get as the code was using the formula you posted. – ali Aug 27 '22 at 17:27
  • So how about the code? what part of the code is wrong? is it the variance calculation? – ali Aug 27 '22 at 17:38
  • See my edits. The whole idea behind your code is wrong. – lightxbulb Aug 27 '22 at 19:12

1 Answer


Here's a proof of the equality; it doesn't require choosing a specific function or any numerical computation:

\begin{align} E[(I_N - I)^2] &= E[(I_N-E[I_N]+E[I_N]-I)^2] \\ &= E[(I_N-E[I_N])^2 + (E[I_N]-I)^2 + 2(I_N-E[I_N])(E[I_N]-I)] \\ &= E[(I_N-E[I_N])^2] + E[(E[I_N]-I)^2] + E[2(I_N-E[I_N])(E[I_N]-I)] \\ &= Var[I_N] + (E[I_N]-I)^2 + 2(E[I_N]-E[I_N])(E[I_N]-I) \\ &= Var[I_N] + (Bias(I_N))^2 + 0 \end{align}

The 3rd line holds by linearity of $E$; the 4th holds because $E[I_N]-I$ is not a random variable, so the expectation applies only to the $I_N$ factor. The last line holds since by definition $Bias(I_N) = E[I_N] - I$, and $E[I_N]-E[I_N] = 0$.

Edit:

In your code you are not computing the variance or the bias, you are computing estimates of those. By definition the bias, variance, and MSE are:

\begin{equation} Bias(\langle I\rangle, I) = E[\langle I \rangle] - I \\ Var(\langle I\rangle) = E[(\langle I \rangle - E[\langle I\rangle])^2] = E[(\langle I \rangle)^2] - (E[\langle I \rangle])^2 \\ MSE(\langle I \rangle, I) = E[(\langle I \rangle - I)^2] = Var[\langle I \rangle] + (Bias(\langle I \rangle, I))^2 \end{equation}

Now let's look at what you are computing in your code. You obviously know the true solution $I=1$. However, in code you only compute $\langle I \rangle = \frac{1}{N}\sum_{k=1}^N g(x_k)$ and not $E[\langle I \rangle]$. This means you don't compute the squared bias $(E[\langle I \rangle] - I)^2$; instead you computed $(\langle I \rangle - I)^2$. That alone already introduces approximation error. Then for an estimate of the variance you used $\frac{1}{N}\left(\frac{1}{N}\sum_{k=1}^N (g(x_k))^2 - \left(\frac{1}{N}\sum_{k=1}^N g(x_k)\right)^2\right)$, but note that this is not the variance, only an estimate of it, and a biased estimate at that (you need Bessel's correction, for example, for an unbiased one).

Finally let's address the elephant in the room: I don't think what you are doing for the MSE estimate makes any sense:

\begin{equation} mse = \frac{1}{N}\sum_{k=1}^N(I_k-I)^2 \end{equation}

In fact, I am fairly certain that the above "estimator" (where $I_k$ is the running estimate after $k$ samples) isn't even consistent, i.e. it doesn't converge to the MSE, since a large error at the beginning sticks around forever in the running average.

Either way, it should be clear now that you never computed the expectations in the first place, only approximations of them, and the estimator you wrote for the MSE is wrong.

  • Yes I can understand this proof and it makes sense. But my question is why does it not hold on my computation above. Where did I get it wrong? – ali Aug 27 '22 at 15:51
  • Thank you for your answer. So I think you are right that I calculate an estimate of the variance and not the variance itself (though I am not 100% certain here). But how do you then calculate the variance of this estimator? What does it look like in terms of code? Also, regarding the bias, you mentioned that I calculated $\langle I\rangle$ rather than $E[\langle I\rangle]$. OK, but isn't it that $E[\langle I\rangle] = E[g(x)]$ (that is, the expectation of the estimator equals the expectation of the function), and $E[g(x)] = \frac{1}{N}\sum_i g(x_i)$, which is the estimator itself? – ali Aug 27 '22 at 20:26
  • So if the above definition of the mse is not correct , then what is the definition? Wikipedia has the same formula I believe https://en.wikipedia.org/wiki/Mean_squared_error – ali Aug 27 '22 at 20:27
  • It gets bit confusing when it comes to the expectation or variance of the estimator. in any case, there is this equality of mse=var+bias^2. And we have this estimator as above. How would you write it in code so the equailty holds? Would you mind correcting the code? – ali Aug 27 '22 at 20:30
  • 1
    @ali The expectation in your case is defined as $E[\langle I \rangle] = \int_0^1 5x^4\,dx$, while your estimator is a sum. In general the expectation is $E[g(X)] = \int_{\Omega}g(x)p(x)\,dx$, where $p$ is the probability density function for sampling $X$. Wikipedia's definition of expectation is not what you wrote; you're misunderstanding the notation there. Also I have no idea what the purpose of your code is, so I cannot fix it. – lightxbulb Aug 27 '22 at 20:39
  • So having this estimator, how do you write a code for this equality to hold? I mean you showed that my calculation is wrong. Could you correct it please. – ali Aug 27 '22 at 20:47
  • 1
    @ali The equality should hold if you compute things properly. For example you can analytically compute the variance as you did, which is $\frac{16}{9N}$, then you can also compute the bias for your estimator, which is zero. Then you can also compute the MSE from the definition: $MSE(\langle I \rangle, I) = E[(\langle I \rangle - I)^2] = E[(\langle I \rangle - E[\langle I \rangle])^2] = Var(\langle I \rangle)$. A more interesting case would involve a biased estimator, e.g. multiply your estimator by $(1+C/N)$, then the bias would be $C/N$, and you can compute the rest and see it's equal. – lightxbulb Aug 27 '22 at 21:34
  • So does it mean that if I don't have the analytical answer for the variance (for whatever reason), then the closest I can get is an estimate of the variance (for which you provided the formula, and which the code computes)? Say I want to calculate the variance of a pixel's luminance in a progressive Monte Carlo rendering. We don't know the exact result, and hence we won't be able to calculate the true $Var(\langle I\rangle)$? Is that right? – ali Aug 27 '22 at 22:15
  • Imagine a single pixel value resulting from N iterations. How do you calculate the variance (and MSE)? A pixel value from one iteration is the result of the estimator $\langle I\rangle$. So what would $Var(\langle I\rangle)$ be? If the answer is $Var(\langle I\rangle) = E[\langle I\rangle^2] - E[\langle I\rangle]^2$, then I would love to know what the code looks like. – ali Aug 27 '22 at 22:20
  • @ali In practice you usually cannot compute the MSE because you usually do not know the true solution. You can estimate the variance however. Your approach is somewhat fine, but it's a biased variance estimator so you can apply Bessel's correction. See wikipedia's page on the topic. If you knew the true solution you could directly compute the approximation error btw. – lightxbulb Aug 27 '22 at 22:22
  • We don't know the true value, so we cannot compute the MSE. Fine. But what about the variance? Can we not compute the pixel variance instead of estimating it? – ali Aug 27 '22 at 22:25
  • 1
    @ali Variance can be written as $Var[I_N] = E[(I_N-E[I_N])^2] = E[(I_N)^2] - (E[I_N])^2$. So if you can compute the variance this means you know $E[I_N]$. But then this begs the question why are you even estimating $E[I_N]$ in the first place. At least I can't think of a scenario where you know the variance but not the expectation. Maybe someone can come up with a contrived example, but I doubt you'd find such a scenario in practice. – lightxbulb Aug 27 '22 at 22:28
  • Yes :) So it boils down to the fact that we don't know $E[\langle I\rangle]$ and can only estimate it? – ali Aug 27 '22 at 22:31
  • 1
    @ali Yes. If you knew $E[I_N]$ then why would you even try to estimate it stochastically through $I_N$? Note that if your estimator is biased then $E[I_N]\ne I$. You can still have a consistent estimator though such that $p\lim_{N\to\infty}I_N = I$, i.e. the bias goes to zero for infinitely many samples. – lightxbulb Aug 27 '22 at 22:33
  • That was really nice @lightxbulb. And I appreciate you answering me so patiently! – ali Aug 27 '22 at 22:38
  • All this comes from the confusion I am having on this page https://rmanwiki.pixar.com/display/RFM24/AOVs for pixar render man. Right at the bottom of the page it defines variance & mse of a pixel rather vaguely. – ali Aug 27 '22 at 22:42
  • It says "MSE: Like variance, but diminishes towards zero as the number of samples increases. Though somewhat noisy itself, this can provide an estimate for the amount of mean-squared-error versus a hypothetical ground-truth image." how do you calculate mse as you said if you don't know the true value? – ali Aug 27 '22 at 22:48
  • @ali They mention a hypothetical ground-truth image there, so I assume you either provide this image, or maybe their application produces a denoised version and pretends that is the ground truth. – lightxbulb Aug 27 '22 at 23:13