Hi everyone, this is my first time posting on this site, so if I am violating any standards, kindly let me know.

I am attempting to prove that $\mu=E(X)$ is the choice of $a$ that minimizes $E([X-a]^{2})$.

I have always accepted this fact without much thought, but I am hoping to show that using the properties of expectation, $a=\mu=E(X)$ is the value that minimizes mean squared error.

So we have our objective function as:

$E([X-a]^2)$

$=E([X-\mu+\mu -a]^{2})$

$=E([(X-\mu)+(\mu-a)]^2)$

$=E([(X-\mu)^2+2(\mu-a)(X-\mu)+(\mu-a)^2])$

$=E(X-\mu)^2 + 2(\mu-a)E(X-\mu)+(\mu-a)^2$

$= E(X-\mu)^2 + 2(\mu-a)[E(X)-\mu]+(\mu-a)^2$

$= E(X-\mu)^2 + 2(\mu-a)[E(X)-E(X)]+(\mu-a)^2$

$= E(X-\mu)^2 + 2(\mu-a)[0]+(\mu-a)^2$

$= E(X-\mu)^2 +(\mu-a)^2$

And this function is now minimized at $a=\mu$, which is equal to $E(X)$

Therefore, mean squared error is minimized for $a=E(X)=\mu$
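
As a quick numerical sanity check of the decomposition $E([X-a]^2) = E([X-\mu]^2) + (\mu-a)^2$, here is a minimal sketch using NumPy. The exponential distribution, the sample size, the grid of candidate values, and the names `mse`, `candidates`, and `best_a` are arbitrary choices made only for illustration; the expectations are taken with respect to the empirical distribution of the simulated sample, so this only illustrates the identity and does not replace the proof.

```python
import numpy as np

# Assumption: any distribution with finite variance would do; exponential is arbitrary.
rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)

mu = x.mean()   # sample analogue of E(X)
var = x.var()   # sample analogue of E([X - mu]^2), computed with ddof=0


def mse(a):
    """Sample analogue of E([X - a]^2) for a candidate value a."""
    return np.mean((x - a) ** 2)


# Evaluate the MSE on a grid of candidate values centred on mu.
candidates = np.linspace(mu - 3, mu + 3, 601)
mse_values = np.array([mse(a) for a in candidates])

# Right-hand side of the decomposition: E([X - mu]^2) + (mu - a)^2.
decomposition = var + (mu - candidates) ** 2

best_a = candidates[np.argmin(mse_values)]
print("identity holds on the grid:", np.allclose(mse_values, decomposition))
print("grid minimizer:", best_a, "sample mean:", mu)
```

The two sides agree up to floating-point error for every candidate on the grid, and the grid minimizer sits at the sample mean, which is what the last line of the derivation predicts.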

Were the adjustments I made correct with properties of expectations? Or am I making any incorrect assumptions?

user345
  • The idea is correct but the algebra is belabored and wrong at times. For instance, the third line is both superfluous and incorrect, and the sixth and seventh lines are really overdoing the obvious. Is this a problem? Only if you seek to understand what's going on. Simplicity is your friend, because it helps you see the crux of the matter. – whuber Jul 20 '17 at 17:05
  • @whuber Funnily enough, the third line is actually correct which is shown later, but I think that's just a coincidence. It's not obvious at all from the way it's presented. – Bridgeburners Jul 20 '17 at 17:09
  • @whuber I agree that I added more lines than would really be necessary to understand what was going on. I am wondering if you could tell me how the third line was wrong? – user345 Jul 20 '17 at 17:12
  • @Bridge That's right. Although the third line is a correct statement, it does not follow from its precedent nor does its consequent follow from it. Indeed, it accomplishes nothing in the derivation. It seems to reflect an algebraic error of the form "$(x+y)^2=x^2+y^2$". – whuber Jul 20 '17 at 17:12
  • @whuber wow, not sure how I did not notice that earlier. My edit reflects what I had meant. Is this correct now? I agree it still has more detail and lines than strictly would be necessary – user345 Jul 20 '17 at 17:18
  • If you want to try another way you can do the following. Take the derivative of $E([X-a]^2)$ with respect to $a$ (note that derivatives and expectations are both linear operators so they commute) and find the root of that derivative. Then take the second derivative to verify that the root is a minimum. – Bridgeburners Jul 20 '17 at 17:26
  • @ap2010 "and this function is now minimized at $a = \mu$, which is equal to $E[X]$" <- you are correct to think of this MSE as a function of $a$; however, plugging in $\mu$ for $a$ does not set the function equal to $E[X]$. When you find the right $a$ (maybe you have already), you should get a constant number that's a lower bound for this function (a function of $a$). Since this lower bound is attainable, it must be the minimum. – Taylor Jul 20 '17 at 17:53
  • The notation is dreadful. From the fifth line onwards, the standard interpretation of $E(X-\mu)^2$, with parentheses having higher priority than exponentiation (remember PEMDAS?), is that the expected value of $X-\mu$ is being squared. The expected value of $X-\mu$ is, of course, $0$. What you need to write is $E([X-\mu]^2)$, etc. – Dilip Sarwate Jul 20 '17 at 18:32
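
The derivative approach suggested in the comments can be sketched as follows. This is only a sketch, assuming $E(X^2)$ is finite; expanding the square first turns the objective into an ordinary quadratic in $a$, so no interchange of derivative and expectation has to be justified.

$E([X-a]^2) = E(X^2) - 2aE(X) + a^2$

$\frac{d}{da}E([X-a]^2) = -2E(X) + 2a$

Setting the derivative to zero gives $a = E(X) = \mu$, and the second derivative confirms that this root is a minimum:

$\frac{d^2}{da^2}E([X-a]^2) = 2 > 0$

Substituting $a=\mu$ back in gives the minimum value $E(X^2) - \mu^2 = E([X-\mu]^2)$, which matches the constant term left over in the question's decomposition.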
