This should be a relatively simple question. I'm trying to confirm my understanding of the subscript notation on expectations when the subscript denotes a conditioning. In the example $$E_{Y|X}[(Y-f(X))^2|X]$$ the subscript denotes the distribution over which you take the expectation, so we would want to use the conditional density $p_{Y|X}(y|X)$ as the distribution in our expectation and integrate over $y$ (for discrete $Y$ this would be $P(Y=y|X)$ and a sum over $y$). And the argument $$[(Y-f(X))^2|X]$$ means that when we take the expectation we should consider the value of $X$ in $(Y-f(X))^2$ to be given. This would imply that the statement can be rewritten: $$ E_{Y|X}[(Y-f(X))^2|X]=\int_{-\infty}^{\infty} [y-f(X)]^2 p_{Y|X}(y|X) \, dy. $$ And if I wanted to find $E_{X}E_{Y|X}[(Y-f(X))^2|X]$ I could rewrite that as: $$ E_{X}E_{Y|X}[(Y-f(X))^2|X]=\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} [y-f(X)]^2 p_{Y|X}(y|X) \, dy \right) p_{X}(x) \, dx. $$ Is this all correct?

Correction: I updated the last line to fix an error that was caught in the comments: $$ E_{X}E_{Y|X}[(Y-f(X))^2|X]=\int_{-\infty}^{\infty} \left( \int_{-\infty}^{\infty} [y-f(x)]^2 p_{Y|X}(y|x) \, dy \right) p_{X}(x) \, dx. $$
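As a sanity check on the corrected double integral, here is a minimal numerical sketch. The model is purely an assumption for illustration and is not from the book: $X\sim\text{Uniform}(0,1)$, $Y|X=x\sim N(2x,1)$, and a hypothetical predictor $f(x)=x$. Under these assumptions the inner integral is $1+x^2$ and the outer integral is $1+1/3$, so the quadrature should land close to those values. The helper names (`midpoint`, `inner`, `epe`) are made up for this example.

```python
import math

def normal_pdf(y, mean, sd=1.0):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def f(x):
    """Hypothetical predictor f(x) = x (deliberately not the true mean)."""
    return x

def cond_mean(x):
    """Assumed conditional mean E[Y | X = x] = 2x."""
    return 2.0 * x

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def inner(x, n=2000):
    """Inner integral: E_{Y|X}[(Y - f(X))^2 | X = x]."""
    m = cond_mean(x)
    # Truncate the infinite range to mean +/- 10 standard deviations.
    return midpoint(lambda y: (y - f(x)) ** 2 * normal_pdf(y, m),
                    m - 10.0, m + 10.0, n)

def epe(n_outer=200):
    """Outer integral over x, with assumed density p_X(x) = 1 on [0, 1]."""
    return midpoint(inner, 0.0, 1.0, n_outer)

print(inner(0.5))  # analytically 1 + 0.5**2 = 1.25
print(epe())       # analytically 1 + 1/3
```

Note that `inner` takes a fixed number `x`, which is exactly the substitution of $f(x)$ for $f(X)$ and $p_{Y|X}(y|x)$ for $p_{Y|X}(y|X)$ made in the correction.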

tjnel
  • They are logically and consistently used, but sometimes notation is a matter of convention. You may also want to look up this Q&A: http://stats.stackexchange.com/questions/72613/subscript-notation-in-expectations/72614#72614 – Alecos Papadopoulos Nov 09 '13 at 00:26
  • @AlecosPapadopoulos thanks. I read that Q&A, which helped me to interpret $E_{Y|X}[(Y-f(X))^2|X]$ in the manner I have. I just want to confirm that I'm correctly understanding the term. The last equation I specified is referred to as the Expected Prediction Error of the model $f(X)$ in the book I'm reading. – tjnel Nov 09 '13 at 00:40
  • You are, and in a clearer manner than one usually finds, which is good. – Alecos Papadopoulos Nov 09 '13 at 00:42
  • The last line makes no sense. It does after substituting $f(x)$ for $f(X)$ and $p_{Y|X}(y|X=x)$ for $p_{Y|X}(y|X)$. – Stéphane Laurent Nov 09 '13 at 13:53
  • @StéphaneLaurent good catch. I updated the post to reflect what I meant the last line to be. – tjnel Nov 09 '13 at 23:25

1 Answer

This is right, and what you have in the end can be simplified. Since $p_{Y|X}(y|x)\,p_{X}(x)=p_{X,Y}(x,y)$, $$ E_{X}\big[E_{Y|X}[(Y-f(X))^2|X]\big] =\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [y-f(x)]^2 p_{X,Y}(x,y) \, dx\,dy =E_{X,Y}[(Y-f(X))^2], $$ and, expanding the square, manipulated into: $$ E_{X,Y}[(Y-f(X))^2] =E_{Y}[Y^2] -2 E_{X,Y}[Y f(X)] + E_{X}[f(X)^2] . $$ The expectation notation is just ... notation, whereas the integral form is more explicit and universal, and is the "safest" way to consider things.
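The expansion into three terms can also be checked on simulated data. The sketch below assumes a made-up model, $X\sim\text{Uniform}(0,1)$, $Y=2X+\varepsilon$ with $\varepsilon\sim N(0,1)$, and a hypothetical predictor $f(x)=x$; none of these come from the question. Because the expansion is an algebraic identity, the two sample averages should agree up to floating-point rounding, whatever the model.

```python
import math
import random

def f(x):
    """Hypothetical predictor f(x) = x."""
    return x

def simulate(n=200_000, seed=0):
    """Draw (x, y) pairs from the assumed model Y = 2X + N(0, 1), X ~ U(0, 1)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def mean(values):
    """Sample mean using an accurate floating-point sum."""
    values = list(values)
    return math.fsum(values) / len(values)

def both_sides(xs, ys):
    """Sample versions of E[(Y - f(X))^2] and its three-term expansion."""
    lhs = mean((y - f(x)) ** 2 for x, y in zip(xs, ys))
    rhs = (mean(y * y for y in ys)
           - 2.0 * mean(y * f(x) for x, y in zip(xs, ys))
           + mean(f(x) ** 2 for x in xs))
    return lhs, rhs

xs, ys = simulate()
lhs, rhs = both_sides(xs, ys)
print(lhs, rhs)  # the two values should agree up to rounding
```

Under this assumed model the true value is $1+1/3$, so both numbers should also sit near $4/3$ up to Monte Carlo error.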

I don't believe the condition inside the expectation's square brackets, $E_{Y|X}[\cdot|X]$, is necessary or adds anything when the distribution is explicit in the subscript, i.e. $E_{Y|X}[g(X,Y)|X]=E_{Y|X}[g(X,Y)]$. It would be necessary if the subscript were omitted (as is often the case): in general $E[g(X,Y)|X]\neq E[g(X,Y)]$, since the latter would typically be an expectation over $p(X,Y)$ and the former over $p(Y|X)$.
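A small discrete sketch makes the difference between $E[g(X,Y)|X]$ and $E[g(X,Y)]$ concrete. The joint pmf below is a made-up example on $\{0,1\}\times\{0,1\}$, with $g(x,y)=(y-f(x))^2$ and a hypothetical $f(x)=x$. The conditional expectation depends on $x$, the unconditional one is a single number, and averaging the former over $p_X$ recovers the latter.

```python
from fractions import Fraction as F

# Hypothetical joint pmf p_{X,Y}(x, y) on {0, 1} x {0, 1}.
joint = {(0, 0): F(2, 5), (0, 1): F(1, 10),
         (1, 0): F(1, 5), (1, 1): F(3, 10)}

def g(x, y):
    """Squared error with the hypothetical predictor f(x) = x."""
    return (y - x) ** 2

def p_x(x):
    """Marginal pmf of X."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_exp(x):
    """E[g(X, Y) | X = x]: average over p_{Y|X}(y | x)."""
    return sum(g(xi, y) * p for (xi, y), p in joint.items() if xi == x) / p_x(x)

def joint_exp():
    """E[g(X, Y)]: average over the joint pmf p_{X,Y}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def tower():
    """E_X[ E[g(X, Y) | X] ]: average the conditional expectations over p_X."""
    return sum(cond_exp(x) * p_x(x) for x in (0, 1))

print(cond_exp(0), cond_exp(1))  # 1/5 2/5
print(joint_exp(), tower())      # 3/10 3/10
```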

Carl