
In Section 1.2.2 of Bishop's "Pattern Recognition and Machine Learning", the author introduces expectations of functions of two random variables. On page 20, he introduces the following term:

$$ E_x[f(x,y)] $$

He goes on to remark that it

"denotes the average of the function $f(x, y)$ with respect to the distribution of $x$. Note that $E_x[f(x, y)]$ will be a function of $y$."

Why is the expectation a function of y?

1 Answer


The notation is ambiguous because (a) it uses lowercase letters and (b) $X$ and $Y$ are both random variables, possibly dependent ones. One then wonders whether it should be $$\mathbb E_{X|Y=y}[f(X,y)]=\dfrac{\int_\mathfrak X f(x,y) p_{X,Y}(x,y)\,\text d x}{\int_\mathfrak X p_{X,Y}(x,y)\,\text d x}$$ or $$\mathbb E_{X}[f(X,y)]=\int_\mathfrak X f(x,y) \int_\mathfrak Y p_{X,Y}(x,z)\,\text d z\,\text d x$$ where $p_{X,Y}$ denotes the joint density of $(X,Y)$. Under either reading, $x$ is integrated out while $y$ stays free, so the result is a function of $y$.
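As a rough numerical illustration of the two readings, here is a minimal Python sketch for a discrete case, where the integrals become sums. The joint table `joint`, the function `f`, and the helper names are all made up for this example and do not come from Bishop's text.

```python
# Hypothetical joint pmf p(x, y) for x, y in {0, 1}; X and Y are dependent here.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
xs = [0, 1]
ys = [0, 1]


def f(x, y):
    # Arbitrary illustrative function of both arguments (an assumption).
    return x + 2 * y


def marginal_expectation(y):
    # E_X[f(X, y)]: weight f(x, y) by the marginal p_X(x) = sum_z p(x, z).
    return sum(f(x, y) * sum(joint[(x, z)] for z in ys) for x in xs)


def conditional_expectation(y):
    # E_{X|Y=y}[f(X, y)]: weight f(x, y) by p(x, y) / p_Y(y).
    p_y = sum(joint[(x, y)] for x in xs)
    return sum(f(x, y) * joint[(x, y)] for x in xs) / p_y


# x is summed out in both cases; y remains an argument of the result.
for y in ys:
    print(y, marginal_expectation(y), conditional_expectation(y))
```

With these made-up numbers the two readings disagree (about 0.5 versus 0.2 at $y=0$), yet both return a different value for each $y$, which is the sense in which the expectation is a function of $y$.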

Xi'an
  • (1) I don't understand why case might matter: authors should be allowed to adopt whatever typographical conventions they find convenient. (2) In light of that, why isn't the verbal description sufficiently clear? Blindly applying the definition of expectation with respect to a distribution gives $$E_X(f(X,Y))=\int_{\mathbb R}f(x,Y)\,\mathrm dF_X(x)$$ where $F_X$ is the distribution function of $X.$ Whatever "$Y$" might represent, regardless of whether it's written in upper or lower case or even Egyptian hieroglyphics, the right-hand side is explicitly a function of $Y.$ – whuber May 03 '23 at 15:25
  • @whuber While all notations are welcome in their diversity, I find Bishop's $$\mathbb E_x[f(x,y)]=\sum_x p(x)f(x,y)$$ confusing for beginners... – Xi'an May 03 '23 at 16:10
  • Hmm... that looks like an approach based on Nonstandard Analysis ;-). – whuber May 03 '23 at 16:37