
Theorem:

$E(X|Y)=E(X) \iff E[Xh(Y)]=E[X]E[h(Y)]$ for all bounded measurable $h$

Does this theorem also hold for all odd moments of $X$, i.e.

$E(X|Y)=E(X) \iff E[X^nh(Y)]=E[X^n]E[h(Y)]$ for all bounded measurable $h$, where $n$ is an odd integer?
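For reference, the forward implication in the original theorem follows in one line from the tower property of conditional expectation:

$$E[Xh(Y)] = E\big[E[Xh(Y)\mid Y]\big] = E\big[h(Y)\,E[X\mid Y]\big] = E[X]\,E[h(Y)].$$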

1 Answer


No, it does not hold for odd moments (nor for the even moments).

Let's interpret the left-hand sides: $E(X\mid Y)$ is the regression of $X$ against $Y$. If it's a constant, as implied by $E(X\mid Y) = E(X)$, that means the expectation of the response $X$ does not depend on the regressor $Y$.

How could we make the expectation of a power of $X$ depend on the regressor even when the mean response does not depend on it? Answer: make the scale or shape of the response depend on the regressor.

Pictures might be even clearer than an algebraic demonstration, so consider this. Let $X$ have a Beta distribution. Make the Beta parameters $\alpha$ and $\beta$ depend on $Y$ while keeping the mean $\alpha/(\alpha+\beta)$ constant. One way is to let $\alpha=\beta=Y$. Let's draw $Y$ uniformly from, say, the interval $(0,4)$ so that we can see a nice spectrum of Beta behavior on the part of $X$: when $\alpha=\beta\lt 1$, the distribution is U-shaped; then, as $\alpha=\beta$ increases, it becomes more and more Normal-looking, centered at $E(X\mid Y)=\alpha/(\alpha+\alpha) = 1/2$. Because the shape changes so dramatically, we could inspect just about any function of $X$ (never mind an odd power) and find that its conditional expectation varies with $Y$. Let's try the third power, for instance.
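To see this spectrum directly, here is a minimal base-R sketch of a few symmetric Beta densities (the parameter values $0.25$, $1$, and $4$ are arbitrary choices spanning the interval $(0,4)$):

a.values <- c(0.25, 1, 4)
# Beta(a, a) densities: U-shaped for a < 1, flat at a = 1, bell-shaped for large a
curve(dbeta(x, 0.25, 0.25), 0, 1, ylim=c(0, 3), ylab="density", lty=1)
curve(dbeta(x, 1, 1), add=TRUE, lty=2)
curve(dbeta(x, 4, 4), add=TRUE, lty=3)
legend("top", legend=paste("alpha = beta =", a.values), lty=1:3)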

[Figure: two scatterplots with rolling conditional means drawn in red. Left: $(y_i, x_i)$; right: $(y_i, x_i^3)$.]

On the left is a plot of $(y_i, x_i)$ for 10,000 independent realizations $i=1,2,\ldots, 10000$ from this bivariate distribution. The red line is a rolling conditional mean, confirming that the mean of $X$ (shown on the vertical axis!) does not appear to vary with $Y$ (on the horizontal axis).

On the right is a plot of $(y_i, x_i^3)$ for the same realizations. The rolling conditional mean is significantly tilted, showing that the conditional expectation of $X^3$ does vary with $Y$ (here taking $h(Y)=Y$). This should be clear: near $Y=0$, $X$ is practically a Bernoulli variable, so half the time it's near $1$ and the other half it's near $0$, whence its mean cube is close to $(1^3+0^3)/2 = 1/2$. For large $Y$, $X$ is concentrated close to $1/2$, whence its mean cube should be near $(1/2)^3 = 1/8$.
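These two limits can be confirmed exactly. Using the standard formula for the raw moments of a Beta$(\alpha,\beta)$ variable, $E[X^k]=\prod_{j=0}^{k-1}(\alpha+j)/(\alpha+\beta+j)$, and setting $\alpha=\beta=y$ gives

$$E[X^3\mid Y=y] = \frac{y(y+1)(y+2)}{2y(2y+1)(2y+2)} = \frac{y+2}{4(2y+1)},$$

which decreases from $1/2$ at $y=0$ to $1/8$ as $y\to\infty$, in agreement with the rolling mean in the right panel.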

This strong interdependence of $X^3$ and $Y$ implies it's unlikely the expectation of the product $X^3Y$ will be the product of the expectations of $X^3$ and $Y$ separately. Indeed, in this simulation the first expectation is $0.36$ while the product of the other two is $0.46$: they are far apart.
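As a check on the simulation, the exact expectations can be computed by integrating the closed-form conditional moment above against the Uniform$(0,4)$ density of $Y$; here is a short sketch using base R's integrate:

f <- function(y) (y + 2) / (4 * (2 * y + 1))  # E[X^3 | Y = y] from the formula above
# E[X^3 Y]: integrate y * f(y) against the Uniform(0,4) density 1/4
e.x3y <- integrate(function(y) y * f(y) / 4, 0, 4)$value
# E[X^3]: integrate f(y) / 4; E[Y] = 2 for Uniform(0,4)
e.x3 <- integrate(function(y) f(y) / 4, 0, 4)$value
c(e.x3y, e.x3 * 2)  # about 0.386 versus 0.456: far apart, as the simulation suggests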

One more thing: a "bounded measurable $h$" is basically a transformation of the regressor $Y$. Think of it as stretching and squeezing the plot horizontally. That helps show that if $E(X^n\mid Y)$ varies with $Y$, then for most transformations $h$, $E(X^n\mid h(Y))$ will vary, too, whence $E(X^n h(Y))$ is unlikely to equal $E(X^n)E(h(Y))$.
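For a concrete instance with a different bounded $h$ (the indicator $h(y) = \mathbf{1}(y\lt 1)$ is an arbitrary illustrative choice), the same discrepancy appears:

set.seed(17)
y <- runif(1e4, 0, 4)
x <- rbeta(1e4, y, y)
h <- as.numeric(y < 1)                 # a bounded measurable function of Y
c(mean(x^3 * h), mean(x^3) * mean(h))  # exact values are about 0.083 and 0.057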

Here's the R code for further experimentation.

library(zoo)  # provides rollapply() for the rolling means

set.seed(17)
n <- 1e4
y <- runif(n, 0, 4)  # Y ~ Uniform(0, 4)
x <- rbeta(n, y, y)  # X | Y ~ Beta(Y, Y), so E(X | Y) = 1/2 for every Y

# Overlay a rolling conditional mean of x against y on the current plot.
add.smooth <- function(x, y) {
  i <- order(y)
  n <- length(y)
  x.fit <- as.vector(rollapply(ts(x[i]), n/5, mean, partial=TRUE))
  y.fit <- as.vector(rollapply(ts(y[i]), n/5, mean, partial=TRUE))
  lines(y.fit, x.fit, col="Red", lwd=2)
}

par(mfrow=c(1,2))
plot(y, x, pch=19, col="#00000010", cex=0.5)
add.smooth(x, y)

plot(y, x^3, pch=19, col="#00000010", cex=0.5)
add.smooth(x^3, y)

# Compare E[X^3 Y] with E[X^3] E[Y]
(c(mean(x^3*y), mean(x^3)*mean(y)))
– whuber
  • The even moments were obvious to me, since we have, for example, the GARCH conditional heteroskedasticity model; I just couldn't visualize it for the odd moments. – Cagdas Ozgenc Jan 14 '17 at 21:38