I was reading a computer vision paper in which the authors approximated $\left<f(X)\right>_{Q}$ with $f( \left< X \right>_Q)$, where $f(\cdot)$ is nonlinear. Are there any rules of thumb for this kind of approximation? In which cases can we use such an approximation? I am aware that this is a very general question, so any examples will do just fine.
1 Answer
I will use $E$ for expectation, rather than angle brackets.
First of all, $E(f(X))$ can always be "approximated" by $f(E(X))$; the only question is the accuracy and adequacy for purpose of that approximation, which can be very context-specific.
If $f$ is linear (or more generally, affine), $E(f(X)) = f(E(X))$, and so the order of evaluating the function $f$ and expectation can be interchanged without introducing any error. To the extent that $f$ is "almost" linear, then $E(f(X))$ may be almost equal to $f(E(X))$; ultimately it comes down to quantifying this in a given case.
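A quick Monte Carlo sketch of the affine case, using NumPy (the particular distribution and coefficients are arbitrary choices for illustration): for $f(x) = 3x + 1$, the sample versions of $E(f(X))$ and $f(E(X))$ agree up to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)  # any distribution works here

# Affine f: expectation and f commute exactly, E(3X + 1) = 3 E(X) + 1.
def f(t):
    return 3.0 * t + 1.0

lhs = np.mean(f(x))   # sample estimate of E(f(X))
rhs = f(np.mean(x))   # f applied to the sample mean
print(lhs, rhs)       # identical up to floating-point rounding
```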
If $X$ takes on a specific constant value with probability 1, then $E(f(X)) = f(E(X))$, regardless of $f$. To the extent that $X$ has a distribution very close to being equal to a specific constant with probability 1, then $E(f(X))$ may be almost equal to $f(E(X))$; ultimately it comes down to quantifying this in a given case.
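To see the "nearly constant" case numerically, here is a hypothetical example with a strongly nonlinear $f(x) = e^x$: as the spread of $X$ around its mean shrinks, the gap $|E(f(X)) - f(E(X))|$ shrinks with it (roughly like $\tfrac{1}{2}f''(\mu)\,\mathrm{Var}(X)$).

```python
import numpy as np

rng = np.random.default_rng(1)
f = np.exp  # strongly nonlinear

gaps = []
for sd in (1.0, 0.1, 0.01):
    x = rng.normal(loc=1.0, scale=sd, size=200_000)
    gap = abs(np.mean(f(x)) - f(np.mean(x)))
    gaps.append(gap)
    print(f"sd={sd}: gap={gap:.6f}")  # gap shrinks as X concentrates at 1
```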
Depending on the function $f$ and the probability distribution of $X$, there may be other combinations that also allow the interchange without introducing any error. Here is a simple contrived family of examples. Let $f$ be any odd function: $f(x)$ is arbitrary for $x > 0$, $f(0) = 0$, and $f(x) = -f(-x)$ for $x < 0$. Let $X$ be a random variable whose distribution is symmetric about $0$, and assume that $E(f(X))$ exists. Then $E(f(X)) = f(E(X))$, and both happen to equal zero.
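A simulation of that contrived family (the specific odd function below, $f(x) = \operatorname{sign}(x)\,x^2$, is just one illustrative choice): with an exactly symmetrized sample, both $E(f(X))$ and $f(E(X))$ come out at zero up to rounding.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(t):
    return np.sign(t) * t**2  # odd: f(-t) = -f(t), f(0) = 0

x = rng.normal(0.0, 1.0, size=500_000)
xs = np.concatenate([x, -x])       # force exact symmetry about 0

ef = np.mean(f(xs))                # sample E(f(X))
fe = f(np.mean(xs))                # f of the sample mean
print(ef, fe)                      # both essentially 0
```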
If $f$ is convex or concave, then [Jensen's inequality](https://en.wikipedia.org/wiki/Jensen's_inequality) provides a one-sided bound on the error incurred by interchanging expectation and a nonlinear function. Specifically, if $f$ is convex, then $f(E(X)) \le E(f(X))$. If $f$ is concave, then $f(E(X)) \ge E(f(X))$.
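Both directions of Jensen's inequality are easy to check by simulation (the uniform distribution and the functions $x^2$, $\sqrt{x}$ below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 4.0, size=300_000)  # E(X) = 2

convex = np.square   # convex:  f(E X) <= E f(X)
concave = np.sqrt    # concave: f(E X) >= E f(X)

cvx_fe, cvx_ef = convex(x.mean()), np.mean(convex(x))
ccv_fe, ccv_ef = concave(x.mean()), np.mean(concave(x))
print(cvx_fe, "<=", cvx_ef)   # roughly 4 <= 16/3
print(ccv_fe, ">=", ccv_ef)   # roughly sqrt(2) >= 4/3
```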
Nice answer. In some circumstances, the value of the second-order term in a Taylor expansion can give a rough idea of the size of the error in treating the function as linear. – Glen_b Jan 31 '16 at 00:00
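The second-order correction mentioned in the comment is $E(f(X)) - f(E(X)) \approx \tfrac{1}{2} f''(E(X))\,\mathrm{Var}(X)$. A sketch with made-up numbers, using $f = \exp$ (so $f'' = \exp$ as well):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sd = 1.0, 0.3
x = rng.normal(mu, sd, size=500_000)

f = np.exp
actual_gap = np.mean(f(x)) - f(np.mean(x))   # observed E(f(X)) - f(E(X))
taylor_gap = 0.5 * np.exp(mu) * sd**2        # (1/2) f''(mu) Var(X)
print(actual_gap, taylor_gap)                # both near 0.12
```

For small spread the two agree well; for larger spread the higher-order terms start to matter.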