Two preliminary points need to be established.
First, $Xg(Y)$ must be a random variable. It would suffice for $g$ to be a measurable function, because that guarantees $g(Y)$ is a random variable, whence $Xg(Y)$ is one too. However, measurability of $g$ is stronger than necessary. To see why, consider (for example) the case where $X = Y.$ Let $h$ be a non-measurable function defined on the zeros of $X$ and define
$$g(y) = \begin{cases} y, & X\ne 0; \\ h(y), & X = 0.\end{cases}$$
This $g$ is not measurable by construction, but $Xg(Y) = XY$ is measurable: on the event $X\ne 0$ we have $Xg(Y) = XY,$ while on the event $X = 0$ both $Xg(Y)$ and $XY$ vanish.
Consequently, we need only require that $Xg(Y)$ itself be a random variable, which is a subtler (and weaker) restriction on $g.$
Second, the set of functions $g:\mathbb R\to \mathbb R$ satisfying the first criterion forms a vector space $V_{X,Y}.$ Given any two such functions $g_1$ and $g_2$ and two real numbers $\lambda_1$ and $\lambda_2,$ the function $$X(\lambda_1 g_1(Y) + \lambda_2 g_2(Y)) = \lambda_1 Xg_1(Y) + \lambda_2 Xg_2(Y)$$ is a random variable, being a linear combination of the random variables $Xg_1(Y)$ and $Xg_2(Y),$ so $\lambda_1 g_1 + \lambda_2 g_2$ also satisfies the criterion.
Take any $g\in V_{X,Y}.$ There are three cases to consider: $E[Xg(Y)]$ does not exist, it is infinite, or it is finite. In the first two cases $E[Xg(Y)]$ trivially cannot equal zero. The functions in the third case form a vector subspace $V^0_{X,Y}\subset V_{X,Y},$ as is straightforward to check (use linearity of expectation).
Assume $E[XY]\ne 0$ is finite.
Suppose $g\in V^0_{X,Y}.$ For every (finite) number $c$ define
$$g_c(y) = g(y) + cy.$$
Every $g_c$ again lies in $V^0_{X,Y}$: the function $Xg_c(Y) = Xg(Y) + cXY$ is a random variable, being a linear combination of random variables, and its expectation
$$E[Xg_c(Y)] = E[X(g(Y) + cY)] = E[Xg(Y)] + cE[XY]\tag{*}$$
is finite, being the sum of two finite values. Geometrically, the set of functions $$\langle g \rangle_Y = \{g_c\mid c\in\mathbb R\} = \{\,y\mapsto g(y) + cy\mid c\in\mathbb R\,\}$$ is an affine line in $V^0_{X,Y}.$
Because $E[XY]\ne 0$ is finite,
$$c^*(g) = -\frac{E[Xg(Y)]}{E[XY]}$$
is well defined and unique, and by $(*)$ it guarantees
$$E[Xg_{c^*(g)}(Y)] = E[Xg(Y)] - \frac{E[Xg(Y)]}{E[XY]} E[XY] = 0.$$
Consequently,
when $E[XY]\ne 0$ is finite, every function $g_c$ in the punctured line $\{g_c\mid c\ne c^*(g)\}$ satisfies $E[Xg_c(Y)]\ne 0.$

This is a sketch of the vector space $V^0_{X,Y}.$ The identity function $\iota: y\mapsto y$ is one of the vectors, shown as the red arrow. Some of the punctured lines parallel to $\iota$ are drawn for background. The set of "special" functions $g_{c^*(g)}$ for which $E[Xg_{c^*(g)}(Y)]=0$ has been removed, leaving the white trace. One random dot at the lower left represents a generic function $g.$ Its projection in the $\iota$ direction onto $g_{c^*(g)}$ is shown to its upper right.
In this picture of the situation,
the functions $g$ for which $E[Xg(Y)]\ne 0$ correspond to all the points not lying on the white curve.
The exact nature of the white curve depends on the joint distribution of the random variables $(X,Y).$ What we have seen is that it intersects each of the affine lines parallel to $\iota$ exactly once: those affine lines are the fibers of a line bundle.
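To make the construction concrete, here is a small Monte Carlo sketch in Python. The particular choices are mine and purely illustrative: a centered bivariate normal $(X,Y)$ with correlation $0.6$ (so $E[XY]=0.6$ is finite and nonzero) and the arbitrary function $g(y)=\sin y.$ Estimating $c^*(g)$ from $(*)$ and shifting along the line through $g$ parallel to $\iota$ drives the sample analogue of $E[Xg_c(Y)]$ to zero, while every other $c$ on the line yields the nonzero value $(c - c^*(g))\,E[XY].$

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative joint distribution: centered bivariate normal with
# correlation 0.6, so E[XY] = 0.6 is finite and nonzero.
n = 1_000_000
X, Y = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n).T

g = np.sin                      # an arbitrary member of V^0_{X,Y}

E_XgY = np.mean(X * g(Y))       # sample analogue of E[X g(Y)]
E_XY = np.mean(X * Y)           # sample analogue of E[X Y]
c_star = -E_XgY / E_XY          # c*(g) = -E[X g(Y)] / E[X Y]

def g_c(y, c):
    """The shifted function g_c(y) = g(y) + c*y on the line through g."""
    return g(y) + c * y

# The sample mean of X*g_{c*}(Y) vanishes (here exactly, because c* was
# estimated from the same sample); any other c gives (c - c*) * E[XY].
print(np.mean(X * g_c(Y, c_star)))        # ~ 0
print(np.mean(X * g_c(Y, c_star + 1.0)))  # ~ E[XY] = 0.6
```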
Finally, I haven't dealt with the possibility that $E[XY]$ is infinite. If there exists a nonzero measurable function $f$ for which $E[Xf(Y)]$ is finite and nonzero, we may employ $f$ in defining $g_c(y) = g(y) + cf(y)$ and proceed as before. Unfortunately, such an $f$ needn't exist in general (although there usually is one in any statistical application).
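For instance, in the special case $X = Y$ (with $Y$ not almost surely zero), the bounded function $f(y) = y/(1+y^2)$ always qualifies, because $$0 < E[Xf(Y)] = E\left[\frac{Y^2}{1+Y^2}\right] \le 1$$ no matter how heavy the tails of $Y$ are.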
The assumptions $E[X] = 0 = E[Y]$ were never used. Unfortunately, they don't help us out of the problems with infinities: it is possible for these additional assumptions to hold, yet for $E[XY]$ to be infinite. For instance, let $X$ have a Student $t(3/2)$ distribution and $Y=X.$
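As a quick numerical check of that last example (a NumPy sketch; the seed and sample sizes are arbitrary), the sample mean of $X$ drifts toward $E[X]=0$ as $n$ grows, if slowly (the variance is also infinite), while the sample mean of $X^2 = XY$ typically keeps growing with $n$ and never settles down, reflecting $E[X^2]=\infty$:

```python
import numpy as np

rng = np.random.default_rng(1)

# X has a Student t distribution with 3/2 degrees of freedom and Y = X:
# E[X] = 0 exists (df > 1) but E[XY] = E[X^2] is infinite (df < 2).
for n in (10**3, 10**5, 10**7):
    x = rng.standard_t(df=1.5, size=n)
    print(f"n={n:>9}:  mean(X) = {x.mean():8.4f}   mean(X^2) = {(x**2).mean():12.2f}")
```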