
Given the model $y = f(x) + \epsilon, f(x) = Wx$, I want to find an estimate of $Var(Y)$. Note that here I do not account for randomness in the input $x$; rather, I treat it as a deterministic value.

  1. Since $\epsilon \sim \mathcal{N}(0,\sigma)$, we have $Y|X \sim \mathcal{N}(f(X),\sigma)$. First, we know that $\mathbb{E}[Y|X] = f(X)$, so $\mathbb{E}[Y] = \mathbb{E}_X[\mathbb{E}[Y|X]] = \mathbb{E}[f(X)] = f(x)$.

  2. Finding $\mathbb{E}[Y^2]$: $\mathbb{E}[Y^2] = \mathbb{E}_X[\mathbb{E}[Y^2|X]]$, note that $y^2 = f^2(x) + 2f(x)\epsilon + \epsilon^2$, which means $Y^2|X \sim \mathcal{N}(f^2(X),\sigma (\sigma + 4f^2(X)))$. That is, $\mathbb{E}[Y^2] = \mathbb{E}[f^2(X)] = f^2(x)$

  3. $\mathbb{E}[(Y - \mathbb{E}[Y])^2] = \mathbb{E}[Y^2] - \mathbb{E}[Y]^2 = f(x)^2 - f(x)^2 = 0$.
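To sanity-check this numerically (with arbitrary assumed values for $w$, $x$, and the noise variance, none of which come from a real dataset), I simulated draws of $Y$ with $x$ held fixed. The sample variance comes out near the noise variance, not zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed values purely for illustration:
w, x = 2.0, 3.0          # deterministic weight and input
sigma2 = 0.25            # noise variance, eps ~ N(0, sigma2)

# Many realizations of Y = w*x + eps with x held fixed
eps = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)
y = w * x + eps

# Sample mean is close to w*x, sample variance close to sigma2 (not 0)
print(y.mean(), y.var())
```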

Obviously, the result is wrong, but I cannot pinpoint the error I made.

Another attempt is to use the law of total variance:

$Var(Y) = \mathbb{E}[Var(Y|X)] + Var(\mathbb{E}(Y|X)) = \mathbb{E}[\sigma] + Var(f(X))$. If we consider $X$ a random variable, then I guess the second term refers to the error of the input data (the variance of the input distribution) — please correct me if I am wrong. But since we consider $x$ a deterministic value, that term vanishes and we have $Var(Y) = \sigma$.
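As a numerical check of the total-variance decomposition (again with arbitrary assumed values for $w$, the noise variance, and the distribution of $X$), simulating with $X$ drawn at random gives a sample variance close to $\sigma + Var(f(X))$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed values purely for illustration:
w = 2.0
sigma2 = 0.25                               # noise variance
x = rng.normal(1.0, 1.0, size=1_000_000)    # X ~ N(1, 1), so Var(f(X)) = w**2 * 1 = 4
eps = rng.normal(0.0, np.sqrt(sigma2), size=x.size)
y = w * x + eps

# Law of total variance: Var(Y) = sigma2 + Var(w*X) = 0.25 + 4.0 = 4.25
print(y.var())
```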

I am pretty sure I have made plenty of mistakes that I am not aware of. Can you please help me spot them?

rando
  • If $X$ is deterministic, please explain what you might mean by "$\operatorname{Var}(Y).$" That doesn't appear to have any definition. – whuber Apr 29 '23 at 20:30
  • @whuber In the case of linear regression, and based on my knowledge, I did not see anyone to account for $P(X)$, so I assumed they consider it a deterministic quantity. Is my understanding wrong? – rando Apr 29 '23 at 21:13
  • Note that in a regression (whether linear or nonlinear) you condition on the X's. This should simplify many of your expressions, e.g. in that case $E(f(x)\epsilon) = f(x) E(\epsilon)$ ... etc – Glen_b Apr 30 '23 at 05:16
  • As @Glen says, you either condition on $X$ or treat it as deterministic. In either case, there is no relevant probability distribution that applies, and so $Y$ at best can be considered a collection of random variables: it doesn't have any "variance" in any sense. Indeed, by viewing $X$ as deterministic you're really treating $Y$ as a stochastic process in which $X$ is the parameter. – whuber Apr 30 '23 at 14:42
  • @whuber. Excuse my ignorance, and I hope you bear with me. Assume that $y = wx+ \epsilon$, and both $w$ and $x$ are deterministic, $\epsilon \sim \mathcal{N}(0,\sigma)$, does not that mean that $Y\sim \mathcal{N}(wx,\sigma)$? Here $Y$ does have a distribution and does have a variance. Thank you. – rando Apr 30 '23 at 14:48
  • I suspect the notation might be confusing you: the "$wx$" expression means that each $Y$ has a different distribution for each different value of $wx.$ What would "the variance" of a collection of differing distributions possibly mean? If you want to know the variance of $Y$ for any particular value of $wx,$ it's right in front of you: that's what $\sigma^2$ is. – whuber Apr 30 '23 at 14:52
  • @whuber Oh, I see. Thanks for clarifying my confusion. If we assumed that $w$ is deterministic and $X\sim P$ (Using machine learning terminology, that is called the generative distribution). Then, my conclusion that the unconditional variance of $Y$ is right? That is, $Var(Y) = \mathbb{E}[Var(Y|X)] + Var(\mathbb{E}(Y|X)) = \mathbb{E}[\sigma] + Var(f(x)) = \sigma + Var(f(x)) $. – rando Apr 30 '23 at 14:59
  • Right: when $(X,Y)$ is a joint random variable, you simply apply the usual rules of conditional variance to obtain the unconditional variance of $Y.$ – whuber Apr 30 '23 at 15:06

0 Answers