
I'm reading a textbook. In the chapter about least squares regression, I read that a simple linear least squares model can be described as \begin{equation} Y = \alpha + \beta x + e \end{equation} where $Y$ is the mean response, $x$ is an independent variable, and $e$ is a random variable representing an error. This raises a question for me.

Question 1. If Y is the 'mean' response, why do you need a random error on the right hand side?

Now, the textbook says that there are $n$ pairs $(x_{i}, Y_{i})$, where $i$ is the data point index. If the estimator of $\alpha$ is $A$ and that of $\beta$ is $B$, then the estimator of $Y_{i}$ is $A+Bx_{i}$. This is also confusing.

Question 2. Isn't it true that $Y_{i} = A+Bx_{i}+e_{i}$ ?


It seems that I misunderstood the meaning of $Y$ in the first equation. I thought it was the mean of the response, but it might actually just be the response itself, which is a random variable. Could someone comment on this? If so, I would delete Question 1 and Question 2, leave only the third question, and select an answer. That might be reasonable, because Questions 1 and 2 are based on a wrong assumption.

Nownuri

2 Answers


The real-life model can be described as:

\begin{equation} Y = \alpha + \beta x + e \end{equation}

If we compute $E(Y\vert X)$, using the fact that the error has mean zero, $E(e) = 0$, we would have:

\begin{equation} E(Y\vert X) = \alpha + \beta x \end{equation}

For the second question:

\begin{equation} E(Y_i\vert X_i) = \alpha + \beta x_i \end{equation}

and in the original model:

\begin{equation} Y_i = \alpha + \beta x_i + e_i \end{equation}
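The distinction between the two displays above can be checked numerically. Here is a minimal sketch (the parameter values $\alpha = 2$, $\beta = 0.5$, $\sigma = 1$ are my own hypothetical choices, not from the question): each simulated $Y_i$ contains a random error, while the least-squares fit $A + Bx$ estimates the conditional mean $\alpha + \beta x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true parameters of the model Y = alpha + beta*x + e
alpha, beta, sigma = 2.0, 0.5, 1.0

n = 10_000
x = rng.uniform(0, 10, size=n)
e = rng.normal(0, sigma, size=n)   # random error with E(e) = 0
y = alpha + beta * x + e           # each Y_i is random, not a mean

# Least-squares estimators A and B (np.polyfit returns [slope, intercept])
B, A = np.polyfit(x, y, deg=1)

# A + B*x0 estimates the conditional mean E(Y | X = x0),
# not any individual Y_i
x0 = 5.0
fitted = A + B * x0
true_mean = alpha + beta * x0
print(fitted, true_mean)
```

With this many points the fitted value and the true conditional mean agree closely, while the individual $y$ values at $x \approx 5$ still scatter with standard deviation about $\sigma = 1$.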

Dave
Duck
  • For question 1, Y is already a mean value. Then what's the meaning of its expectation value? For question 2, the textbook says that $A+Bx_{i}$ is an estimator of $Y_{i}$, but it does not specify what kind of estimator it has to be. Then isn't the expectation value in your explanation only an example? – Nownuri Dec 14 '20 at 15:45
  • You use expectation for Y, so no clear notation I think! – Duck Dec 14 '20 at 16:38

I just saw this, so I will answer the third question only. Indeed, the point you made is correct; the main assumption we make in simple linear regression, as you point out, is that given $X = x$,

$$Y = \alpha + \beta x + \epsilon.$$

Here, I am treating $X$ as a random variable whose value we observe as $x$. Therefore, $Y$ is random because of our random error $\epsilon$; $Y$, in a sense, "inherits" the randomness from $\epsilon$.
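This "inherited randomness" is easy to see in a quick simulation. The sketch below (with hypothetical values $\alpha = 2$, $\beta = 0.5$, $\sigma = 1$, and $x = 4$ chosen only for illustration) fixes $x$ and draws $\epsilon$ repeatedly: every realization of $Y$ differs, yet they average out to $\alpha + \beta x$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical parameters, chosen only for illustration
alpha, beta, sigma = 2.0, 0.5, 1.0
x = 4.0  # a fixed, observed value of X

# Repeated draws of Y at the same x: only eps varies,
# so all randomness in Y comes from eps
eps = rng.normal(0, sigma, size=100_000)
y = alpha + beta * x + eps

print(y[:3])     # distinct realizations of Y at the same x
print(y.mean())  # close to the conditional mean alpha + beta*x
```

The sample mean of the draws is close to $\alpha + \beta x = 4$, which is exactly the conditional-mean interpretation discussed above.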