
Suppose we model the random variable $Y$ as follows: $$\mathbb{E}[Y]=\beta_0+\beta_1x_1.$$ Many statistics textbooks treat the $\beta_i$ as parameters, i.e. simply constants (correct me if I am wrong), and $x_1$ as an observable variable. My concern is: how can you justify that in theory?

My rationale is that, given the probability space we are running the regression on, the expected value of $Y$ should be a fixed number rather than depend on some other observable variable. Moreover, it seems more natural to think of $x_1$ as a realization of the random variable $X_1$ and to write the model specification as $$\mathbb{E}[Y\mid X_1=x_1]=\beta_0+\beta_1x_1.$$
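Indeed, by the law of iterated expectations (assuming $\mathbb{E}[X_1]$ exists), the unconditional mean under the conditional specification works out to a constant, $$\mathbb{E}[Y]=\mathbb{E}\big[\mathbb{E}[Y\mid X_1]\big]=\beta_0+\beta_1\,\mathbb{E}[X_1],$$ which is a fixed number, consistent with this reading.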

Help is certainly appreciated! Any suggestions for texts or articles I should look at would also be welcome.

Kun

1 Answer


You're correct that your second notation is more reasonable. In the first case, we're usually just being concise (or lazy) and not notating the explicit dependence of $Y$ on $X$. Alternatively, you might write something like $$Y = \beta_0 + \beta_1 X_1 + \varepsilon,$$ where $\varepsilon$ is some zero-mean random variable representing noise.
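As an illustration (a minimal sketch, not part of the original answer), the following Python simulation assumes normally distributed $X_1$ and noise and arbitrary example values for $\beta_0,\beta_1$. It shows that the $\beta_i$ are fixed constants that the fit recovers, while $\mathbb{E}[Y\mid X_1=x_1]=\beta_0+\beta_1 x_1$ varies with the realized $x_1$:

```python
import numpy as np

# Hypothetical example parameters (constants), not taken from the answer
beta0, beta1 = 2.0, 0.5
n = 100_000

rng = np.random.default_rng(0)
x1 = rng.normal(size=n)           # X_1 treated as a random variable
eps = rng.normal(size=n)          # zero-mean noise epsilon
y = beta0 + beta1 * x1 + eps      # Y = beta_0 + beta_1 * X_1 + epsilon

# A least-squares fit recovers the fixed parameters (up to sampling error),
# while E[Y | X_1 = x_1] = beta_0 + beta_1 * x_1 changes with each realization x_1.
b1_hat, b0_hat = np.polyfit(x1, y, 1)
print(f"beta0_hat = {b0_hat:.3f}, beta1_hat = {b1_hat:.3f}")
```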

Danica
  • On a side note, the second notation is very explicit in, for example, economics. I think the first is more common in STEM areas, where it is more natural to think of $X$ as fixed in repeated samples – Repmat Mar 11 '16 at 07:59
  • @Repmat Hello! Can you elaborate a bit on your comment about the notation in economics, please? In fact, I am reading both an econometrics textbook and a stats textbook (a standard intro to mathematical statistics). – Kun Mar 11 '16 at 16:03