
I'm trying to interpret an early and pretty dense (to me) paper on the theory of linear regression:

Bartlett, M. S. (1934). On the theory of statistical regression. Proceedings of the Royal Society of Edinburgh, 53, 260-283.

The paper, and particularly the second half, is devoted to deriving as much as possible about the regression procedure, while making as few assumptions as possible. In the paper's notation, each of the $p$ predictor variables is denoted $\xi_i$, $i=1, 2, \ldots, p$. At one point, the author states:

> This requires only that $\xi_{2.1}$ exists and is normal (this condition implies the regression of $\xi_2$ on $\xi_1$ is linear, and that the variance of $\xi_2$ is constant). (p. 277)

My specific interest is in the parenthetical statement (as opposed to the thing that requires the condition).

My interpretation of this statement is: **If one wants to establish that the regression of $Y$ on $X$ is both linear and homoscedastic, it is sufficient to prove that the distribution of $Y$ conditional on $X$ is normal.** (I'm assuming the observations of $X$ and $Y$ are iid, and that neither has a pathological population distribution, but I place no constraints on the value of their correlation.) No proof or citation is given, and it is not entirely clear to me whether he's saying this is true generally, or only conditional on previously stated assumptions. However, from the context, I infer he is making a general statement that stands on its own.

My question is: Is the statement, as I interpret it, true? To be clear, I'm not asking what the author meant or what the significance of his statement is. I provide the quote and citation mainly for context. I'm only asking about the validity of my bolded statement. Is it true on its own? If not, what additional assumptions would make it true?

virtuolie
  • Consider the simplest possible non-trivial example where $X$ may be $0$ or $1$ and $Y$ has one Normal distribution when $X=0$ and another Normal distribution when $X=1.$ What would it take for the regression to be homoscedastic? You can concoct another nice example by considering three distinct values of $X$ and considering what it would take for the regression of $Y$ against $X$ to be linear. – whuber Mar 03 '24 at 18:04
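
The examples suggested in the comment can be tried numerically. The following is only a rough sketch: the three values of $X$, the conditional means and standard deviations in `params`, the sample size, and the seed are all arbitrary assumptions chosen for illustration, not anything taken from Bartlett's paper. In it, $Y$ given each value of $X$ is exactly normal, yet the conditional means are not collinear in $X$ and the conditional spread is not constant.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Draw (X, Y) pairs where Y | X = x is Normal(mu_x, sd_x).

    The parameter values below are hypothetical, picked only so that
    the conditional means are not linear in x and the conditional
    standard deviations differ across x.
    """
    rng = random.Random(seed)
    params = {0: (0.0, 1.0), 1: (1.0, 1.0), 2: (5.0, 3.0)}  # x -> (mean, sd)
    groups = {x: [] for x in params}
    for _ in range(n):
        x = rng.choice(list(params))      # X takes three distinct values
        mu, sd = params[x]
        groups[x].append(rng.gauss(mu, sd))  # Y | X = x is exactly normal
    return groups


groups = simulate()
means = {x: statistics.fmean(ys) for x, ys in groups.items()}
sds = {x: statistics.stdev(ys) for x, ys in groups.items()}

# For equally spaced x, E[Y | X = x] is linear in x only if the second
# difference of the conditional means is zero.
second_diff = means[2] - 2 * means[1] + means[0]

print("conditional means:", means)
print("conditional sds:  ", sds)
print("second difference of means:", second_diff)
```

With these assumed parameters the second difference of the conditional means is far from zero and the conditional standard deviation at $X=2$ is much larger than at $X=0$, so conditional normality of $Y$ given $X$ does not by itself force the regression to be linear or homoscedastic in this construction.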
