
I am revisiting some basic concepts involving t-tests and ANOVAs, and got tripped up early. I wanted to apply the concept of lack-of-fit sum of squares to the single-sample t-test, but I wonder how this can be treated as an ordinary least-squares problem, if at all. In linear least squares there are adjustable fitting parameters and a sum of squared errors that is minimized. One condition that is apparently satisfied as a result of the fit is $$\sum_i \epsilon_i = \sum_i (y_i-\hat{y}_i) = 0,$$ where $\epsilon_i$ is the error associated with the difference between the response variable $y_i$ and the model prediction $\hat{y}_i$. This condition allows some terms to be set to zero when partitioning the SSE into pure-error and regression terms, which are then used to compute the ratio on which a ratio test (a t-test or, more generally, an F-test) is based. At least that's how I understand the connection between these concepts (outlined, for instance, on this Wikipedia page).
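(For a model that includes an intercept term $\beta_0$, this condition follows from the normal equation for $\beta_0$: setting the derivative of the SSE to zero at the minimum gives $$\frac{\partial}{\partial \beta_0}\sum_i (y_i-\beta_0-\beta_1 x_i)^2 = -2\sum_i (y_i-\beta_0-\beta_1 x_i) = 0 \quad\Rightarrow\quad \sum_i \epsilon_i = 0,$$ so the residuals of the fitted model must sum to zero.)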

However, in a single- or two-sample t-test there are no adjustable parameters at all, since we stipulate a rigid model (say, that the population mean or difference of means equals a fixed value). How can one show that the sum of errors equals zero, justifying the partition of the summed squares? This seems essential to showing the connection with least squares, or perhaps it isn't? Maybe a preliminary question is: what, if anything, is being fit in a one-sample t-test?

I realize there are related questions and I am going through some of these, but I am guessing my question differs significantly.

Buck Thorn

1 Answer


The one-sample t-test corresponds in part to fitting an intercept-only linear model. This model does have a parameter: the intercept. See if you can convince yourself that if $X$ is the all-ones column vector, then the familiar ordinary least squares solution $\hat{\beta} = (X^\top X)^{-1} X^\top y$ is equal to the sample mean $\bar{y}$.
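Here is a minimal numerical sketch of that equivalence, assuming numpy and scipy are available; the data and the hypothesized mean `mu0` below are made up for illustration:

```python
import numpy as np
from scipy import stats

# Made-up data; mu0 is an arbitrary hypothesized mean for H0.
rng = np.random.default_rng(0)
y = rng.normal(loc=5.0, scale=2.0, size=30)
mu0 = 4.0

# Intercept-only design matrix: a single column of ones.
X = np.ones((len(y), 1))

# OLS solution beta_hat = (X^T X)^{-1} X^T y, here just the sample mean.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.isclose(beta_hat[0], y.mean()))   # True

# The residuals of this fit sum to (numerically) zero, as in the question.
resid = y - X @ beta_hat
print(np.isclose(resid.sum(), 0.0))        # True

# t statistic for H0: beta_0 = mu0, built from the OLS fit...
n = len(y)
s2 = resid @ resid / (n - 1)   # residual variance, df = n - p with p = 1
se = np.sqrt(s2 / n)           # standard error of the intercept estimate
t_ols = (beta_hat[0] - mu0) / se

# ...agrees with scipy's one-sample t-test.
t_scipy, p_scipy = stats.ttest_1samp(y, mu0)
print(np.isclose(t_ols, t_scipy))          # True
```

Note that the residuals of the intercept-only fit are just $y_i - \bar{y}$, which is why they sum to zero; this is exactly the condition asked about in the question.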

See also Common statistical tests are linear models by Jonas Kristoffer Lindeløv.

jdonland
  • Thank you. I see now how regression, the computation of a mean, and the minimization of variance about a parameter estimate are related, and how they are implemented in the t-test to generate a model estimate $\beta_0$. The link you provide is helpful with that. What I missed, and what still confuses me somewhat, is the deviation between this model parameter and the actual population mean $\mu$. As far as I understand, the likelihood of observing that difference, using the computed variance as an estimate of the actual population variance, is what the test tests. That's OK. The question of what the model is still confuses me. – Buck Thorn Mar 11 '24 at 18:36