
Suppose we run a two-sample t-test where the two groups have unequal variances and different sample sizes. Let $X_1, \ldots, X_{N_1}$ be the first sample and $Y_1, \ldots, Y_{N_2}$ be the second. Welch's two-sample t-test then gives the test statistic:

$$ T = \dfrac{\bar{X}- \bar{Y}}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}} $$

where $\bar{X}$ and $\bar{Y}$ are the sample means and $s_1^2$ and $s_2^2$ are the sample variances.
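To make the formula concrete, here is a minimal sketch of how I would compute the statistic from data. The function name `welch_t` and the two example samples are made up for illustration; only the standard library is used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's two-sample t statistic for samples x and y (hypothetical helper)."""
    n1, n2 = len(x), len(y)
    s1_sq = variance(x)  # sample variance s_1^2 (divisor N_1 - 1)
    s2_sq = variance(y)  # sample variance s_2^2 (divisor N_2 - 1)
    return (mean(x) - mean(y)) / sqrt(s1_sq / n1 + s2_sq / n2)


# Illustrative data only: unequal sizes, visibly different spread.
x = [5.1, 4.9, 6.2, 5.8, 5.5]
y = [4.2, 4.8, 4.1, 5.0, 4.4, 4.6, 4.3]
t_obs = welch_t(x, y)
print(t_obs)
```

Here `t_obs` is a single number obtained by plugging the observed data into the formula.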

By definition, the p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one actually observed. Hence, for a one-sided test,

$$ p = \mathbb{P}\left(T>t|H_0 \ \text{true}\right) $$

My question is: is $t$ the value of the test statistic computed from the formula

$$ \dfrac{\bar{X}- \bar{Y}}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}} $$

using the real data? And if so, must we then know the distributional form of $T$ in order to evaluate this probability?
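For context, my understanding (which may be incomplete) is that the usual approach treats $T$ under $H_0$ as approximately Student-$t$ with Welch-Satterthwaite degrees of freedom $\nu$. Writing $t_{\text{obs}}$ for the value computed from the data and $F_\nu$ for the CDF of the $t_\nu$ distribution (both symbols are my own notation), this would give:

$$ p \approx 1 - F_\nu\left(t_{\text{obs}}\right), \qquad \nu \approx \dfrac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{\left(s_1^2/N_1\right)^2}{N_1 - 1} + \dfrac{\left(s_2^2/N_2\right)^2}{N_2 - 1}} $$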

user321627