For a T-test, what would happen if one of my samples was made up of only one observation?

Question

I must be missing something, or under-thinking what goes on with a basic T Test, but I was under the impression that if I do a T-test, and one of my samples was made up of only one observation, the test would fail.

To see if this is indeed the case, I ran some code in Python. I made up some numbers, where my first sample has 15 values and the second sample has only 1:

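import numpy as np
import statsmodels.api as sm

# First sample: 15 observations; second sample: a single observation
sample_1 = np.array([821,823,814,815,816,817,881,891,234,354,678,765,989,435,657])
sample_2 = np.array([21])

print("This is the result of the T-test:")
# ttest_ind returns (t statistic, p-value, degrees of freedom); usevar defaults to "pooled"
ttest = sm.stats.ttest_ind(sample_1, sample_2, alternative='larger')
print("pvalue:", ttest[1], ", at alpha = 0.05, thus the result is:", ttest[1] <= 0.05)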

Here's the result:

This is the result of the T-test:
pvalue: 2.72522964545e-08 , at alpha = 0.05, thus the result is: True

I am getting an answer saying that there is a significant difference between the two sample means. Is this simply a bogus answer?

I was under the impression that one observation can't be assumed to belong to a normal distribution, so this assumption for a T-test itself can not be verified. Thus the test should not be appropriate.

You can assume, sometimes with good reason, that both samples are normal variates with a common variance; but you can't check this assumption with these samples. — Scortchi - Reinstate Monica, May 18 '14 at 01:54
So, even if this assumption is made, is there any purpose to using such a T Test when a sample has only one observation anyway? I feel like it doesn't tell us anything. — tumultous_rooster, May 18 '14 at 01:58
It tells you exactly the same thing as when the sample has many observations: the probability under the null hypothesis of equal means that the t-statistic would exceed that observed. Of course its power will be lower. — Scortchi - Reinstate Monica, May 18 '14 at 02:57

Alecos Papadopoulos · Accepted Answer · 2014-05-18T02:53:05.657

Every realization comes from some distribution; there is nothing special about the case of one observation. It may very well come from a normal distribution.

The possible issue with a single observation is how to calculate the sample variance. The sample variance is zero if calculated without the bias-correction factor, and the indeterminate form $0/0$ if calculated with it. The t-statistic for the equality of means between two samples of unequal sizes and different sample variances ("Welch's t-test") is

$$t = {\overline{X}_1 - \overline{X}_2 \over s_{\overline{X}_1 - \overline{X}_2}},\qquad s_{\overline{X}_1 - \overline{X}_2} = \sqrt{{s_1^2 \over n_1} + {s_2^2 \over n_2}}$$

$s_1^2, s_2^2$ should be the sample variances computed with the bias-correction factor, so a sample of size one should cause a problem. The calculation of the appropriate degrees of freedom also involves the quantity "sample size minus one". So normally the code should not return a result, but you have to find out exactly which "t-test" the software runs; there are many. In some of them the t-statistic can be calculated even when one of the samples is of size one.
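To make the difference concrete, here is a minimal sketch in plain Python (standard library only) using the numbers from the question. The function names `pooled_t` and `welch_t` are made up for illustration and are not any library's API. It assumes the textbook pooled-variance (Student) statistic, which divides the combined sum of squared deviations by $n_1 + n_2 - 2$, and the Welch statistic from the formula above. With $n_2 = 1$ the pooled version is still defined (the single observation contributes zero squared deviation and zero degrees of freedom), while the Welch version hits the $0/0$ variance.

```python
import math

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def unbiased_variance(xs):
    """Sample variance with the bias correction (divide by n - 1).
    For a single observation this is 0/0, returned here as NaN."""
    n = len(xs)
    if n < 2:
        return math.nan
    return sum_sq_dev(xs) / (n - 1)

def pooled_t(x1, x2):
    """Pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    if df < 1:
        raise ValueError("need at least three observations in total")
    pooled_var = (sum_sq_dev(x1) + sum_sq_dev(x2)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (sum(x1) / n1 - sum(x2) / n2) / se, df

def welch_t(x1, x2):
    """Welch t statistic; NaN if either sample has fewer than two values."""
    n1, n2 = len(x1), len(x2)
    se = math.sqrt(unbiased_variance(x1) / n1 + unbiased_variance(x2) / n2)
    return (sum(x1) / n1 - sum(x2) / n2) / se

sample_a = [821, 823, 814, 815, 816, 817, 881, 891,
            234, 354, 678, 765, 989, 435, 657]
sample_b = [21]

t_pooled, df_pooled = pooled_t(sample_a, sample_b)
print("pooled t:", t_pooled, "df:", df_pooled)
print("Welch t:", welch_t(sample_a, sample_b))
```

So a pooled test can return a number here only because it borrows the variance estimate entirely from the larger sample, under the assumption of a common variance. Whether a given software routine behaves like either function is something to check in its documentation.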

Aha! I believe I was using the wrong t-test. I have included usevar="unequal" and now have a Welch's t-test on my hands. Thanks! — tumultous_rooster, May 18 '14 at 02:42
And does the test result, even though theoretically it should not? — Alecos Papadopoulos, May 18 '14 at 02:52
Once I included the usevar="unequal" option, which gave me a Welch's t-test, the result was in line with what we theoretically expected. — tumultous_rooster, May 18 '14 at 03:09
