I may be missing something about how a basic t-test works, but I was under the impression that if I run a t-test and one of my samples consists of only one observation, the test should fail.
To check whether this is indeed the case, I ran some code in Python. I made up some numbers, where my first sample has 15 values and the second sample has only 1:
import numpy as np
import statsmodels.api as sm
# First sample: 15 made-up values; second sample: a single observation
sample1 = np.array([821,823,814,815,816,817,881,891,234,354,678,765,989,435,657])
sample2 = np.array([21])
print("This is the result of the T-test:")
ttest = sm.stats.ttest_ind(sample1, sample2, alternative='larger')  # one-sided test
print("pvalue:", ttest[1], ", at alpha = 0.05, thus the result is:", ttest[1] <= 0.05)
Here's the result:
This is the result of the T-test:
pvalue: 2.72522964545e-08 , at alpha = 0.05, thus the result is: True
I am getting an answer saying that there is a significant difference between the two sample means. Is this simply a bogus answer?
I was under the impression that a single observation can't be assumed to come from a normal distribution, so this assumption of the t-test cannot be verified for the second sample. If so, the test should not be appropriate here.
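To make the question concrete, here is a minimal sketch of the textbook pooled-variance (equal-variance) two-sample t statistic, written with the standard library only. The function name `pooled_t` is hypothetical, and I am assuming, not asserting, that this is the form the library call uses by default; I have not confirmed that it reproduces the p-value shown above. What it shows is that the arithmetic does not break when one sample has a single observation: that sample contributes zero to the sum of squares, so the pooled variance comes entirely from the larger sample and the degrees of freedom are n1 + n2 - 2.

```python
import math

def pooled_t(sample1, sample2):
    """Textbook pooled-variance t statistic (hypothetical helper, for illustration)."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations; for a one-observation sample this is exactly 0
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    # Standard error of the difference in means under the equal-variance assumption
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df, pooled_var

sample1 = [821, 823, 814, 815, 816, 817, 881, 891, 234, 354, 678, 765, 989, 435, 657]
sample2 = [21]

t_stat, df, pooled_var = pooled_t(sample1, sample2)
print("t statistic:", t_stat)
print("degrees of freedom:", df)
print("pooled variance:", pooled_var)
```

So the computation goes through only because the single observation is assumed to share the variance of the larger sample, which is the very assumption I cannot check from one data point.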