How can I test whether a sample was drawn from a specific discrete distribution?
For example, suppose I have the following distribution:
1- 0.2
2- 0.5
3- 0.3
and I get the following sample: [2,2,2,1,1,3,2,2,1] (the order is not important).
How can I test whether the sample was drawn from this distribution? Or rather, how can I reject the hypothesis that the sample came from this distribution?
Thanks.
EDIT
What do you think about the following Python code?
In the code I create 100000 samples, each of size 9 and drawn from the discrete distribution given above. For example:
Sample1 = [1,2,2,1,3,1,1,1,2]
Sample2 = [2,1,2,2,3,2,1,1,2]
Sample3 = [1,2,2,1,2,1,2,1,1] ...
Now I count how many times each sample occurs
(the order does not count: [1,1,1,1,1,1,2,2,2] == [2,2,1,1,1,1,1,2,1]).
Dividing each count by the number of runs gives an estimated probability for each sample. I then sum all the probabilities that are no higher than the probability of my sample (the original sample in the question). Is this my p-value? (I got the idea from this link: http://en.wikipedia.org/wiki/Multinomial_test)
from collections import Counter
import numpy as np

NumberOfRuns = 100000
observed = (1, 1, 1, 2, 2, 2, 2, 2, 3)  # my sample from the question, sorted

# Create the samples: NumberOfRuns draws of size 9, each sorted so that order does not matter
draws = np.random.choice(3, size=(NumberOfRuns, 9), p=[0.2, 0.5, 0.3]) + 1
z = [tuple(row) for row in np.sort(draws, axis=1).tolist()]
zz = Counter(z)  # Count how many times each distinct sample occurred
observed_count = zz[observed]  # 0 if my sample never showed up in the simulation

# Following the direction in this link http://en.wikipedia.org/wiki/Multinomial_test
# Sum the counts of all samples that are no more common than my sample
Psig = sum(count for count in zz.values() if count <= observed_count)
print('Samples no more common than mine make up', Psig / NumberOfRuns,
      'of all runs. If this is above 5% you cannot reject the hypothesis.')
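
As a cross-check on the simulation, the same quantity can be computed exactly for a sample this small by enumerating every possible vector of counts, as the Wikipedia page on the multinomial test describes. The following is only a minimal sketch under that reading of the test; the function names multinomial_pmf and exact_multinomial_p_value are hypothetical helpers written for this illustration, not part of any library, and it needs Python 3.8+ for math.prod.

```python
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing exactly these category counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


def exact_multinomial_p_value(observed_counts, probs):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    n = sum(observed_counts)
    p_obs = multinomial_pmf(observed_counts, probs)
    p_value = 0.0
    # Brute-force enumeration of count vectors; fine for small n and few categories.
    for counts in product(range(n + 1), repeat=len(probs)):
        if sum(counts) != n:
            continue
        p = multinomial_pmf(counts, probs)
        # Small relative tolerance so floating-point ties count as "equal".
        if p <= p_obs * (1 + 1e-9):
            p_value += p
    return p_value


probs = [0.2, 0.5, 0.3]
observed_counts = [3, 5, 1]  # three 1s, five 2s, one 3 in [2,2,2,1,1,3,2,2,1]
print(exact_multinomial_p_value(observed_counts, probs))
```

With enough runs, the simulated fraction should settle close to the value this prints. The enumeration grows quickly with the sample size and the number of categories, so for larger problems the simulation (or a chi-squared approximation) is the more practical route.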