I have two datasets that I want to compare. Each dataset contains the weights of 10 different people, measured on 3 different days.
I am interested in measuring the probability that the two samples originate from the same population.
People seem to suggest doing a Kolmogorov-Smirnov test, but I need a measurement.
I was thinking of using the EMD (earth mover's distance) to compare the distributions for each day:
EMD(dataset1.day1,dataset2.day1) + EMD(dataset1.day2,dataset2.day2) + EMD(dataset1.day3,dataset2.day3)
where dataset1.day1 is the histogram of the values for day 1 in dataset 1, and so on.
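To make the summed per-day EMD concrete, here is a minimal sketch in plain Python. It assumes both datasets have the same number of people per day, in which case the 1D EMD between the raw samples reduces to the mean absolute difference of the sorted values; it works on the raw weights rather than on binned histograms. The names `emd_1d` and `summed_emd` and the weights below are made up for illustration only.

```python
def emd_1d(xs, ys):
    """1D earth mover's distance between two samples of equal size.

    For equal-sized samples the optimal transport plan pairs the sorted
    values, so the distance is the mean absolute difference of those pairs.
    """
    if len(xs) != len(ys):
        raise ValueError("this shortcut needs samples of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def summed_emd(dataset1, dataset2):
    """Sum of the per-day EMDs; each dataset is a list of days,
    and each day is a list of weights."""
    return sum(emd_1d(day1, day2) for day1, day2 in zip(dataset1, dataset2))


# Hypothetical weights in kg: 3 days x 4 people (not real data).
dataset1 = [
    [70.0, 82.5, 64.0, 91.0],
    [70.5, 82.0, 64.5, 90.5],
    [71.0, 83.0, 64.0, 91.5],
]
dataset2 = [
    [68.0, 75.0, 88.0, 59.5],
    [68.5, 75.5, 87.0, 60.0],
    [68.0, 76.0, 88.5, 59.0],
]

print(summed_emd(dataset1, dataset2))
```

Note that this sum treats the three days as independent and ignores that the same people are measured each day, which is what the 3D variant would take into account.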
But I could probably treat each person as a 3D data point and compute the EMD in 3D.
Another possibility is the Hausdorff distance, but averaging the distance over all points instead of taking the maximum distance.
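A rough sketch of that averaged Hausdorff idea, next to the classic maximum-based version for comparison. Each person is a 3D point (day 1, day 2, day 3). The function names are hypothetical, and symmetrising by taking the mean of the two directed averages is one convention among several, so treat that choice as an assumption.

```python
import math


def mean_directed_distance(a, b):
    """Average, over the points in a, of the distance to the nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def average_hausdorff(a, b):
    """Symmetric variant: mean of the two directed averages (assumed convention)."""
    return (mean_directed_distance(a, b) + mean_directed_distance(b, a)) / 2


def hausdorff(a, b):
    """Classic Hausdorff distance: the worst-case nearest-neighbour distance."""
    d_ab = max(min(math.dist(p, q) for q in b) for p in a)
    d_ba = max(min(math.dist(p, q) for q in a) for p in b)
    return max(d_ab, d_ba)


# Hypothetical people as (day1, day2, day3) weights in kg (not real data).
sample1 = [(70.0, 70.5, 71.0), (82.5, 82.0, 83.0), (64.0, 64.5, 64.0)]
sample2 = [(68.0, 68.5, 68.0), (75.0, 75.5, 76.0), (88.0, 87.0, 88.5)]

print(average_hausdorff(sample1, sample2))
print(hausdorff(sample1, sample2))
```

The averaged version is less sensitive to a single outlying person than the classic one, since one distant point only contributes its share to the mean instead of determining the whole distance.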
The two datasets have very different skewness, so I was also considering the Mann-Whitney-Wilcoxon test.
What are the main differences between the two techniques?
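Assuming the two techniques meant here are the Kolmogorov-Smirnov and Mann-Whitney-Wilcoxon tests, the sketch below computes only their test statistics (no p-values) in plain Python to show what each one looks at. The function names and the toy samples are made up for illustration.

```python
def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest vertical gap between the empirical CDFs."""
    def ecdf(sample, t):
        return sum(v <= t for v in sample) / len(sample)

    points = sorted(set(xs) | set(ys))
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in points)


def mann_whitney_u(xs, ys):
    """Mann-Whitney U for xs: number of (x, y) pairs with x > y, ties counting 1/2."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Toy samples with the same centre but different spread (not real data).
xs = [4, 5, 6]
ys = [1, 5, 9]

print(ks_statistic(xs, ys))    # nonzero: the shapes differ
print(mann_whitney_u(xs, ys))  # compare with len(xs) * len(ys) / 2
```

In this toy case U equals len(xs) * len(ys) / 2 = 4.5, the value expected when neither sample tends to be larger, while the KS statistic is about 1/3. That reflects the general difference: the rank-based statistic mainly responds to one sample being shifted relative to the other, whereas the KS statistic responds to any difference between the two distributions, including spread or skewness.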
(2) In each sample, the weights are for the same 10 people each day.
But the people in each sample are different, which means we have 20 people in total.
(3) At random. (4) The Hausdorff distance between the 3D points from the first dataset and the 3D points of the second dataset.
– skyde Jul 01 '11 at 19:07