My question is quite similar to "Chi-squared testing for say TWO samples from TWO distributions", but I want to go a little further.
So, suppose I have N pairs of samples from two normal distributions, but I don't know which sample in each pair comes from which distribution. I want to assign samples to distributions, or decide that they come from neither. I don't think I can simply follow the method in the link above, because I need more than one test to make these decisions. For each pair of samples I could construct two null hypotheses: that sample 1 comes from distribution 1 and sample 2 from distribution 2, and vice versa. I could then compare the outcomes to see which hypothesis is more easily ruled out, or whether I should conclude that one or both of the samples come from neither distribution.
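To make the naive procedure concrete, here is a minimal sketch of what I have in mind. It assumes things I have not stated above: that each "sample" is a single scalar observation, that both distributions have a known mean and standard deviation, and that the two samples in a pair are independent. The function names and the cutoff `alpha` are made up for illustration, and no multiple-testing correction is applied.

```python
import math

def assignment_p_value(x1, x2, dist_a, dist_b):
    """p-value for H0: x1 ~ dist_a and x2 ~ dist_b (independent normals).

    Each dist is a (mean, standard deviation) tuple, assumed known.
    """
    (mu_a, sd_a), (mu_b, sd_b) = dist_a, dist_b
    stat = ((x1 - mu_a) / sd_a) ** 2 + ((x2 - mu_b) / sd_b) ** 2
    # Under H0 the statistic is chi-squared with 2 degrees of freedom,
    # whose survival function is exactly exp(-stat / 2).
    return math.exp(-stat / 2.0)

def classify_pair(x1, x2, dist1, dist2, alpha=0.05):
    """Naively compare the two possible assignments for one pair."""
    p_direct = assignment_p_value(x1, x2, dist1, dist2)
    p_swapped = assignment_p_value(x1, x2, dist2, dist1)
    if max(p_direct, p_swapped) < alpha:
        label = "neither"   # both assignments rejected at level alpha
    elif p_direct >= p_swapped:
        label = "direct"    # sample 1 -> dist1, sample 2 -> dist2
    else:
        label = "swapped"   # sample 1 -> dist2, sample 2 -> dist1
    return label, p_direct, p_swapped
```

For example, `classify_pair(0.1, 4.9, (0.0, 1.0), (5.0, 1.0))` picks the direct assignment. Picking the larger of two p-values and then thresholding it is exactly the step whose error rates I am unsure about.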
However, this involves multiple hypotheses, and I expect that performing the tests in a naive way like this will mess up their statistical properties. So perhaps there is a smarter thing I should be doing?
In principle I also want to do this for N mixed-up samples from N distributions, so issues arising from multiple testing might get especially severe there?
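A brute-force sketch of the N-sample version, under the same assumptions as before (scalar observations, known means and standard deviations, independence), would try every one-to-one assignment and keep the one with the smallest total chi-squared. Because each distribution is used exactly once, this is also the maximum-likelihood assignment. The name `best_assignment` is hypothetical.

```python
import itertools

def best_assignment(samples, dists):
    """Return (perm, total_chi2), where samples[i] is matched to dists[perm[i]].

    Each dist is a (mean, standard deviation) tuple, assumed known.
    Tries all n! one-to-one assignments, so only usable for small n.
    """
    n = len(samples)
    best_perm, best_stat = None, float("inf")
    for perm in itertools.permutations(range(n)):
        stat = sum(
            ((samples[i] - dists[j][0]) / dists[j][1]) ** 2
            for i, j in enumerate(perm)
        )
        if stat < best_stat:
            best_perm, best_stat = perm, stat
    return best_perm, best_stat
```

The cost grows as n!, so for larger N an assignment solver such as `scipy.optimize.linear_sum_assignment` would presumably be needed. The minimised statistic is also no longer chi-squared with n degrees of freedom, since it is the smallest value over all permutations, which is part of why I doubt the naive p-values.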
It's actually a bit worse than that, too, because I am not really matching samples to distributions; I am matching them to each other. That is, I just have the pairs of samples and want to match them to each other (cluster them, I guess) under the assumption that they come from a common pair of distributions.