
Multiple testing refers to testing more than one hypothesis simultaneously. Typically the tests cannot be combined into a single test, so inference is adjusted to account for the fact that more than one hypothesis is being tested.

Suppose I have multiple viable tests of the same hypothesis, none of which dominates the others. The tests cannot be combined into a single test. To control size, I can account for the fact that I have several reasonable tests by using a procedure for multiple comparisons. That the intersection of the nulls is now a single hypothesis is generally not a problem for such a procedure. (I understand that the power of some procedures can be improved because of this homogeneity.)

Question: Does combining multiple tests of a single hypothesis have a name?
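
To make the setup concrete, here is a minimal sketch (my own illustration, not part of the original question): three standard two-sample tests that all test the null of equal distributions, combined with a Bonferroni-style min-p rule. The data, the choice of tests, and the significance level are assumptions for illustration only.

```python
# Hypothetical illustration: several valid tests of one null hypothesis
# (equal distributions), combined by the Bonferroni rule
#   reject H0  iff  min_j p_j <= alpha / m.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=40)
y = rng.normal(loc=0.3, scale=1.0, size=40)

# Three reasonable tests of the same null, none of which dominates the others.
p_values = [
    stats.ttest_ind(x, y, equal_var=False).pvalue,             # Welch t-test
    stats.mannwhitneyu(x, y, alternative="two-sided").pvalue,  # rank-based test
    stats.ks_2samp(x, y).pvalue,                                # Kolmogorov-Smirnov
]

alpha = 0.05
m = len(p_values)
reject = min(p_values) <= alpha / m  # valid under arbitrary dependence
print(p_values, reject)
```

Because all three statistics are tested against the same null, the Bonferroni rule bounds the size of the combined decision at the nominal level no matter how the tests are correlated.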

Galton
  • (1) How do you combine the results of multiple tests? (2) It's hard to see how multiple-comparisons corrections would be applicable, because they depend on conceptualizing the data as randomly varying, whereas here the data are fixed and the tests vary. There's no randomness to that at all! – whuber Jun 16 '22 at 16:54
  • @whuber thanks for engaging. (1) I am not using one specific method, but to give a specific example, one could use Bonferroni (or Holm) and take the minimum of the p values of the tests and compare it to the level divided by the number of tests. (2) The data don't vary any less than in a regular test. Imagine a situation where we are comparing two samples of different sizes. An available test only works if the sample sizes are identical. We now have a number of ways to drop obs from the larger sample in order to apply the test. Each way of dropping obs is a new test of the same hypothesis. – Galton Jun 16 '22 at 17:46
  • Multiple testing does not refer to testing more than one hypothesis. It refers to having multiple tests of the same null hypothesis. For example, tossing several coins when the null hypothesis is that all of them are fair. (If you have a different null hypothesis for each coin, then the tests are independent and there is no issue at all.) – J. Delaney Jun 16 '22 at 18:11
  • @J.Delaney While the term multiple testing isn't always used very precisely, I would have to disagree with your first two sentences. Romano et al. write that "[m]ultiple testing refers to any instance that involves the simultaneous testing of more than one hypothesis." (here) I believe this reflects what people in statistics generally understand as multiple testing. – Galton Jun 16 '22 at 18:22
  • @J.Delaney That sounds like the opposite of how multiple testing is usually described, which is the testing of various hypotheses, usually based on the same data. But here, if only one specific hypothesis is concerned, it sounds like we are facing a situation very much like when someone is desperately in search of statistical significance and turns to every available test to find it. Clearly that's not the present motivation, but the practice looks the same. – whuber Jun 16 '22 at 18:23
  • Your problem even has its own tag, which I just added. Look at https://stats.stackexchange.com/questions/59360/fishers-method-for-combing-p-values-what-about-the-lower-tail and other posts in that tag. – kjetil b halvorsen Jun 16 '22 at 18:25
  • @kjetilbhalvorsen Thanks for pointing that out. Does "combining p values" correspond to any instance where p values are combined or only to instances where p values of a single hypothesis are combined? Most multiple testing procedures combine p values but don't necessarily test one single hypothesis. – Galton Jun 16 '22 at 18:30
  • Just to follow up, "combining p values" seems to be the closest description of my problem, even though this usually has the implicit assumption of independence and not necessarily the assumption of a single hypothesis. This situation where dependence is allowed and only a single hypothesis is tested appears not to have a name in the "combining p values" literature (see the sketch after these comments). – Galton Jun 16 '22 at 18:53
  • @Galton I agree that the term multiple testing is not always described precisely, and the quote you mentioned is a good example. But in essence what it practically refers to is exactly "multiple testing" of the same null hypothesis, as in the coin toss example I mentioned (which is the scenario in which the Bonferroni correction and the like are applied). – J. Delaney Jun 16 '22 at 18:58
  • Let me rephrase my statement more precisely: when a multiple testing problem is described as "testing multiple null hypotheses", you can always equivalently describe it as testing a single null hypothesis, namely the hypothesis that all of the aforementioned "null hypotheses" are true. – J. Delaney Jun 16 '22 at 19:11
  • @J. That's a useful characterization. But the present problem seems like it's applying different tests to exactly the same hypothesis, or variants of it. None of the standard methods of combining p-values is likely to be appropriate for that. Another way to look at this is that conducting multiple tests and (somehow or other) "combining" them into a single test result is essentially a new test of its own. Until we know more about what this really amounts to, it's only going to keep sounding like an ad hoc method to invent new tests--and it can be assessed (size & power) in the usual way. – whuber Jun 16 '22 at 22:02
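
To make the "combining p values" idea from the comments concrete, here is a minimal sketch, assuming hypothetical p-values from several tests of the same null: Fisher's method, which presumes the p-values are independent, next to the Bonferroni/min-p rule, which remains valid under arbitrary dependence between the tests.

```python
# Hypothetical p-values from several tests of the same null hypothesis.
import numpy as np
from scipy import stats

p_values = np.array([0.021, 0.048, 0.130])

# Fisher's method: assumes the individual tests are independent.
fisher_stat, fisher_p = stats.combine_pvalues(p_values, method="fisher")

# Bonferroni / min-p rule: a combined p-value that is valid
# under arbitrary dependence between the tests.
bonferroni_p = min(1.0, len(p_values) * p_values.min())

print(fisher_p, bonferroni_p)
```

Either number can be read as the p-value of a single combined test of the one null hypothesis, which echoes whuber's point above: the combination is itself a new test whose size and power can be assessed in the usual way.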

0 Answers