When to correct for multiple comparisons (with specific reference to emmeans in R)?

Question

I notice that emmeans::emmeans() corrects for multiple comparisons only within groups, not between groups. This means that if you run a series of contrasts in which each group contributes a single comparison, no p-value or CI adjustment is applied, even though several groups are tested.

I assume the authors have valid reasoning for this. So my question is:

Is a family of comparisons requiring p/CI adjustment only those performed within a group, or is a family of comparisons all comparisons regardless of group?

For a tangible example of this, consider the following data set:

    library(tibble)    # tibble()
    library(magrittr)  # %>%
    library(emmeans)   # emmeans(), contrast()

    dat = 
    tibble(
          id = factor(
            c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
              19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 1, 2, 3, 4, 5,
              6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
              23, 24, 25, 26, 27, 28, 29, 30)),
       group = factor(
         c("A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B",
           "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C",
           "C", "C", "C", "C", "C", "C", "A", "A", "A", "A", "A", "A", "A",
           "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B",
           "C", "C", "C", "C", "C", "C", "C", "C", "C", "C")),
        time = factor(
          c("t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1",
            "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1",
            "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t1", "t2",
            "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2",
            "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2",
            "t2", "t2", "t2", "t2", "t2", "t2", "t2", "t2")),
          dv = c(112.3351351, 106.2767115, 85.97498519, 94.56917246,
                 102.4029377, 96.90074365, 106.6283194, 105.0811233,
                 81.82880209, 99.18720794, 123.9631567, 103.8324887, 80.28047265,
                 76.7988305, 109.7733382, 102.7802469, 114.3847556, 105.1958354,
                 101.4281409, 94.03792896, 114.4768239, 118.2030177, 114.018257,
                 90.48844963, 122.9059885, 119.6559235, 109.6761788, 123.3134245,
                 115.1970167, 98.73363312, 115.9047459, 93.03497563, 89.89520236,
                 67.40679933, 96.61396618, 109.0766327, 56.42345318, 80.97383497,
                 81.54527339, 90.61442551, 85.96806927, 91.15030977, 77.01813237,
                 88.70078778, 83.11691388, 84.83115907, 81.90959002, 103.6980138,
                 71.96358206, 73.50106612, 121.4016791, 108.4115863, 109.3652816,
                 98.99960444, 110.8002013, 111.0578472, 111.709104, 107.0648845,
                 109.0496619, 104.9821074)
    )

Subject the data to a mixed ANOVA via afex::aov_ez:

    model = 
      dat %>% 
      afex::aov_ez(
        id = "id", 
        dv = "dv",
        data = ., 
        between = "group", 
        within = "time")

Now compute the $t1 - t2$ contrasts within each group:

    emm_int %>% 
      contrast(., method = "pairwise", by = "group")

No p-value adjustment is made, because each group contains only a single comparison. Of course, we can apply a multiple-comparison adjustment across groups if we wish (e.g., Holm) using:

    emm_int %>% 
      contrast(., method = "pairwise", by = "group") %>% 
      rbind() %>% 
      summary(adjust = "holm")

score 4 · Accepted Answer · answered Sep 05 '19 at 19:45

First, there seems to be a missing definition of emm_int. I think it is this:

    model %>% emmeans(~ time * group) -> emm_int

(just after the model = step); so that is what I use later in illustrating the answer.

Adjustments are always made treating each distinct by group as a separate family.

However, in the example you show, note that by has two different roles:

1. Grouping of the means to be contrasted
2. Defining families, as just described

For that second purpose, if you don't want the same families for adjustment as you want for grouping, you need to change the by when summarizing. The results when the by grouping stays the same:

    emm_int %>% 
        contrast(., method = "pairwise", by = "group")

    ## group = A:
    ##  contrast estimate   SE df t.ratio p.value
    ##  t1 - t2     10.97 4.72 27 2.325   0.0278
    ## 
    ## group = B:
    ##  contrast estimate   SE df t.ratio p.value
    ##  t1 - t2     17.06 4.72 27 3.617   0.0012
    ## 
    ## group = C:
    ##  contrast estimate   SE df t.ratio p.value
    ##  t1 - t2      3.38 4.72 27 0.717   0.4795

There are no P-value adjustments because each family has but one comparison. But you can change the by variable in the summary step so that all the comparisons just generated are taken together:

    emm_int %>% 
        contrast("pairwise", by = "group") %>% 
        summary(by = NULL, adjust = "holm")

    ##  contrast group estimate   SE df t.ratio p.value
    ##  t1 - t2  A        10.97 4.72 27 2.325   0.0556
    ##  t1 - t2  B        17.06 4.72 27 3.617   0.0036
    ##  t1 - t2  C         3.38 4.72 27 0.717   0.4795
    ## 
    ## P value adjustment: holm method for 3 tests

Note that rbind() was not needed here.
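
As a cross-check, both sets of p-values can be reproduced without emmeans. The following is a minimal sketch, assuming the rounded unadjusted p-values printed in the by = "group" output and using base R's p.adjust(). The names p_raw, p_by_group and p_one_family are illustrative only and are not part of emmeans.

```r
# Unadjusted p-values for t1 - t2, copied (rounded) from the by = "group" output
p_raw <- c(A = 0.0278, B = 0.0012, C = 0.4795)
group <- names(p_raw)

# One family per group: each family holds a single test, so Holm changes nothing
p_by_group <- ave(p_raw, group, FUN = function(p) p.adjust(p, method = "holm"))

# One family of three tests, as with summary(by = NULL, adjust = "holm")
p_one_family <- p.adjust(p_raw, method = "holm")

print(round(p_by_group, 4))
print(round(p_one_family, 4))
```

With one family per group the values stay at 0.0278, 0.0012 and 0.4795. With a single family of three tests they become 0.0556, 0.0036 and 0.4795, matching the Holm-adjusted column above.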

See vignette("comparisons", "emmeans") (or equivalently, https://cran.r-project.org/web/packages/emmeans/vignettes/comparisons.html) for more details.

Russ Lenth

This clarifies my confusion over the behavior of emmeans multiple comparison adjustment. If I consider all contrasts computed for emm_int as referring to a single hypothesis family I can change the by variable when summarizing. Thanks! – pomodoro Sep 07 '19 at 09:34
It isn't clear to me what the answer to this question is: "Is a family of comparisons requiring p/CI adjustment only those performed within a group, or is a family of comparisons all comparisons regardless of group?". We see here how to do it both ways, but how do you choose which way to define a family? – K Li Jun 19 '20 at 17:41
The family/families by which P values are adjusted is defined via the by variables used in the summary. So with by = NULL, it's all one big family, and otherwise it is smaller families. Also look at the annotations below the results, e.g., "holm method for 3 tests", which I think makes the size of the family pretty clear. If there is no such annotation, there is no family adjustment, either because no adjustment is used or because the family size is 1. – Russ Lenth Jun 19 '20 at 18:18
Note that if we had put by = NULL in the call for emm_int, we would have been asking for all comparisons among the 3x2=6 factor combinations, i.e., a family of 15 comparisons instead of the 3 we show. – Russ Lenth Jun 19 '20 at 18:26
Thanks for the response. I understand that the output annotation indicates how the families were defined in the function. I was asking a more conceptual question (which perhaps was actually different from the OP) of how we should decide to define families. As it's been demonstrated here, we have the option to define it in multiple ways, but how do we choose which one? – K Li Jun 19 '20 at 21:33
I don't think there is much of a consensus. I am sure other postings on CV will have some discussion of this. I am sure there are people with strong opinions, and I am sure they don't all agree. – Russ Lenth Jun 19 '20 at 22:37
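
The family sizes mentioned in the comments (3 versus 15) can be counted directly. This is a small sketch in base R, assuming only the 2 x 3 time-by-group layout of the question. The object names (cells, n_all, n_within) are made up for illustration.

```r
# The six time-by-group cells of the design in the question
cells <- expand.grid(time = c("t1", "t2"), group = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

# Index pairs for every unordered pair of cells (all pairwise comparisons)
idx <- combn(nrow(cells), 2)
n_all <- ncol(idx)  # same as choose(6, 2)

# Pairs whose two cells share a group: the t1 - t2 comparison within each group
same_group <- cells$group[idx[1, ]] == cells$group[idx[2, ]]
n_within <- sum(same_group)

print(c(all_pairs = n_all, within_group = n_within))
```

The count gives 15 pairwise comparisons among all six cells, of which 3 are the within-group t1 - t2 comparisons shown in the answer.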
