3

Question: What is an appropriate extension of Tukey's method to proportions?

Normal situation

In one-way ANOVA problems

$$ y_{ij} = \mu_i + \epsilon_{ij}, \quad i=1,\ldots,m $$

Tukey's studentized range statistic is often applied to pairwise differences $\mu_l - \mu_k$ is applied to correct for the multiple testing of $m(m-1)/2$ differences. Scheffe's book (the one I have easy access to) formulates this as follows (p. 73, search "T-method" in Amazon look inside): if the OLS estimates $\hat\mu_i$ are normally distributed and independent of each other, $$ \hat\mu_i \sim N(\mu_i,a^2\sigma^2); $$ where $a^2$ is known, $s^2$ is an independent estimate of $\sigma^2$ with $\nu$ degrees of freedom, and $q_{\alpha,m,\nu}$ is the studentized range statistic for $m$ comparisons with $\nu$ degrees of freedom, then the probability is $1-\alpha$ that all $m(m-1)/2$ differences simultaneously satisfy $$ \hat\mu_k - \hat\mu_l - Ts \le \mu_k - \mu_l \le \hat\mu_k - \hat\mu_l + Ts; \quad T = a q_{\alpha,m,\nu} $$

Scheffe argues that for the purposes of pairwise comparisons, the Tukey's method works best (as compared to Bonferroni that works for everything, and Scheffe's method that works for arbitrary contrasts). I am not particularly surprised: Tukey's method uses the specific information while other methods sacrifice power for generality.

Anyway.

I need to work with proportions, so instead of $\hat\mu_i$ that are homoskedastic in Scheffe's formulation, I have estimates of proportions $\hat p_i$ that are hopelessly heteroskedastic. (I also have a complex survey design, and the group sizes are different, so I don't see any particularly good way to stabilize the variance with arcsine or logit transformations.)

Literature review

(1) There's a related question on CV, however I am not fully comfortable using the proposed procedure without a solid reference. It was presented as a kludge rather than a methodology (i.e., I'd like to see a reference to JASA/Biometrika paper or a textbook), although I am sure it is reasonably correct. The plugs that it makes are:

  • car::ANOVA which works with asymptotically normal coefficients out of glm (which in my case would be svyglm)
  • lsmeans which is a port of a SAS LSMEANS statement in PROC LOGISTIC / PROC SVYLOGISTIC.

(2) Alternatively, the NIST website offers a description of the Marascuilo procedure -- again unfortunately without a reference. The procedure is based on referring the absolute values $\hat p_k - \hat p_l$ to

$$ r_{kl} = \sqrt{\chi^2_{1-\alpha;m-1}}\sqrt{\frac{\hat p_k(1-\hat p_k)}{n_k}+\frac{\hat p_l(1-\hat p_l)}{n_l}} $$

StasK
  • 31,547
  • 2
  • 92
  • 179

1 Answers1

3
  1. The reference for the Marascuilo procedure appears to be "Nonparametric post hoc comparisons for trend", Psychological Bulletin, 1967;67:401-412. It's not really a version of the Tukey procedure; the $\chi^2_{m-1}$ comes from comparing the variance of the $\hat p_i$ around their mean to the expected value under the strong null of all $p_i=p$.

  2. For large $m$ (and large enough $n_k\, n_l$ to approximate $t$ by Normal) critical values in the right tail will scale as $\sqrt{m}$, whereas the studentised range critical values will scale as $\log m$, so I'd expect the Marascuilo procedure to to be quite conservative in that setting.

  3. If this were actually binomial data I'd recommend simulations to evaluate the distribution of $$\max_{p\in[0,1]} \max_{k,l} |\hat p_k-\hat p_l|$$ for constant $n_i$, all $p_i=p$, or $$\max_{p\in[0,1]} \max_{k,l} \left|\frac{\hat p_k-\hat p_l}{\sqrt{1/n_k+1/n_l}}\right|$$ for varying $n_i$. (That is, not dividing by the estimated standard error, because the outer maximum is then not well-defined)

If you can't simulate under the null (and it sounds like you can't with complex sampling), I don't know what to recommend -- the useful simplification with binomial data is they are homoscedastic under the null hypothesis.