
Here's what I know:

I have read the chapter (p347ff) in Agresti, 1990, regarding dependent two-way tables, and I believe I understand the basics. My problem is that Agresti's model-based approaches seem to rely on large-sample theory.

I have ratings from 24 students on a 1-5 scale, collected before and after. Even if I collapse 1-2 = Agreement, 3 = Neutrality, 4-5 = Disagreement, the data are still relatively sparse. The relevant question is the strength of evidence that the change in opinions between before and after is not due to random variation in response.

Currently I am using mh_test in the coin package in R. Here are some specific questions:

  1. How can I see what the mh_test is actually doing? When I type print(mh_test) it will not show me the function, even though I can use the function after loading the package.
  2. Does distribution="approximate" use a bootstrap method to obtain a p-value, and is that a way to deal with the sparseness of the data?
  3. Does anyone know of an exact version of the test of marginal homogeneity in this situation, and ideally how to implement such a test in R/S?

Thanks for reading. -DB

  • @chl's thorough answer to a Q last December is relevant here http://stats.stackexchange.com/questions/5171/testing-paired-frequencies-for-independence/5258#5258 – onestop Mar 20 '11 at 21:25

1 Answer


1: mh_test() is an S3 generic function; you can check which methods it has using methods("mh_test"). To show the source of a non-visible method, you can use getAnywhere() or getS3method():

library(coin)                     # for mh_test()
methods("mh_test")                # available methods for mh_test(), all non-visible ...
getS3method("mh_test", "table")   # get appropriate method -> uses SymmetryProblem ...
getS3method("mh_test", "SymmetryProblem")    # get relevant method ...
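The same introspection pattern works for any S3 generic, so you can try it even without coin installed. Here is the pattern with stats::t.test as a stand-in for mh_test() (my choice of example, not from the coin package):

```r
# Introspecting an S3 generic and its (possibly non-visible) methods,
# using the base generic stats::t.test as a stand-in for mh_test()
methods("t.test")                                 # lists available methods

fDefault <- getS3method("t.test", "default")      # fetch one specific method
fAny     <- getAnywhere("t.test.default")         # alternative lookup by name

is.function(fDefault)                             # the method body is now visible
```

Typing the fetched object's name at the prompt then prints the full source of the method, which is what print(mh_test) alone will not show.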

The code probably won't help you very much without reading the theory behind the coin package which is explained in vignette("coin_implementation"). To check what mh_test() does in the asymptotic $\chi^{2}$ case, just compare it with the manual calculation:

> one  <- sample(LETTERS[1:3], 24, replace=TRUE)  # observations condition 1
> two  <- sample(LETTERS[1:3], 24, replace=TRUE)  # observations condition 2
> cTab <- table(one, two)  # cross tabulation
> addmargins(cTab)         # marginal frequencies
     two
one    A  B  C Sum
  A    3  3  0   6
  B    2  1  3   6
  C    5  3  4  12
  Sum 10  7  7  24

> mh_test(cTab)            # test for marginal homogeneity
Asymptotic Marginal-Homogeneity Test
data:  response by groups (one, two) stratified by block
chi-squared = 2.6588, df = 2, p-value = 0.2646

# manual calculation following textbook formulas: S will be the estimated
# covariance matrix for the differences in marginal frequencies
> S         <- -(cTab + t(cTab))
> diag(S)   <- rowSums(cTab) + colSums(cTab) - 2*diag(cTab)     # change diagonal
> keep      <- 1:(nrow(cTab)-1)              # last category is pre-determined
> d         <- rowSums(cTab) - colSums(cTab) # differences in marginal frequencies
> (chisqVal <- t(d[keep]) %*% solve(S[keep, keep]) %*% d[keep]) # test statistic
         [,1]
[1,] 2.658824

> (smmhDf <- nrow(cTab)-1)                   # degrees of freedom
[1] 2

> (pVal <- 1-pchisq(chisqVal, smmhDf))       # p-value from chi-square distribution
          [,1]
[1,] 0.2646329
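The manual steps above can be wrapped into a small self-contained function. This is a sketch following the same textbook formulas; the function name stuart_maxwell is mine, it is not part of coin:

```r
# Stuart-Maxwell test of marginal homogeneity for a square k x k table,
# following the same formulas as the manual calculation above (a sketch)
stuartMaxwell <- function(tab) {
  stopifnot(nrow(tab) == ncol(tab))
  d       <- rowSums(tab) - colSums(tab)      # differences in marginal frequencies
  S       <- -(tab + t(tab))                  # off-diagonal covariance terms
  diag(S) <- rowSums(tab) + colSums(tab) - 2 * diag(tab)
  keep    <- 1:(nrow(tab) - 1)                # last category is redundant
  stat    <- drop(t(d[keep]) %*% solve(S[keep, keep]) %*% d[keep])
  df      <- nrow(tab) - 1
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

# the same cross table as in the transcript above
cTab <- matrix(c(3, 3, 0,
                 2, 1, 3,
                 5, 3, 4), nrow = 3, byrow = TRUE)
res <- stuartMaxwell(cTab)
res$statistic   # 2.658824, matching mh_test(): chi-squared = 2.6588
res$p.value     # 0.2646329
```

Because it only needs base R, it is also easy to check against mh_test() on other tables.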

2: coin implements a permutation-test framework, not a bootstrapping framework. distribution=approximate(B=9999) means that instead of using all possible permutations of the data to generate the null distribution of the test statistic, it only uses a random sample of size B of these permutations. The value of the test statistic will be the same, but the p-value will differ from the $\chi^{2}$ approximation. IMHO, it's a good idea to compare the p-values.
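To make the Monte Carlo idea concrete, here is a sketch of the permutation scheme in base R: under marginal homogeneity each student's (before, after) pair is exchangeable, so we randomly flip pairs and recompute a statistic. This is my own illustration with a deliberately simple statistic (sum of squared marginal differences), not coin's actual internals:

```r
# Monte Carlo permutation sketch of what distribution = approximate(B) does:
# randomly swap each subject's before/after labels and recompute the statistic
set.seed(1)
cTab <- matrix(c(3, 3, 0,
                 2, 1, 3,
                 5, 3, 4), nrow = 3, byrow = TRUE)

idx   <- which(cTab > 0, arr.ind = TRUE)            # occupied cells
pairs <- idx[rep(seq_len(nrow(idx)), cTab[idx]), ]  # one (before, after) row per student

# simplified statistic for illustration: squared marginal differences
stat <- function(p) sum((tabulate(p[, 1], 3) - tabulate(p[, 2], 3))^2)
obs  <- stat(pairs)                                 # observed statistic

B    <- 9999
perm <- replicate(B, {
  swap <- runif(nrow(pairs)) < 0.5                  # flip each pair at random
  stat(cbind(ifelse(swap, pairs[, 2], pairs[, 1]),
             ifelse(swap, pairs[, 1], pairs[, 2])))
})
pMC <- (sum(perm >= obs) + 1) / (B + 1)             # Monte Carlo p-value
```

The "+1" correction keeps the Monte Carlo p-value away from exactly zero, which is standard for sampled permutations.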

3: An exact permutation test might be done using functions in package vegan, but I haven't tried that myself.
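For small samples you can also enumerate all 2^n within-pair swaps directly in base R, which gives an exact permutation p-value. A sketch on toy data of my own (same simplified statistic as above; with n = 24 the 2^24 swaps are still enumerable but slow, so a toy n = 8 is shown):

```r
# Exact permutation test sketch: enumerate all 2^n within-pair swaps
# (toy before/after data; my own illustration, not from vegan or coin)
toy <- cbind(before = c(1, 1, 2, 3, 3, 2, 1, 3),
             after  = c(2, 1, 3, 3, 1, 2, 2, 3))

# simplified statistic for illustration: squared marginal differences
stat <- function(p) sum((tabulate(p[, 1], 3) - tabulate(p[, 2], 3))^2)
obs  <- stat(toy)
n    <- nrow(toy)

allStats <- sapply(0:(2^n - 1), function(m) {
  swap <- bitwAnd(m, 2^(0:(n - 1))) > 0       # bit pattern -> which pairs to swap
  stat(cbind(ifelse(swap, toy[, 2], toy[, 1]),
             ifelse(swap, toy[, 1], toy[, 2])))
})
pExact <- mean(allStats >= obs)               # exact p-value over all 2^n swaps
```

Because every swap pattern is visited exactly once, no Monte Carlo error remains; the cost is that the runtime doubles with each additional pair.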

caracal