What is the difference between independence.test in R and Cochrane and Armitage trend test?

Question

What is the difference between independence.test in R and CATT (Cochrane and Armitage) tests?

How these tests are calculated? Where do we and how do we define x=0.0 0.5 1.0 (genetic studies) for both of the tests?

What is $x$ precisely? Let me try two educated guesses: (a) You're working with genotype data and a binary outcome (e.g., you have a two-way table of observed genotypes, given as the frequency of one of the two possible alleles on the DNA strand, by disease or exposure status or whatever might be coded as 0/1); (b) independence.test refers to sth like independence_test in the coin package. Could you confirm and update your question accordingly? — chl, Mar 25 '11 at 19:41
@chl, my question is this, in books, when they explain Cochrane and Armitage test for independency (for trend) they say, for recessive model x=0.0 for codominant (additive)=1.0 and for dominant x=0.5... It is variable in the input of the independence.test function. Why do we assume these values? How does it affect the outcome ? Which value to assume if I don't know yet what kind of interaction there is between alleles. Yes it was a binomial phenotype. — suprvisr, Mar 28 '11 at 12:56
Ah, I think I understand now. So, x={0,1/2,1} would be the numerical scores corresponding to {AA,AB,BB} (with B=minor allele) for an additive model. I will provide an example later. — chl, Mar 28 '11 at 13:19

chl · Answer 1 · 2011-05-04T16:55:49.133

As a follow-up to my comment, if independence.test refers to coin::independence_test, then you can reproduce a Cochrane and Armitage trend test, as it is used in GWAS analysis, as follows:

> library(SNPassoc)
> library(coin)
> data(SNPs)
> datSNP <- setupSNP(SNPs,6:40,sep="")
> ( tab <- xtabs(~ casco + snp10001, data=datSNP) )
     snp10001
casco T/T C/T C/C
    0  24  21   2
    1  68  32  10
> independence_test(casco~snp10001, data=datSNP, teststat="quad",
                    scores=list(snp10001=c(0,1,2)))

Asymptotic General Independence Test

data:  casco by snp10001 (T/T < C/T < C/C) 
chi-squared = 0.2846, df = 1, p-value = 0.5937

This is a conditional version of the CATT. About scoring of the ordinal variable (here, the frequency of the minor allele denoted by the letter C), you can play with the scores= arguments of independence_test() in order to reflect the model you want to test (the above result is for a log-additive model).

There are five different genetic models that are generally considered in GWAS, and they reflect how genotypes might be collapsed: codominant (T/T (92) C/T (53) C/C (12), yielding the usual $\chi^2(2)$ association test), dominant (T/T (92) vs. C/T-C/C (65)), recessive (T/T-C/T (145) vs. C/C (12)), overdominant (T/T-C/C (104) vs. C/T (53)) and log-additive (0 (92) < 1 (53) < 2 (12)). Note that genotype recoding is readily available in inheritance functions from the SNPassoc package. The "scores" should reflect these collapsing schemes.

Following Agresti (CDA, 2002, p. 182), CATT is computed as $n\cdot r^2$, where $r$ stands for the linear correlation between the numerical scores and the binary outcome (case/control), that is

z.catt <- sum(tab)*cor(datSNP$casco, as.numeric(datSNP$snp10001))^2
1 - pchisq(z.catt, df = 1)  # p=0.5925

There also exist various built-in CATT functions in R/Bioconductor ecosystem for GWAS, e.g.

CATT() from Rassoc, e.g.

with(datSNP, CATT(table(casco, snp10001), 0.5)) # p=0.5925

(additive/multiplicative)

in snpMatrix, there are headed as 1-df $\chi^2$-test when you call single.snp.tests() (see the vignette); please note that the default mode of inheritance is the codominant/additive effect.

Finally, here are two references that discuss the choice of scoring scheme depending on the genetic model under consideration, and some issues with power/robustness

Zheng, G, Freidlin, B, Li, Z and Gastwirth, JL (2003). Choice of scores in trend tests for case-control studies of candidate-gene associations. Biometrical Journal, 45: 335-348.
Freidlin, B, Zheng, G, Li, Z, and Gastwirth, JL (2002). Trend Tests for Case-Control Studies of Genetic Markers: Power, Sample Size and Robustness. Human Heredity, 53: 146-152.

See also the GeneticsDesign (bioc) package for power calculation with linear trend tests.

What is the difference between independence.test in R and Cochrane and Armitage trend test?

1 Answers1

Linked