
I have a question about using a chi-square test for paired data. I have read that McNemar's test would be an option, but in some software, such as R, it works only for 2x2 tables (maybe I do not know the correct way to do it), and my table is larger than 2x2.

These are counts for individuals before and after applying an insecticide. Some people suggested a paired t test could work, but I am not too sure about it since my data are counts.

The data look like the table below.

\begin{array}{|l|r|r|} &\text{Before}&\text{After}\\ \hline \text{Species A}&30&12\\ \hline \text{Species B}&30&7\\ \hline \text{Species C}&30&6\\ \hline \text{Species D}&30&2\\ \hline \text{Species E}&30&4\\ \hline \end{array}

Nick Cox
user12257

1 Answer


I think you're looking at this the wrong way. You're trying to compare the proportion of insects left after applying insecticide. The 'before' aren't a random sample, but the experimental setup. That is:

\begin{array}{l|c|c|c} & &\text{count left }&\\ &n \text{ exposed} &\text{after insecticide}&\text{proportion left}\\ \hline \text{Species A}&30&12&12/30\\ \hline \text{Species B}&30&7&7/30\\ \hline \text{Species C}&30&6&6/30\\ \hline \text{Species D}&30&2&2/30\\ \hline \text{Species E}&30&4&4/30\\ \hline \end{array}

This is in effect a straight chi-square test, or you could use a binomial GLM.

To present as a chi-squared test you'd write two columns, the number remaining and the number dead (or missing or gone or whatever it is that happened), for each species and do a test of independence in the two-way table, which serves as a test of equality of proportion.

Edit - Like so:

\begin{array}{l|r|r|r} &\text{Survived}&\text{Died}&n \text{ exposed}\\ \hline \text{Species A}&12&18&30\\ \hline \text{Species B}&7&23&30\\ \hline \text{Species C}&6&24&30\\ \hline \text{Species D}&2&28&30\\ \hline \text{Species E}&4&26&30\\ \hline \text{Total}&31&119&150 \end{array}

Edit2: Here's a chi-squared test done in R; as you see it agrees with the values in Nick Cox's comment.

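 # Survivors out of 30 exposed, for species A to E
 alive <- c(12, 7, 6, 2, 4)
 dead <- 30 - alive
 # Test of independence on the 5 x 2 table (species by alive/dead)
 chisq.test(cbind(alive, dead))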

        Pearson's Chi-squared test

data:  cbind(alive, dead) 
X-squared = 11.5478, df = 4, p-value = 0.02105
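The binomial GLM mentioned earlier as an alternative could be sketched along the following lines. This is only an illustration under assumptions: the variable names (`species`, `fit_full`, `fit_null`, `lrt`) are made up here, and the comparison uses a likelihood-ratio (deviance) test, which is a different statistic from Pearson's chi-square, so its value will be close to but not identical to the one above.

```r
# Counts from the question: survivors out of 30 exposed per species.
species <- factor(c("A", "B", "C", "D", "E"))
alive   <- c(12, 7, 6, 2, 4)
dead    <- 30 - alive

# Binomial GLM with a separate survival probability for each species.
fit_full <- glm(cbind(alive, dead) ~ species, family = binomial)

# Null model: one common survival probability for all species.
fit_null <- glm(cbind(alive, dead) ~ 1, family = binomial)

# Likelihood-ratio (deviance) test of equal survival across species, 4 d.f.
lrt <- anova(fit_null, fit_full, test = "Chisq")
print(lrt)
```

One reason to prefer the GLM form is that it extends naturally if covariates (dose, replicate, and so on) are added later.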

Edit 3: answering followup questions from comments:

"I would like to know if there is a test which allows me to make post-hoc comparisons between the species"

The issues are much the same as with ANOVA:

(i) If you have orthogonal contrasts: You can partition the chi-square into the orthogonal contrasts to test those. These contrasts are usually obvious a priori, and specified in advance.

(ii) If you want all pairwise comparisons (I assume you meant this option): You can do a series of 2-species comparisons with, if you wish, the typical sorts of adjustments for multiple testing (Bonferroni is trivial to do, for example, but conservative; you might use Keppel's modification of Bonferroni or a number of other options). You could alternatively look at multiple comparisons via simultaneous confidence intervals (Agresti et al., 2008, "Simultaneous confidence intervals for comparing binomial parameters", Biometrics 64: 1270-1275).

Note that for some 2x2 comparisons, the expected counts are low; e.g. for D vs E the expected counts in two cells are only 3. This is not as big a problem as it's made out to be (a variety of less conservative rules from the last 4 or 5 decades would say it's fine), but you can always either simulate the discrete distribution of the test statistic, or do an exact calculation of the p-value by complete enumeration of the tail. Personally, for those expecteds I wouldn't bother; they're absolutely fine.
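As a rough sketch of option (ii) and the note on low expected counts, base R's `pairwise.prop.test()` runs all two-species comparisons with a chosen adjustment. The object names below are made up for illustration. Two assumptions to flag: `prop.test()` applies a continuity correction by default, and Fisher's exact test is swapped in here as a readily available exact test, which is not necessarily the enumeration of the chi-square statistic's tail described above.

```r
species <- c("A", "B", "C", "D", "E")
alive <- c(12, 7, 6, 2, 4)
names(alive) <- species
n <- rep(30, 5)

# All pairwise two-proportion tests, unadjusted and Bonferroni-adjusted.
# prop.test() warns about small expected counts; suppressed here for brevity.
raw <- suppressWarnings(pairwise.prop.test(alive, n, p.adjust.method = "none"))
adj <- suppressWarnings(pairwise.prop.test(alive, n, p.adjust.method = "bonferroni"))
print(round(adj$p.value, 3))

# D vs E as a 2x2 table: inspect the expected counts.
d_vs_e <- matrix(c(2, 4, 28, 26), ncol = 2,
                 dimnames = list(c("D", "E"), c("alive", "dead")))
expected <- suppressWarnings(chisq.test(d_vs_e))$expected
print(expected)

# Exact alternative (Fisher's exact test, an assumption of this sketch).
exact_p <- fisher.test(d_vs_e)$p.value

# Simulated null distribution of the chi-square statistic.
set.seed(1)
sim_p <- chisq.test(d_vs_e, simulate.p.value = TRUE, B = 10000)$p.value
```

Other adjustments (for example `"holm"`) can be passed through `p.adjust.method` in the same way.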

(iii) If you're more interested in "which groups stand out" ('what made this significant?'), the usual approach would be to look at some form of standardized residual (such as a Pearson residual) or a contribution to chi-square. An alternative would be to collapse the tables to do 2x2 comparisons of each one against all the others.
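For option (iii), the object returned by `chisq.test()` already carries the pieces needed. A minimal sketch, with illustrative variable names (`tab`, `pearson_res`, `std_res`, `contrib`) that are not from the original answer:

```r
alive <- c(12, 7, 6, 2, 4)
tab <- cbind(alive = alive, dead = 30 - alive)
rownames(tab) <- paste("Species", c("A", "B", "C", "D", "E"))

test <- chisq.test(tab)

# Pearson residuals: (observed - expected) / sqrt(expected)
pearson_res <- test$residuals

# Standardized residuals, which also account for the row and column margins
std_res <- test$stdres

# Each cell's contribution to the chi-square statistic; these sum to X-squared
contrib <- pearson_res^2

print(round(std_res, 2))
print(round(contrib, 2))
```

Cells with large absolute standardized residuals are the ones contributing most to the departure from equal survival.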

Glen_b
  • What are you trying to find out? – Peter Flom Aug 29 '13 at 00:28
  • @PeterFlom Always a good question; I made an assumption, but it might not be the right one. – Glen_b Aug 29 '13 at 00:32
  • If you recast the data as dead and alive, I get chi-square with 4 d.f. as 11.55 (P = 0.021). That is, a null hypothesis that the five species are equally susceptible can be rejected. But this is imputing a guess on what you want here. Certainly the inclusion of after in before requires a recasting of the data. – Nick Cox Aug 29 '13 at 00:32
  • It would be rather shocking if two people couldn't get the same chi-square test answer. For any Stata people watching: I used tabchii from SSC. – Nick Cox Aug 29 '13 at 07:57
  • Oops @Glen_b I posted this question as a reply to you, I meant it as a comment to the OP – Peter Flom Aug 29 '13 at 10:36
  • @PeterFlom Yes, I guess it was intended for the OP; I was simply adding my support to asking the question. – Glen_b Aug 29 '13 at 10:48
  • Hi, as you widely assumed, Glen, the question was whether all species were equally affected by insecticide exposure in terms of survival. Your explanation was quite clear. Nevertheless, I would like to know if there is a test which allows me to make post-hoc comparisons between the species. I have never seen a test like this for chi-square, so I wanted to know if you know of one that could work. – user12257 Aug 29 '13 at 17:15
  • see updated answer – Glen_b Aug 29 '13 at 23:26