Test for effects of two categorical variables on a binary response variable?

Question

I am looking for a test similar to a 2-way ANOVA that would work on a binary response variable. My response variable is survival of plant seedlings (alive or dead). My explanatory variables are Treatment (3 treatment groups) and Site (3 sites).

First, I would like to know whether Treatment, Site and their interaction have a significant effect on survival. Second, if either Treatment or Site is significant, I would like to test all pairs of treatment groups or sites to know which pairs of levels are significantly different, as I would normally do with an ANOVA.

I have considered several options:

Transform the response variable, for example through an arcsin transformation, and then perform an ANOVA. This does not work on my data because at one of the sites I measure 100% survival. Therefore there is 0 variability at this site and no transformation will change that.
Logistic regression with Treatment and Site recoded as dummy variables. The results do not seem to give me a test of significance of Treatment, Site and interaction term - Instead, I get the relative importance of each treatment group and each site separately. Furthermore, it seems that I cannot test all the pairs of treatment groups or sites, I can only compare one "baseline" or "default" group to each of the two remaining groups.
Chi-square test on each explanatory variable separately. This has the obvious drawback of not being able to test the interaction term. Also I suspect that I am omitting important information if I am comparing survival across the 3 treatment groups without taking into account that this survival data is grouped in 3 different sites. Does this bias the results?

Can anyone recommend a different test or what the best approach would be in my case?

UPDATE: Logistic regression can in fact give a test of significance of each independent variable. In R, I discovered I can use glm to contruct a model and then the anova function to extract p-values for each IV:

mymodel <- glm(Survival ~ Treatment*Site, data=survivaldata, family="binomial")
anova(mymodel, test="Chisq")

You want logistic regression. What you can get from it and, more, how to get it, will vary depending on what software you are using. — Peter Flom, Oct 29 '12 at 12:20
Thanks for the response. I am using R, and so far have tried the "glm" function. But I thought that it was a more generalized problem with logistic regression, the fact that you have to use dummy variables and therefore cannot get information on the original variable as a whole. And that with k levels of a variable, you are limited to k-1 contrasts. Am I mistaken? — KTWillow, Oct 29 '12 at 12:28
It's not a problem with logistic regression per se, it's a problem with dummy variables. In SAS, you do get a p value for the effect as a whole. You can't get odds ratios on the effect as a whole because they don't make any sense. With a continuous IV, the OR is "per unit" of the IV. With categorical DV with more than 2 categories, what would it be? — Peter Flom, Oct 29 '12 at 12:34
Interesting. I will try to find other functions or software that will give me p values for the effect as a whole. Since I am interested in hypothesis-testing, not prediction, I don't care about the odds ratios, only p-values. Thank-you for the hint! — KTWillow, Oct 29 '12 at 12:41

Test for effects of two categorical variables on a binary response variable?

0 Answers0

Linked