I want to perform logistic regression with the following binomial response and with $X_1$ and $X_2$ as my predictors.

I can present the same data as Bernoulli responses in the following format.
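For concreteness, the conversion from grouped counts to one row per trial can be sketched in R as below. This is only a minimal sketch: the data frame `binom.example`, its values, and the helper `expand_row` are hypothetical names and numbers made up for illustration, not the data used in this question. Only the column names (`X1`, `X2`, `Successes`, `Trials`, `Success`) follow the model formulas shown in the outputs.

```r
# Hypothetical grouped data (NOT the data from this question):
# one row per covariate pattern, with successes out of trials.
binom.example <- data.frame(
  X1        = factor(c("Yes", "No", "No")),
  X2        = c(10, 12, 14),
  Successes = c(2, 1, 1),
  Trials    = c(5, 4, 3)
)

# Expand row i into one 0/1 outcome per trial:
# `Successes` ones followed by `Trials - Successes` zeros.
expand_row <- function(i) {
  s <- binom.example$Successes[i]
  n <- binom.example$Trials[i]
  data.frame(
    X1      = binom.example$X1[i],
    X2      = binom.example$X2[i],
    Success = c(rep(1, s), rep(0, n - s))
  )
}
bern.example <- do.call(rbind, lapply(seq_len(nrow(binom.example)), expand_row))

print(nrow(bern.example))           # one row per trial
print(table(bern.example$Success))  # counts of failures (0) and successes (1)
```

The Bernoulli data frame has as many rows as there are trials in total, and its `Success` column sums to the total number of successes in the grouped data.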

The logistic regression outputs for these 2 data sets are mostly the same: the coefficient estimates, standard errors, z-values and p-values are identical. The deviance residuals, the null and residual deviances (and their degrees of freedom), and the AIC are different. (The difference between the null deviance and the residual deviance is the same in both cases: 0.228.)
The following are the regression outputs from R. The data sets are called binom.data and bern.data.
Here is the binomial output.
    Call:
    glm(formula = cbind(Successes, Trials - Successes) ~ X1 + X2,
        family = binomial, data = binom.data)

    Deviance Residuals:
    [1]  0  0  0

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -2.9649    21.6072  -0.137    0.891
    X1Yes        -0.1897     2.5290  -0.075    0.940
    X2            0.3596     1.9094   0.188    0.851

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance:  2.2846e-01  on 2  degrees of freedom
    Residual deviance: -4.9328e-32  on 0  degrees of freedom
    AIC: 11.473

    Number of Fisher Scoring iterations: 4

Here is the Bernoulli output.
    Call:
    glm(formula = Success ~ X1 + X2, family = binomial,
        data = bern.data)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -1.6651  -1.3537   0.7585   0.9281   1.0108

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)  -2.9649    21.6072  -0.137    0.891
    X1Yes        -0.1897     2.5290  -0.075    0.940
    X2            0.3596     1.9094   0.188    0.851

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 15.276  on 11  degrees of freedom
    Residual deviance: 15.048  on  9  degrees of freedom
    AIC: 21.048

    Number of Fisher Scoring iterations: 4

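The comparison between the two fits can be reproduced with a short script along the following lines. It is a sketch under stated assumptions: `binom.example` and `bern.example` are hypothetical data frames with made-up values (not binom.data and bern.data), and the object names `fit.binom`, `fit.bern`, `drop.binom` and `drop.bern` are introduced here only for illustration. The model formulas are the ones from the calls above.

```r
# Hypothetical grouped data (NOT the data from this question).
binom.example <- data.frame(
  X1        = factor(c("Yes", "No", "No", "Yes")),
  X2        = c(10, 12, 14, 16),
  Successes = c(2, 1, 1, 3),
  Trials    = c(5, 4, 3, 6)
)

# Same data with one row per trial: repeat each covariate row `Trials` times,
# then attach `Successes` ones followed by `Trials - Successes` zeros.
idx <- rep(seq_len(nrow(binom.example)), times = binom.example$Trials)
bern.example <- binom.example[idx, c("X1", "X2")]
bern.example$Success <- unlist(Map(
  function(s, n) c(rep(1, s), rep(0, n - s)),
  binom.example$Successes, binom.example$Trials
))

# Fit the same logistic regression to both layouts.
fit.binom <- glm(cbind(Successes, Trials - Successes) ~ X1 + X2,
                 family = binomial, data = binom.example)
fit.bern  <- glm(Success ~ X1 + X2, family = binomial, data = bern.example)

# Point estimates and standard errors, side by side.
print(cbind(binomial = coef(fit.binom), bernoulli = coef(fit.bern)))
print(cbind(binomial  = sqrt(diag(vcov(fit.binom))),
            bernoulli = sqrt(diag(vcov(fit.bern)))))

# Drop in deviance (null minus residual) for each fit.
drop.binom <- fit.binom$null.deviance - fit.binom$deviance
drop.bern  <- fit.bern$null.deviance  - fit.bern$deviance
print(c(binomial = drop.binom, bernoulli = drop.bern))

# AIC for each fit.
print(c(binomial = AIC(fit.binom), bernoulli = AIC(fit.bern)))
```

On this made-up example the pattern is the same as in the outputs above: the coefficients, standard errors and drop in deviance agree up to numerical tolerance, while the AIC values do not.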
My questions:

1. I can see that the point estimates and standard errors from the 2 approaches are identical in this particular case. Is this equivalence true in general?
2. How can the answer to Question #1 be justified mathematically?
3. Why are the deviance residuals and AIC different?
4. Given that the 2 models give different results for deviance residuals and AIC, which one is correct or better?
   a) As I understand it, observations with a deviance residual greater than two in absolute value may indicate lack of fit, so the absolute values of the deviance residuals matter.
   b) Since AIC is used to compare the fit of different models, perhaps there is no "correct" AIC. I would just compare the AICs of 2 binomial models or of 2 Bernoulli models.

– A Scientist Apr 11 '15 at 04:52