
I ran an experiment to look at the influence of two categorical input variables on a categorical outcome. The input variables were T and P; the output variable is R. The sample size is 188 data points.

I did a $\chi^2$-test for T and R. Pearson's $\chi^2$ was significant with $\chi^2(4)=122.39$, $p = 1.65 \times 10^{-25}$, and Cramér's V $= 0.57$ indicated a large effect.

ct=xtabs(~ R+ T, data)
chisq.test(ct)
## 
##  Pearson's Chi-squared test
## 
## data:  ct
## X-squared = 122.39, df = 4, p-value < 2.2e-16
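In case it helps with reproducibility, here is a minimal sketch of how the reported Cramér's V could be derived from this statistic, assuming it was computed with the usual definition $V = \sqrt{\chi^2 / (n \cdot (\min(\dim) - 1))}$:

X2 = chisq.test(ct)$statistic
n  = sum(ct)
V  = sqrt(X2 / (n * (min(dim(ct)) - 1)))
V   ### roughly 0.57 for the R x T table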

For P and R the result was $\chi^2(6)=19.531$, $p=0.0034$, Cramér's V $= 0.228$ (medium effect).
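For reference, the analogous call for this second test would presumably be:

ct=xtabs(~ R + P, data)
chisq.test(ct)   ### reported above as X-squared = 19.531, df = 6, p = 0.0034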

Then I did a log-linear analysis.

library(MASS)   ### loglm() comes from MASS

ct=xtabs(~ R + T + P, data)
sm=loglm( ~ P * T * R, ct)

Deleting the three-way interaction T:P:R from the saturated model gave $P(>\Delta(\mathrm{Dev}))=0.993$, indicating that the three-way interaction is not significant ($\chi^2(12)=3.33$).

ni=update(sm, .~. - P:T:R)
anova(sm, ni)

## LR tests for hierarchical log-linear models
## 
## Model 1:
##  . ~ P + T + R + P:T + P:R + T:R 
## Model 2:
##  ~P * T * R 
## 
##             Deviance df Delta(Dev) Delta(df) P(> Delta(Dev)
## Model 1     3.330479 12                                    
## Model 2     0.000000  0   3.330479        12        0.99273
## Saturated   0.000000  0   0.000000         0        1.00000

Deleting T:R from the model without the three-way interaction (retaining P:R) gave $P(>\Delta(\mathrm{Dev}))=0.000$, so the two-way interaction T:R is significant:

m1=update(ni, .~. - T:R)
anova(ni, m1)
## LR tests for hierarchical log-linear models
## 
## Model 1:
##  . ~ P + T + R + P:T + P:R 
## Model 2:
##  . ~ P + T + R + P:T + P:R + T:R 
## 
##             Deviance df Delta(Dev) Delta(df) P(> Delta(Dev)
## Model 1   109.751132 16                                    
## Model 2     3.330479 12 106.420653         4        0.00000
## Saturated   0.000000  0   3.330479        12        0.99273

But deleting P:R from the model without the three-way interaction (retaining T:R) gave $P(>\Delta(\mathrm{Dev}))=0.105$, so P does not contribute significantly to accounting for the outcome R:

m2=update(ni, .~. - P:R)
anova(ni, m2)
## LR tests for hierarchical log-linear models
## 
## Model 1:
##  . ~ P + T + R + P:T + T:R 
## Model 2:
##  . ~ P + T + R + P:T + P:R + T:R 
## 
##            Deviance df Delta(Dev) Delta(df) P(> Delta(Dev)
## Model 1   13.820419 18                                    
## Model 2    3.330479 12  10.489940         6        0.10548
## Saturated  0.000000  0   3.330479        12        0.99273

I do not understand how to interpret this: how can the $\chi^2$-test show a significant, medium-sized effect for the association between P and R, while dropping the P:R term in the hierarchical log-linear analysis shows that P:R is not significant?


1 Answer


The two comparisons are not the same.

  • With the chi-squared test you are testing a two-factor contingency table.

    This is like comparing a model with and without the term P:R when both compared models are without the term T:R.

  • With the log-linear model you are testing a three-factor contingency table. This is like comparing a model with and without the term P:R when both compared models contain the term T:R (see the sketch just below).
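In model-comparison terms, the two tests could be sketched roughly as follows, reusing the xtabs/loglm calls from the question; the formulas here are an illustration of the two comparisons, not your exact fitted models:

library(MASS)

### chi-squared analogue: test P:R with T left out of the model entirely
ct2 = xtabs(~ R + P, data)
anova(loglm(~ R * P, ct2),   ### saturated two-way model
      loglm(~ R + P, ct2))   ### independence model

### log-linear analogue: test P:R while T:R and P:T stay in the model
ct3 = xtabs(~ R + T + P, data)
anova(loglm(~ R + T + P + R:T + T:P,       ct3),   ### without P:R
      loglm(~ R + T + P + R:T + T:P + P:R, ct3))   ### with P:R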

A difference can occur when T already largely explains the outcome R, so that P adds little on top of it. On its own, however, without the effect of T in the model, the effect of P can still be significant.

This situation can occur when P correlates with T but has no more information to add than T already does.

A simpler case to consider is ordinary linear regression, like the example in the image below, taken from this question: why does the same variable have a different slope when incorporated into a linear model with multiple x variables

[example figure from the linked question]

Linear regression when the true model contains no true linear term. Compared with the full model, the linear term is not significant and deviations from zero are due to random noise. However, compared with an empty model, the linear term can make a significant improvement.
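As a hypothetical regression analogue of the T/P/R situation (a different construction from the linked figure), suppose x2 is a noisy copy of x1 and y depends only on x1:

set.seed(1)
x1 = rnorm(100)
x2 = x1 + rnorm(100, sd = 0.3)   ### x2 is a noisy copy of x1
y  = x1 + rnorm(100)             ### y is driven by x1 only

summary(lm(y ~ x2))        ### on its own, x2 looks highly significant
summary(lm(y ~ x1 + x2))   ### next to x1, x2 typically adds nothing significant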


For your type of data you could simulate data that shows this effect as follows:

  • Start with two models: one where T has a causal effect on R, and another where T has a causal effect on P.
  • Simulate some data according to the two models.
  • Perform your two different analyses.

Code example:

library(MASS)

### generate data
set.seed(1)
T = sample(1:3,188, replace = TRUE)
### P is a noisy copy of T: the category matching T gets weight 0.9, the other two 0.1
P = sapply(T, FUN = function(Ti) sample(1:3,1,  
           prob = c(0.1+0.8*(Ti==1),
                    0.1+0.8*(Ti==2),
                    0.1+0.8*(Ti==3))))
### R depends on T more weakly: the category matching T gets weight 0.5, the other two 0.25
R = sapply(T, FUN = function(Ti) sample(1:3,1,  
                                        prob = c(0.25+0.25*(Ti==1),
                                                 0.25+0.25*(Ti==2),
                                                 0.25+0.25*(Ti==3))))
data = cbind(T,P,R)

chi-squared test

ct=xtabs(~ R+T, data)
chisq.test(ct)   ### p-value = 3.649e-05
ct=xtabs(~ R+P, data)
chisq.test(ct)   ### p-value = 0.008044

log-linear models

ct=xtabs(~ R + T + P, data)
sm=MASS::loglm( ~ P * T * R, ct)
ni=update(sm, .~. - P:T:R)
ni1=update(ni, .~. - P:R)
ni2=update(ni, .~. - T:R)

anova(ni1, ni)   ### 0.97461
anova(ni2, ni)   ### 0.01439

Here P is a noisy copy of T. T can predict R and so can P, but P has nothing to add on top of T, because it is just a copy of T with some noise.
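To see how closely P tracks T in this simulation, one could inspect the T-by-P table; most counts should land on the diagonal:

xtabs(~ T + P, data)   ### counts concentrate on the diagonal: P mostly copies T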


To get something equivalent to the chi-squared test, you can use:

ct=xtabs(~ R + T, data)
ct
s1=loglm( ~ R*T, ct)
s2=loglm( ~ R+T, ct)
anova(s1, s2)

ct=xtabs(~ R + P, data)
s1=loglm( ~ R*P, ct)
s2=loglm( ~ R+P, ct)
anova(s1, s2)
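Note that anova() on loglm fits reports likelihood-ratio deviances, while chisq.test() reports the Pearson statistic, so the p-values will be close but not identical. Printing the independence fit shows both statistics side by side:

s2   ### the printed fit lists both the Likelihood Ratio and the Pearson
     ### goodness-of-fit statistics; the Pearson value should match chisq.test(ct)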

The chi-squared test is a goodness-of-fit test that compares against the full (saturated) model. Something similar is discussed here: Linear regression: F-test for lack of fit (using ANOVA to test regression model) - intuition?