
I'm having problems using betareg. I have a dataset that always shows different results depending on how I perform the analysis. I'm using lrtest to compare the models as follows:

snf.nulo    <- betareg(overlay_of_niche_flowers ~ 1, data = data)
snf.tra     <- betareg(overlay_of_niche_flowers ~ treatment, data = data)
snf.tam     <- betareg(overlay_of_niche_flowers ~ size_of_net, data = data)
snf.traetam <- betareg(overlay_of_niche_flowers ~ treatment + size_of_net, data = data)
snf.traxtam <- betareg(overlay_of_niche_flowers ~ treatment * size_of_net, data = data)

lrtest(snf.nulo, snf.tam, snf.tra, snf.traetam, snf.traxtam)

When I do this, snf.traetam is the only model that comes out significant, but when I compare each model against the null model separately, all of them are significant.

I also tried to simplify the models stepwise:

# testing the best model
lrtest(snf.traxtam, snf.traetam) #no - continue traetam
# best model snf.traetam

# simplifying - treatment

snf.traetamstr <- update(snf.traetam, . ~ . - treatment)
lrtest(snf.traetamstr, snf.traetam) # yes - continue with treatment
summary(snf.traetam)

# simplifying - size
snf.traetamsta <- update(snf.traetam, . ~ . - size_of_net)
lrtest(snf.traetam, snf.traetamsta) # yes - size stays

I don't know if I'm approaching it the right way.

mkt

1 Answer


If you call lrtest() with more than two models, then it computes the tests for all pairs of models. Thus, your

lrtest(snf.nulo, snf.tam, snf.tra, snf.traetam, snf.traxtam)

is equivalent to

lrtest(snf.nulo, snf.tam)
lrtest(snf.tam, snf.tra)
lrtest(snf.tra, snf.traetam)
lrtest(snf.traetam, snf.traxtam)

The log-likelihoods, test statistics, and p-values are exactly the same as in the single call. The difference is only whether everything is arranged in one table or in four tables.
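This equivalence is easy to verify yourself. A minimal sketch, using simulated data and hypothetical variable names (x, z, y), that fits a sequence of nested beta regressions and compares the single multi-model call with the pairwise calls:

```r
library(betareg)  # beta regression models
library(lmtest)   # lrtest()

# simulate a response in (0, 1) that depends on x but not on z
set.seed(1)
d <- data.frame(x = rnorm(100), z = rnorm(100))
mu <- plogis(0.3 + 0.8 * d$x)
d$y <- rbeta(100, mu * 10, (1 - mu) * 10)

m0 <- betareg(y ~ 1, data = d)
m1 <- betareg(y ~ x, data = d)
m2 <- betareg(y ~ x + z, data = d)

lrtest(m0, m1, m2)  # one table with two tests...
lrtest(m0, m1)      # ...with exactly the same statistics as
lrtest(m1, m2)      # these two separate calls
```

The chi-squared statistics and p-values in the rows of the first table match the corresponding pairwise tables entry for entry.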

Given that the likelihood ratio test is only well-defined for nested models, this directly means that the line corresponding to lrtest(snf.tam, snf.tra) is nonsensical: snf.tam and snf.tra each contain a regressor that the other lacks, so neither model is nested in the other.

Also, note that the likelihood ratio test does not perform model selection directly, so it does not pick the "only significant model". It always just compares a restricted model under the null hypothesis with a more flexible (or unrestricted) model under the alternative. You can then start adopting significant effects or dropping non-significant ones. To carry out all the tests you might need for this, you could do:

lrtest(snf.nulo, snf.tam, snf.traetam, snf.traxtam)
lrtest(snf.nulo, snf.tra, snf.traetam, snf.traxtam)

where the last test (significance of the interaction term) is the same in both outputs. For a worked example (based on linear regression models rather than beta regression models) you can see our teaching materials at: https://discdown.org/dataanalytics/analysis-of-variance.html#mla-2-way-analysis-of-variance
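The two sequences above can be reproduced on simulated data. A sketch under assumed names (the factors treatment and size_of_net are generated here, not your data), showing that the interaction test in the last row agrees across both orderings:

```r
library(betareg)  # beta regression models
library(lmtest)   # lrtest()

# simulate two balanced two-level factors and a response in (0, 1)
set.seed(123)
d <- data.frame(treatment = gl(2, 50), size_of_net = gl(2, 25, 100))
mu <- plogis(-0.5 + 0.6 * (d$treatment == "2") + 0.4 * (d$size_of_net == "2"))
d$y <- rbeta(100, mu * 20, (1 - mu) * 20)

m.nulo    <- betareg(y ~ 1, data = d)
m.tra     <- betareg(y ~ treatment, data = d)
m.tam     <- betareg(y ~ size_of_net, data = d)
m.traetam <- betareg(y ~ treatment + size_of_net, data = d)
m.traxtam <- betareg(y ~ treatment * size_of_net, data = d)

lrtest(m.nulo, m.tam, m.traetam, m.traxtam)  # add size first, then treatment
lrtest(m.nulo, m.tra, m.traetam, m.traxtam)  # add treatment first, then size
# the last row (test of the interaction) is identical in both tables
```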

Achim Zeileis