
I am using Firth logistic regression to analyze data with a rare event. In my model I have 4 continuous variables and 1 dichotomous variable. This is my code:

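    library(logistf)
    # Firth (penalized-likelihood) logistic regression
    full1F <- logistf(Stroke ~ log(v1) + sqrt(v2) + log(v3) + log(v4) + dich_var)
    summary(full1F)
    # Odds ratios with 95% confidence limits
    exp(cbind(OR = coef(full1F), confint(full1F)))
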
  1. What statistic should I report to describe model fit (i.e. something akin to AIC for a GLM)?

  2. How should I interpret the p-values (e.g. sqrt(v2) is significant based on the CI, but its p-value is 1.0)?

  3. Why is the confidence interval for the dichotomous variable so wide? This is the output:

                   OR L95%      U95%
    (Intercept) 28.67 5.06    162.5
    log(v1)      0.88 0.80      0.97
    sqrt(v2)     1.51 1.37      1.69
    log(v3)      1.36 1.13      1.64
    log(v4)      0.62 0.50      0.76
    dich_var    62.76 0.09 167702.4
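One way to see where the width comes from, sketched in base R: the full summary output reports a coefficient of about 4.15 for dich_varYes with a standard error of about 3.96 on the log-odds scale. An interval several log units wide turns into limits that differ by orders of magnitude once exponentiated. The Wald formula used here is only an approximation for illustration (the variable names are made up for this sketch); the limits in the table come from confint(full1F), so the numbers will not match exactly.

```r
# Coefficient and standard error for dich_varYes, copied from the summary output
coef_dich <- 4.14502663
se_dich   <- 3.9598211

# Approximate 95% Wald interval on the log-odds scale
z <- qnorm(0.975)
ci_log <- coef_dich + c(-1, 1) * z * se_dich

# Exponentiate to get the interval on the odds-ratio scale
ci_or <- exp(ci_log)
print(ci_log)
print(ci_or)
```

A standard error this large relative to the estimate means the data carry very little information about this coefficient, whichever interval method is used.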
    

Full output from logistf and extractAIC:

Model fitted by Penalized ML 
Confidence intervals and p-values by Profile Likelihood 

             coef       se(coef)   lower 0.95 upper 0.95  Chisq      p
(Intercept)  4.12807056 2.7584264  2.6061200  5.671118391 1.479713 0.2238194
log(v1)     -0.08109114 0.1625829 -0.1744268  0.006705001 0.000000 1.0000000
sqrt(v2)     0.39223967 0.1831892  0.2963022  0.501110156 0.000000 1.0000000
log(v3)      0.31123164 0.3336092  0.1309848  0.502245304 0.000000 1.0000000
log(v4)     -0.53354748 0.3718985 -0.7448874 -0.331731496 0.000000 1.0000000
dich_varYes  4.14502663 3.9598211 -3.1206739 12.035651631    Inf   0.0000000

Likelihood ratio test=5.612847 on 5 df, p=0.3457304, n=1714
Wald test = 10.01195 on 5 df, p = 0.07489748

Covariance-Matrix:
            [,1]         [,2]          [,3]         [,4]          [,5]
[1,]  7.60891623  0.355883900 -0.2358586207  0.107347858 -0.9014000169
[2,]  0.35588390  0.026433207 -0.0139567792  0.006657177 -0.0352398354
[3,] -0.23585862 -0.013956779  0.0335582829 -0.012769583  0.0002785645
[4,]  0.10734786  0.006657177 -0.0127695828  0.111295107 -0.0034625698
[5,] -0.90140002 -0.035239835  0.0002785645 -0.003462570  0.1383084900
[6,]  0.07854902  0.004844901 -0.0121205581 -0.007438953 -0.0037122993
             [,6]
[1,]  0.078549017
[2,]  0.004844901
[3,] -0.012120558
[4,] -0.007438953
[5,] -0.003712299
[6,] 15.680183083

extractAIC(full1F)
[1] 5.000000 4.387153
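A note on reading the two numbers: R's generic extractAIC returns a vector of length two, the equivalent degrees of freedom followed by the AIC value. Here the second number equals 2 * 5 minus the likelihood ratio statistic reported in the summary. That suggests the value is expressed relative to the null model rather than as the usual -2 logLik + 2k, though this is inferred from the arithmetic alone and not from the logistf documentation. A minimal check in base R:

```r
# Values copied from the output above
lr_stat <- 5.612847   # "Likelihood ratio test" statistic
edf     <- 5          # first element returned by extractAIC
k       <- 2          # default AIC penalty per parameter

# Penalty minus the likelihood ratio statistic
aic_rel <- k * edf - lr_stat
print(aic_rel)

# The overall likelihood ratio p-value can be recomputed the same way
p_lr <- pchisq(lr_stat, df = edf, lower.tail = FALSE)
print(p_lr)
```

If that reading is right, the number is only meaningful for comparing models fitted to the same data, not as an absolute measure of fit.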
  • A little hard to say without more information (can you provide a reproducible example?), but you might want to look at profile confidence intervals (pl=TRUE in your logistf(...) call) rather than Wald intervals ... the fact that logistf provides an extractAIC method suggests that it would be OK to report the AIC ... – Ben Bolker Jun 18 '15 at 22:35
  • You might also consider the brglm implementation which has the output formatted in the same way as glm. It can also be interpreted in the same way (including information criteria) as the estimator is rather close to the maximum likelihood estimator - just adding some bias reduction. – Achim Zeileis Jun 19 '15 at 00:45
  • I added more details of the output I get. I generated confidence intervals by Profile Likelihood. The chisq for the dichotomous variable is infinity. I am thinking that is the reason for the wide CI. Is there any way to get around that? I could also use help with interpreting the p-values of 0 or 1 and the two numbers generated for the AIC (5.0 and 4.39). – user80121 Jun 19 '15 at 05:23
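On the p-values of 0 and 1: each p-value in the summary appears to be the upper tail of a chi-squared distribution with 1 degree of freedom evaluated at the Chisq column, which the intercept row confirms. A statistic of exactly 0 therefore maps to p = 1 and a statistic of Inf maps to p = 0. Since the profile intervals for several of those terms exclude zero, the degenerate Chisq values may point to a numerical problem in the per-coefficient tests rather than to real evidence, though that is a guess without a reproducible example. A short sketch in base R:

```r
# Chisq values as they appear in the summary: intercept, a continuous term, dich_varYes
chisq <- c(1.479713, 0, Inf)

# Upper-tail probability of a chi-squared distribution with 1 degree of freedom
p <- pchisq(chisq, df = 1, lower.tail = FALSE)
print(p)
```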
