I am fairly new to R and to multiple regression, so I could use some help interpreting my results. For my research I am trying to identify predictors of blood loss during surgery. I have a dataset of clinical variables (each dichotomous, ordinal, or continuous) with blood loss in mL as the outcome. Because blood loss is continuous, strictly positive, and not normally distributed, I understood that a generalized linear model with a Gamma family would be the best choice. I built the model with the following R code:

fullmodel <- glm(ebl ~ embol + age + gender + bmi + charlson + 
    path_fracture + pain + ecog + asia_pre + prim_tumor + 
    other_bone_mets + spine_mets + visc_mets + brain_mets + 
    local_radiation + previous_systemic + ellipsoid_cm3 + bilsky + 
    hgb + wbc + plt + lymph + neut + creatinine + calcium + albumin + 
    time_prim_surg + operation + levels_operated + opn_time_min, 
    data = predictebl, family= Gamma(link="log"))

summary(fullmodel)

library(DHARMa)  # provides simulateResiduals() and testDispersion()
simulationOutput2 <- simulateResiduals(fittedModel = fullmodel)
plot(simulationOutput2)

testDispersion(simulationOutput2)

Which gave me the following results:

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         6.431e+00  1.836e+00   3.502 0.000925 ***
embol               2.164e-01  2.221e-01   0.974 0.334304    
age                -1.085e-02  1.037e-02  -1.046 0.300070    
gender2            -4.557e-01  2.467e-01  -1.847 0.070122 .  
bmi                -3.737e-03  1.943e-02  -0.192 0.848185    
charlson            4.448e-02  7.829e-02   0.568 0.572239    
path_fracture2     -8.190e-03  2.231e-01  -0.037 0.970845    
pain2              -4.070e-01  2.723e-01  -1.495 0.140720    
ecog2               2.781e-01  2.607e-01   1.067 0.290690    
ecog3               2.113e-01  3.331e-01   0.634 0.528553    
ecog4              -3.904e-01  3.878e-01  -1.007 0.318482    
ecog5              -4.656e-01  5.573e-01  -0.836 0.407017    
asia_pre2           1.531e-01  2.053e-01   0.746 0.459062    
asia_pre3          -6.207e-01  5.248e-01  -1.183 0.241970    
asia_pre4           2.041e+00  9.495e-01   2.150 0.035982 *  
prim_tumor2        -7.277e-02  2.600e-01  -0.280 0.780568    
other_bone_mets2    5.886e-02  2.077e-01   0.283 0.777901    
spine_mets2        -4.733e-01  2.679e-01  -1.767 0.082828 .  
spine_mets3         5.052e-02  2.602e-01   0.194 0.846752    
visc_mets2         -7.395e-01  2.092e-01  -3.535 0.000836 ***
brain_mets2         5.740e-02  3.288e-01   0.175 0.862053    
local_radiation2   -1.631e-01  2.011e-01  -0.811 0.420737    
previous_systemic2  2.151e-01  2.441e-01   0.881 0.382131    
ellipsoid_cm3       2.866e-03  2.875e-03   0.997 0.323289    
bilsky             -1.202e-02  6.576e-02  -0.183 0.855612    
hgb                 5.356e-03  7.193e-02   0.074 0.940917    
wbc                 6.318e-02  5.073e-02   1.245 0.218291    
plt                 6.412e-05  1.000e-03   0.064 0.949134    
lymph              -1.235e-01  1.947e-01  -0.634 0.528633    
neut               -8.864e-02  5.940e-02  -1.492 0.141316    
creatinine         -3.602e-02  1.339e-01  -0.269 0.788925    
calcium             9.560e-02  8.701e-02   1.099 0.276667    
albumin            -2.514e-01  1.932e-01  -1.301 0.198718    
time_prim_surg      4.495e-05  5.097e-05   0.882 0.381705    
operation2          3.100e-01  2.313e-01   1.340 0.185614    
operation3         -1.084e+00  4.389e-01  -2.470 0.016640 *  
operation4         -2.784e-01  4.373e-01  -0.637 0.527054    
levels_operated2    3.685e-01  2.811e-01   1.311 0.195342    
levels_operated3    4.108e-01  2.892e-01   1.421 0.161058    
opn_time_min        3.619e-03  6.433e-04   5.625 6.44e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.5311014)

    Null deviance: 117.807  on 94  degrees of freedom
Residual deviance:  36.493  on 55  degrees of freedom
AIC: 1586.4

Number of Fisher Scoring iterations: 14
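
For reference, the log link means each coefficient acts on the log of mean blood loss, so exponentiating an estimate gives a multiplicative ratio of means with the other variables held fixed. A worked sketch using two estimates from the output above (the 60-minute increment is only an illustrative choice):

```latex
\begin{align*}
\log \mathbb{E}[\mathrm{ebl}] &= \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k \\
\text{visc\_mets2:}\quad e^{-0.7395} &\approx 0.48 \\
\text{opn\_time\_min, per 60 min:}\quad e^{60 \times 0.003619} = e^{0.2171} &\approx 1.24
\end{align*}
```

Read this way, the second level of visc_mets corresponds to roughly 52% lower expected blood loss than its reference level, and each additional hour of operating time to roughly 24% higher expected blood loss, according to this fitted model.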

I also used DHARMa to create diagnostic plots for the model:

[Figure: DHARMa residual plots]

[Figure: Dispersion test plot]

Now I have a few questions regarding the results of my analysis:

[Q1]: Was it appropriate to include every variable in the dataset in the model, or should I first have run another analysis to select the variables that are associated with blood loss? I have read about LASSO; should I have used that before fitting the GLM?

[Q2]: To me the Q-Q plot and its associated tests look fair, but the second plot states that quantile deviations were detected. Is this a problem for the model, and if so, how should I fix it?

[Q3]: I would like to present the results in a table showing the factors that are associated with blood loss. What would be the best approach? Should I just list the name of each variable together with its estimate (the first column of the output) and the p-value?
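
One possible layout for such a table, as a minimal sketch rather than a recommended reporting standard: exponentiated estimates (ratios of mean blood loss) with Wald 95% intervals and p-values. The data frame `toy` and its simulated values are hypothetical stand-ins for `predictebl`, only two predictors are used, and `results` is an illustrative name; the same steps should apply to `fullmodel`.

```r
# Hypothetical toy data standing in for `predictebl`; all values are simulated.
set.seed(1)
n <- 200
toy <- data.frame(
  opn_time_min = runif(n, 60, 400),
  visc_mets    = factor(sample(1:2, n, replace = TRUE))
)
# Assumed "true" log-scale effects, chosen only for illustration
mu <- exp(6 + 0.004 * toy$opn_time_min - 0.7 * (toy$visc_mets == "2"))
toy$ebl <- rgamma(n, shape = 2, rate = 2 / mu)  # Gamma outcome with mean mu

fit <- glm(ebl ~ opn_time_min + visc_mets, data = toy,
           family = Gamma(link = "log"))

coefs <- summary(fit)$coefficients  # Estimate, Std. Error, t value, Pr(>|t|)
ci    <- confint.default(fit)       # Wald intervals on the log scale

# Exponentiate so each row reads as a multiplicative ratio of means
results <- data.frame(
  term      = rownames(coefs),
  ratio     = exp(coefs[, "Estimate"]),
  ci_lower  = exp(ci[, 1]),
  ci_upper  = exp(ci[, 2]),
  p_value   = coefs[, "Pr(>|t|)"],
  row.names = NULL
)
print(results, digits = 3)
```

Reporting ratios with intervals rather than raw log-scale estimates keeps the table readable in clinical terms; `confint.default()` gives Wald intervals, which is a simplifying choice here.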

  • Just a comment re: Q2. I believe the DHARMa package is not necessarily suitable for checking the assumptions of your single-level model, as it is built for testing multilevel model assumptions. Also, I'm not very familiar with gamma regression, but I believe its assumptions are much less strict than those of linear regression, e.g. your residuals do not need to be normally distributed. – Sointu Aug 19 '23 at 19:21
  • Re: Q1. This answer may help. – Sointu Aug 19 '23 at 19:27
