I investigated the effect of weather variables on disease severity. My response variable is the proportion of disease severity observed in different years. The study was conducted over 10 years and disease was assessed 3-6 times a year, but only the final assessment of each year is used. The response is continuous and positive, with a minimum of 0.35 and a maximum of 0.919. I have tried both binomial and beta regression; beta regression seems to make more sense, and its results are supported by the literature.
However, the model diagnostics are all over the place and do not indicate a good fit, although the results are logical. I checked autocorrelation with the performance package and found no significant autocorrelation. But when I checked concurvity, I found strong collinearity between the variables, with a value of 1 for every smooth term in the worst row. This seems to be the problem.
concurvity(mod5)
              para s(mean_rh) s(total_rain) s(mean_temp) s(mean_ws)
worst     0.978845  1.0000000     1.0000000            1          1
observed  0.978845  0.9624381     0.7694716            1          1
estimate  0.978845  0.9594373     0.7116748            1          1
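The call above reports only the overall concurvity of each term with all the others. A pairwise breakdown may help show which smooths are entangled with which. The following is a minimal sketch on simulated data, not on dat_seasonal: the sample size, the simulated relationships, and the names sim_dat, fit and pw are all assumptions made for illustration.

```r
library(mgcv)

set.seed(1)
n <- 200  # hypothetical sample size, not the size of dat_seasonal

# Simulated weather variables; mean_rh is deliberately tied to mean_temp
mean_temp  <- runif(n, 10, 25)
mean_rh    <- 90 - 1.5 * mean_temp + rnorm(n, sd = 2)
total_rain <- runif(n, 50, 300)
mean_ws    <- runif(n, 1, 6)

# Simulated severity in (0, 1): beta distributed, mean on the logit scale
mu  <- plogis(-1 + 0.03 * mean_rh + 0.004 * total_rain - 0.02 * mean_temp)
phi <- 30
severity <- rbeta(n, mu * phi, (1 - mu) * phi)
sim_dat <- data.frame(severity, mean_rh, total_rain, mean_temp, mean_ws)

# Same model structure as in the question, fitted to the simulated data
fit <- gam(severity ~ s(mean_rh, k = 5) + s(total_rain, k = 6) +
             s(mean_temp, k = 3) + s(mean_ws, k = 3),
           family = betar(), data = sim_dat, method = "REML")

# full = FALSE returns a list of matrices (worst, observed, estimate)
# giving the concurvity of each term with each other term
pw <- concurvity(fit, full = FALSE)
print(round(pw$worst, 2))
```

On the real model the equivalent call would be concurvity(mod5, full = FALSE); large off-diagonal entries point to the specific pairs of terms involved.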
Does anyone know how to account for collinearity between variables? I am attaching the model fit as well as the diagnostic plots; any help would be much appreciated.
My model is given below.
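library(mgcv)  # provides gam(), betar() and concurvity()

# Beta-regression GAM with one smooth per weather variable
mod5 <- gam(severity ~ s(mean_rh, k = 5) + s(total_rain, k = 6) +
              s(mean_temp, k = 3) + s(mean_ws, k = 3),
            family = betar(), data = dat_seasonal)
summary(mod5)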
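For context, the kind of diagnostic checks described above (basis dimension and residual autocorrelation) can be sketched with mgcv and base R alone. This is only an illustration on simulated data, assuming a hypothetical data frame sim_dat with the same column names; it is not the output for mod5, and it uses acf() rather than the performance package.

```r
library(mgcv)

set.seed(2)
n <- 200  # hypothetical sample size

# Simulated stand-in for dat_seasonal (assumption, not the real data)
sim_dat <- data.frame(
  mean_rh    = runif(n, 50, 95),
  total_rain = runif(n, 50, 300),
  mean_temp  = runif(n, 10, 25),
  mean_ws    = runif(n, 1, 6)
)
mu <- plogis(-2 + 0.03 * sim_dat$mean_rh + 0.004 * sim_dat$total_rain)
sim_dat$severity <- rbeta(n, mu * 30, (1 - mu) * 30)

fit <- gam(severity ~ s(mean_rh, k = 5) + s(total_rain, k = 6) +
             s(mean_temp, k = 3) + s(mean_ws, k = 3),
           family = betar(), data = sim_dat, method = "REML")

# Basis-dimension check: one row per smooth with k', edf, k-index, p-value
kc <- k.check(fit)
print(kc)

# Autocorrelation of the deviance residuals at lags 1 to 5
res <- residuals(fit, type = "deviance")
ac  <- acf(res, plot = FALSE)
print(round(ac$acf[2:6], 2))

# gam.check(fit) would additionally draw the four residual plots
```

Applied to the real model, the same calls would take mod5 in place of fit.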

