I am new to generalised linear models (GLMs) and want to use them for my fourth-year dissertation project in ecology. Please forgive any ignorance on my part. I have done my best to research this on my own, but I have reached a dead end and would like some help.
I have one response variable and four predictors, and I am fitting the GLM with a Gaussian distribution. I have run several GLMs in RStudio with different combinations of the predictors to find the best model. However, the best model (the one with the lowest AICc) is a reduced model that uses only two of the predictors.
When reporting this in my dissertation, what do I say about the predictors that are not included in the model?
My current understanding is that I report the best model (with just the two predictors) and state that the other two were excluded from the model. Do I then ignore the other two predictors entirely, or should I analyse each of them separately, for example with a standard linear regression?
I started with multiple linear regression, but was running into problems because the coefficient estimates were huge and adjusted R-squared was low. I checked the diagnostics and the plots looked really good, so I was struggling to understand what was going on. I thought that in this scenario GLMs would be a better tool, but I've always found them intimidating.
I saw in tutorials that you can fit a GLM the same way as a multiple linear regression, except that you specify the family (distribution). When I did this with the full model, the coefficient estimates did not match what I saw in the plots. For example, a scatter plot showed a positive slope, but the estimate came back as either very small (0.003) or strongly negative. I understand that this is often due to multicollinearity, so I ran a variance inflation factor (VIF) check and centred the variables that were causing the issue. I then read that it can also happen because other predictors suppress the effect of the main predictors, so I decided to fit multiple models.
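To make the setup concrete, here is a minimal sketch in base R of the two steps described above: fitting a Gaussian GLM alongside an ordinary multiple regression, and computing VIFs. It uses simulated data, not the dissertation data, and every name in it (dat, y, x1 to x4, vif_by_hand) is a hypothetical placeholder. The VIF helper is written by hand to avoid assuming any particular package.

```r
# Simulated data only; y and x1..x4 are hypothetical placeholders.
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)  # deliberately correlated with x1
x3 <- rnorm(n)
x4 <- rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2, x3, x4)

# Ordinary multiple regression and a Gaussian GLM with identity link.
fit_lm  <- lm(y ~ x1 + x2 + x3 + x4, data = dat)
fit_glm <- glm(y ~ x1 + x2 + x3 + x4,
               family = gaussian(link = "identity"), data = dat)

# With a Gaussian family and identity link, the coefficient estimates
# agree with lm() up to numerical tolerance.
same_coefs <- all.equal(coef(fit_lm), coef(fit_glm))

# VIF for each predictor: 1 / (1 - R^2), where R^2 comes from regressing
# that predictor on all of the other predictors.
vif_by_hand <- function(data, predictors) {
  sapply(predictors, function(p) {
    others <- setdiff(predictors, p)
    r2 <- summary(lm(reformulate(others, response = p), data = data))$r.squared
    1 / (1 - r2)
  })
}
vifs <- vif_by_hand(dat, c("x1", "x2", "x3", "x4"))

print(same_coefs)
print(round(vifs, 2))
```

In this simulated example x1 and x2 are built to be strongly correlated, so their VIFs come out much larger than those of x3 and x4. The real data may of course behave differently.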
I compared the models using AICc and ANOVA, which went fine, and I now have a reduced model that works well but drops two predictors. My question is about what this actually means and how I should report the results.
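The comparison described above can be sketched roughly as follows, again in base R with simulated data and hypothetical names (dat, full, reduced, aicc). The aicc helper is an assumption written for illustration: it applies the usual small-sample correction, AICc = AIC + 2k(k + 1) / (n - k - 1), where k is the number of estimated parameters reported by logLik() (for a Gaussian model this count includes the residual variance).

```r
# Simulated data only; names are hypothetical placeholders.
set.seed(2)
n   <- 60
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n)  # x3 and x4 have no true effect

full    <- glm(y ~ x1 + x2 + x3 + x4, family = gaussian, data = dat)
reduced <- glm(y ~ x1 + x2,           family = gaussian, data = dat)

# Small-sample corrected AIC, computed by hand from AIC() and logLik().
aicc <- function(fit) {
  k <- attr(logLik(fit), "df")  # estimated parameters, incl. residual variance
  n <- nobs(fit)
  AIC(fit) + (2 * k * (k + 1)) / (n - k - 1)
}

comparison <- data.frame(
  model = c("full", "reduced"),
  k     = c(attr(logLik(full), "df"), attr(logLik(reduced), "df")),
  AICc  = c(aicc(full), aicc(reduced))
)
print(comparison)

# F test for the nested pair: does adding x3 and x4 improve the fit?
print(anova(reduced, full, test = "F"))
```

A table like this one, listing each candidate model with its number of parameters and AICc, is one common way to show the excluded predictors were considered rather than ignored.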