I am currently trying to run a model analyzing the duration of the egg stage of each sex of two species of insect across five different temperatures. All independent variables are categorical. My model looks like this:
model1 <- lm(Duration.egg ~ Temperature * Species * Sex, data = egg.na.2)
In building this model I have tried to evaluate the assumptions of:
- Normality of residuals
- Homoscedasticity
- No serial autocorrelation
- No unduly influential observations (leverage above a particular threshold)
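For readers who want to reproduce checks like these, here is a minimal sketch in base R. It runs on simulated data standing in for egg.na.2 (the factor levels, the response values and the names sim, fit, res, cell are all hypothetical), and the particular tests chosen are one reasonable option, not necessarily the ones used for the plots in this post.

```r
# Simulated stand-in for egg.na.2 (hypothetical values, NOT the real data)
set.seed(1)
sim <- expand.grid(
  Temperature = factor(c(15, 20, 25, 30, 35)),
  Species     = factor(c("A", "B")),
  Sex         = factor(c("F", "M")),
  rep         = 1:10
)
sim$Duration.egg <- 20 - 2 * as.integer(sim$Temperature) + rnorm(nrow(sim))

fit <- lm(Duration.egg ~ Temperature * Species * Sex, data = sim)
res <- residuals(fit)

# Normality of residuals: Shapiro-Wilk test
sw <- shapiro.test(res)

# Homoscedasticity: Fligner-Killeen test of equal residual variance
# across the Temperature x Species x Sex cells
cell <- interaction(sim$Temperature, sim$Species, sim$Sex)
fk <- fligner.test(res, cell)

# Serial autocorrelation: lag-1 autocorrelation of residuals in row order
ac <- acf(res, plot = FALSE)
lag1 <- ac$acf[2]

# Leverage: diagonal of the hat matrix
lev <- hatvalues(fit)

print(c(shapiro_p = sw$p.value, fligner_p = fk$p.value, lag1_acf = lag1))
```

Note that the autocorrelation check depends entirely on the order of the rows, so it is only informative if the rows follow a meaningful sequence such as time of measurement.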
All of these assumptions appear to have been violated (see my diagnostic plots, autocorrelation function plot and leverage plot below). As a result, I was wondering which assumption would be best to tackle first, or whether there is a generally accepted hierarchy as to which assumptions, if violated, should be rectified first.
Here are my diagnostic plots via plot(model1):

Here is my autocorrelation function plot:

Here is my plot of leverage for each data point. The threshold for flagging a data point as influential is set at 2p/n, where p = number of independent variables and n = sample size.
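The 2p/n cut-off can be computed along the following lines. This is only a sketch on simulated, deliberately unbalanced data (the data frame sim and its values are hypothetical), and it assumes p is taken as the number of estimated coefficients in the model, which is the more common convention and may differ from the count of independent variables described above.

```r
# Simulated stand-in for egg.na.2 (hypothetical values, NOT the real data)
set.seed(2)
sim <- expand.grid(
  Temperature = factor(c(15, 20, 25, 30, 35)),
  Species     = factor(c("A", "B")),
  Sex         = factor(c("F", "M")),
  rep         = 1:10
)

# Make the design unbalanced: keep only 2 replicates in one cell
sparse <- sim$Temperature == "35" & sim$Species == "B" &
  sim$Sex == "M" & sim$rep > 2
sim <- sim[!sparse, ]
sim$Duration.egg <- rnorm(nrow(sim), mean = 10)

fit <- lm(Duration.egg ~ Temperature * Species * Sex, data = sim)

lev <- hatvalues(fit)        # leverage of each observation
p   <- length(coef(fit))     # assumption: p = number of estimated coefficients
n   <- nobs(fit)             # sample size used in the fit
threshold <- 2 * p / n

# Row positions whose leverage exceeds the cut-off
flagged <- which(lev > threshold)
print(threshold)
print(lev[flagged])
```

With only categorical predictors and all interactions included, the leverage of an observation is 1 divided by the number of observations in its cell, so in this kind of model high leverage flags sparsely populated cells and not unusual predictor values.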


sresid <- residuals(model1, type = "pearson") to produce standardized residuals, then acf(sresid, main = "Auto-correlation plot") to produce the ACF plot, as this was provided in my R handbook. 4) I will try the function in the olsrr package in order to gauge influence. – Insect_biologist Jan 16 '23 at 20:13

 lag Autocorrelation D-W Statistic p-value
   1        0.893277     0.2054992       0
 Alternative hypothesis: rho != 0
– Insect_biologist Jan 16 '23 at 20:19
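To show where a Durbin-Watson figure like the one quoted above comes from, here is a small base R sketch that computes the statistic by hand. The helper name dw_stat and the simulated residuals are hypothetical; the p-value reported above is not reproduced here.

```r
# Durbin-Watson statistic: sum of squared successive differences of the
# residuals divided by their sum of squares (hypothetical helper name)
dw_stat <- function(res) {
  sum(diff(res)^2) / sum(res^2)
}

# Simulated residuals with strong positive serial correlation (AR(1), rho = 0.9)
set.seed(3)
e <- as.numeric(arima.sim(model = list(ar = 0.9), n = 200))
e <- e - mean(e)

dw <- dw_stat(e)
r1 <- acf(e, plot = FALSE)$acf[2]   # lag-1 autocorrelation

# Rough relationship: DW is approximately 2 * (1 - r1)
print(c(dw = dw, approx = 2 * (1 - r1)))
```

Values near 2 suggest no lag-1 autocorrelation, while values near 0 suggest strong positive autocorrelation, which is consistent with the reported pair of 0.893 and 0.205. Like the ACF, the statistic is computed in row order, so it only reflects genuine serial dependence if the rows are in a meaningful sequence.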