I am doing linear regression in R. I have identified an outlier in my data:
outliers::grubbs.test(all_pav$fmd_perc)
Grubbs test for one outlier
data: all_pav$fmd_perc
G = 4.42003, U = 0.75274, p-value = 9.442e-05
alternative hypothesis: highest value 43.6823104693141 is an outlier
I can see this outlier in my model plots:
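library(ggfortify)  # assumed source of autoplot() for lm objects
moda <- lm(fmd_perc ~ egfr_cr_cys + tacpd + qrisk3 + homa_ir,
           data = all_pav, na.action = na.omit)
autoplot(moda)  # diagnostic plots for the fitted model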
I then fitted a new model, new_moda, with the outlier removed from the data. I intended to compare the adjusted $R^2$ of the two models and then perform an ANOVA comparing them. The second step isn't possible because the two models are fitted to different numbers of observations, and R gives an error:
anova(moda, new_moda)
Error in anova.lmlist(object, ...) :
models were not all fitted to the same size of dataset
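
For reference, here is a minimal sketch of the refit and the adjusted $R^2$ comparison described above. It assumes the outlier is the single row with the largest fmd_perc. The data frame is simulated as a stand-in for all_pav (the real data are not shown), and no_outlier and form are illustrative names only:

```r
# Hypothetical stand-in for all_pav: simulated values, not the real data.
set.seed(1)
n <- 60
all_pav <- data.frame(
  egfr_cr_cys = rnorm(n, 90, 15),
  tacpd       = rnorm(n, 10, 2),
  qrisk3      = rnorm(n, 8, 3),
  homa_ir     = rnorm(n, 2, 0.5)
)
all_pav$fmd_perc <- 5 + 0.02 * all_pav$egfr_cr_cys + rnorm(n)
all_pav$fmd_perc[1] <- 43.68  # plant one extreme value to mimic the outlier

# Drop the row holding the largest fmd_perc (assumed to be the flagged outlier).
no_outlier <- all_pav[-which.max(all_pav$fmd_perc), ]

# Fit the same formula to the full and the reduced data.
form     <- fmd_perc ~ egfr_cr_cys + tacpd + qrisk3 + homa_ir
moda     <- lm(form, data = all_pav,    na.action = na.omit)
new_moda <- lm(form, data = no_outlier, na.action = na.omit)

# Adjusted R^2 of each fit, side by side.
adj_r2 <- c(full    = summary(moda)$adj.r.squared,
            reduced = summary(new_moda)$adj.r.squared)
print(adj_r2)

# The fits use different numbers of rows, which is what anova() objects to.
print(c(full = nobs(moda), reduced = nobs(new_moda)))
```

The two fits share a formula and differ only in the rows used, so anova() has no nested pair of models to compare here.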
Is my approach correct for examining whether an outlier is influential?
