I am trying to understand how simulation plays a role in checking model assumptions when the residuals do not have exact distributions.
I took this Poisson regression model:
$$Y_i \mid X_i \sim \mathrm{Poisson}(\lambda_i), \qquad \lambda_i = e^{\beta_0 + \beta_1 X_i}$$
How can I see if the model assumptions are being met?
Some ideas come to mind:
Approach 1: I can bootstrap the data and fit the model on each bootstrap sample. If the model assumptions are met, then at each $x_i$ the distribution of the residuals across bootstrap samples should have mean $0$ and variance $\lambda_i$. (This approach seems like the most work computationally, since multiple models are fit.)
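A minimal numpy sketch of this idea. The toy dataset, the true coefficients $(0.5, 0.8)$, and the hand-rolled Newton-Raphson fit are all illustrative assumptions, not part of the question:

```python
import numpy as np

def fit_poisson(x, y, iters=25):
    """Fit log(lambda) = b0 + b1*x by Newton-Raphson (illustrative helper)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        lam = np.exp(X @ beta)
        # score = X'(y - lam), Hessian = X' diag(lam) X
        beta = beta + np.linalg.solve((X * lam[:, None]).T @ X, X.T @ (y - lam))
    return beta

rng = np.random.default_rng(1)
n, B = 200, 200
x = rng.uniform(0, 2, n)
y = rng.poisson(np.exp(0.5 + 0.8 * x))   # toy data from a true Poisson model

boot_resid = np.empty((B, n))
for b in range(B):
    idx = rng.integers(0, n, n)          # resample (x_i, y_i) pairs with replacement
    beta_b = fit_poisson(x[idx], y[idx])
    # raw residuals of the original data under the bootstrap fit
    boot_resid[b] = y - np.exp(beta_b[0] + beta_b[1] * x)

mean_resid = boot_resid.mean()           # should be near 0 if assumptions hold
```

Here `boot_resid[:, i]` collects the residuals at observation $i$ across all bootstrap fits, so its mean and variance can be compared against $0$ and $\hat\lambda_i$.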
Approach 2: I fit the model on the original data, then simulate new datasets from the fitted model and compare the real $Y$ values against the simulated ones. If the model assumptions are met, the model should be able to "reproduce" the observed data very closely. (I don't know how to measure "closeness" across all $x_i$.) This approach seems like less work, since only one model is fit.
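A sketch of this approach in numpy, again with an assumed toy dataset and a hand-rolled fit. One possible (assumed, not the only) way to measure "closeness" is a predictive p-value for a chosen summary statistic:

```python
import numpy as np

def fit_poisson(x, y, iters=25):
    """Fit log(lambda) = b0 + b1*x by Newton-Raphson (illustrative helper)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        lam = np.exp(X @ beta)
        beta = beta + np.linalg.solve((X * lam[:, None]).T @ X, X.T @ (y - lam))
    return beta

rng = np.random.default_rng(2)
n = 200
x = rng.uniform(0, 2, n)
y = rng.poisson(np.exp(0.5 + 0.8 * x))        # toy data from a true Poisson model

b0, b1 = fit_poisson(x, y)                    # one fit on the original data
lam_hat = np.exp(b0 + b1 * x)
sims = rng.poisson(lam_hat, size=(1000, n))   # 1000 datasets simulated from the fit

# Where does the observed statistic fall among the simulated ones?
sim_var = sims.var(axis=1)
p_value = (sim_var >= y.var()).mean()         # near 0 or 1 would signal misfit
```

The choice of statistic (variance here) is up to the analyst; overdispersion checks, maxima, or per-$x_i$ comparisons would follow the same pattern.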
Approach 3 (the DHARMa approach, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html): I fit the model on the original data, then generate multiple simulated values of $y_i$ at each $x_i$. For a given $x_i$, I build the empirical CDF of the simulated $y_i$ and find where the actually observed $y_i$ sits on that CDF. If the model fits the data well, these positions should be uniformly distributed between $0$ and $1$ (I am not sure why - is this because the quantile residuals of a correctly specified model are uniform?), so across observations they should average out to about the 50% level. I repeat this empirical CDF construction for each $x_i$ and check whether the observed $y_i$ are, on average, near the 50% level. (This approach also involves fitting only one model.)
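A numpy sketch of this scaled-residual idea (toy data and hand-rolled fit are assumptions; the randomization within the ECDF tie interval mirrors how DHARMa handles discrete counts, though this is a simplified version of what the package does):

```python
import numpy as np

def fit_poisson(x, y, iters=25):
    """Fit log(lambda) = b0 + b1*x by Newton-Raphson (illustrative helper)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        lam = np.exp(X @ beta)
        beta = beta + np.linalg.solve((X * lam[:, None]).T @ X, X.T @ (y - lam))
    return beta

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(0, 2, n)
y = rng.poisson(np.exp(0.5 + 0.8 * x))        # toy data from a true Poisson model

b0, b1 = fit_poisson(x, y)
lam_hat = np.exp(b0 + b1 * x)
sims = rng.poisson(lam_hat, size=(1000, n))   # simulated y_i at each x_i

# Position of each observed y_i in the ECDF of its own simulations; drawing
# uniformly within the tie interval [P(sim < y_i), P(sim <= y_i)] handles
# the discreteness of count data
lower = (sims < y).mean(axis=0)
upper = (sims <= y).mean(axis=0)
u = rng.uniform(lower, upper)  # should look Uniform(0, 1) under a correct model
```

A histogram or QQ-plot of `u` against the Uniform(0, 1) distribution is then the diagnostic: systematic clustering away from uniformity indicates a violated assumption.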
Are all 3 approaches correct? Is this the right use of simulation to validate model assumptions when the residuals do not have exact distributions?