I wonder whether the following mixed model suffers from overfitting.
Note: The following is just a hypothetical model to exemplify the model construction.
Research question: what affects apple production?
Experiment design: I have 100 independent sites, with one apple tree grown per site. At each site I measured a variable every year (e.g. the yearly number of apples produced by the tree). The time series length (number of years with data) varies from site to site, ranging from 2 to 10 years. If I pool all the data series together, I have 500 yearly observations.
Mixed model:
- the response variable is the "yearly number of apples"
- the random factor is the "site"
- there are 10 predictors, i.e. fixed factors (e.g. age of the tree, yearly precipitation, ...)
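Written out, the model I have in mind would look roughly like the sketch below. This assumes a random intercept for site only (no random slopes) and, for simplicity, a Gaussian response; the symbols are my own notation and are not tied to any particular software.

```latex
y_{ij} = \beta_0 + \sum_{k=1}^{10} \beta_k \, x_{kij} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here i = 1, ..., 100 indexes the site, j indexes the year within site i (between 2 and 10 years per site), y_{ij} is the yearly number of apples, x_{kij} is the value of predictor k, and u_i is the site-level random effect.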
If I divide the number of yearly observations (500) by the number of fixed factors (10), the ratio is 50, so the model satisfies the rule of thumb of 10-15 observations per predictor to avoid overfitting.
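However, if I divide the number of yearly observations (500) by both the number of fixed factors (10) and the number of levels of the random factor (100 sites), i.e. 500 / (10 x 100), the result is 0.5, which is far too few observations to fit a model with 10 predictors. That would be clear overfitting.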
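To make the two calculations above explicit, here is a minimal sketch of the arithmetic. The variable names are my own, and the second ratio simply reproduces my reasoning of dividing by both the predictors and the sites; I am not claiming it is the correct way to count.

```python
# Counts taken from the experiment design described above
n_observations = 500  # yearly observations pooled over all sites
n_predictors = 10     # fixed factors (predictors)
n_sites = 100         # levels of the random factor "site"

# Ratio when only the fixed factors are counted
ratio_fixed_only = n_observations / n_predictors

# Ratio when dividing by both the fixed factors and the sites
ratio_fixed_and_random = n_observations / (n_predictors * n_sites)

# Average number of yearly observations per site, for reference
mean_years_per_site = n_observations / n_sites

print(ratio_fixed_only)        # 50.0
print(ratio_fixed_and_random)  # 0.5
print(mean_years_per_site)     # 5.0
```

The first ratio comfortably exceeds the 10-15 rule of thumb, while the second falls far below it, which is the source of my doubt.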
How should I check for overfitting in a mixed model like this?