When applying the Bayesian information criterion (BIC), one has to use an "effective sample size" in the penalty term. With longitudinal data, for example (say, repeated blood pressure measurements on each individual over time), the sample size to use should lie somewhere between the number of observed individuals and the total number of measured values. How to calculate it in general is not clear.
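To make the issue concrete, here is a minimal Python sketch using the usual form BIC = k ln(n) - 2 ln(L), with n replaced by a candidate effective sample size. Everything specific in it is an assumption for illustration only: the study size (50 individuals with 10 measurements each), the two sets of log-likelihoods and parameter counts, and the helper names `bic` and `preferred`.

```python
import math


def bic(log_lik: float, k: int, n_eff: float) -> float:
    """BIC = k * ln(n_eff) - 2 * log-likelihood; lower is better."""
    return k * math.log(n_eff) - 2.0 * log_lik


# Hypothetical longitudinal study: 50 individuals, 10 measurements each.
n_individuals = 50
n_measurements = 50 * 10

# Made-up fits: (maximized log-likelihood, number of free parameters).
model_a = (-1000.0, 3)  # smaller model
model_b = (-990.0, 7)   # larger model with a better fit


def preferred(n_eff: float) -> str:
    """Return the label of the model with the lower BIC for a given n_eff."""
    return "A" if bic(*model_a, n_eff) < bic(*model_b, n_eff) else "B"


# Compare the two extreme choices of effective sample size.
for n_eff in (n_individuals, n_measurements):
    print(
        f"n_eff={n_eff}: BIC_A={bic(*model_a, n_eff):.2f}, "
        f"BIC_B={bic(*model_b, n_eff):.2f}, preferred={preferred(n_eff)}"
    )
```

With these made-up numbers the ranking flips: the larger model is preferred when n is the number of individuals, and the smaller one when n is the number of measurements. So the choice of effective sample size is not a harmless detail.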
My question is: can the effective sample size depend on the model? If so, does that mean BIC can be used reliably only with nested models, since the effective sample size would then be constant?