Suppose I have some prior $\pi(\theta)$ from which I draw $N$ samples, with parameters $\theta_1, \dots, \theta_N$. These $\theta_i$ are known to me. Suppose that one of these samples (I don't know which) generates some data $y$ according to a likelihood $\mathcal{L}(y|\theta)$. I can then try to identify which $i$ generated $y$ by computing, for each $i$, the evidence that it generated the data; the true $i$ will usually have the highest evidence.
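To make the setup concrete, here is a minimal simulation sketch of what I mean. The Gaussian prior and likelihood, the noise level $\sigma$, and the numbers are just placeholders I picked for illustration, not part of the actual problem; since the $\theta_i$ are known, the "evidence" for sample $i$ reduces to the likelihood $\mathcal{L}(y\mid\theta_i)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder model: prior theta ~ N(0, 1), likelihood y ~ N(theta, sigma^2).
N, sigma, n_trials = 50, 0.5, 2000

hits = 0
for _ in range(n_trials):
    theta = rng.normal(size=N)       # the N known prior draws theta_i
    true_i = rng.integers(N)         # one of them (unknown to me) generates y
    y = rng.normal(theta[true_i], sigma)
    # With theta_i known, the evidence for "sample i generated y" is just
    # L(y | theta_i), so identification means picking the argmax.
    log_like = -0.5 * ((y - theta) / sigma) ** 2
    hits += np.argmax(log_like) == true_i

print("fraction of trials where the true i is identified:", hits / n_trials)
```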
The question I am interested in is: how informative is the data $y$ for picking out the right $i$? How often can I expect to get it right, and with what certainty? Essentially, I am asking: what is the information content that $y$ gives me for this identification?
One way of looking at this (possibly not a good one) is with information-theoretic arguments. A possible avenue I thought of:
I could calculate the relative entropy (KL divergence) from the prior to the posterior, $D_{KL}\big(p(\theta|y)\,\|\,\pi(\theta)\big)$. This can be converted to a factor: for example, if the information corresponds to a factor of 50, I would say that I can expect to identify the right $i$ out of a sample of $N=50$. Is that right? What is the actual relation between accuracy and $N$, given the information as quantified by the relative entropy? Is there a way of quantifying this?
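To show what I mean by "converted to a factor", here is a sketch in the same toy Gaussian model as above (where the posterior and the KL are available in closed form): estimate the expected $D_{KL}$ over the prior predictive for $y$ and exponentiate it. All model choices are again placeholders of my own, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same placeholder model: prior theta ~ N(0, 1), likelihood y ~ N(theta, sigma^2),
# for which the posterior p(theta | y) is Gaussian and the KL has a closed form.
sigma = 0.5

def kl_posterior_prior(y, sigma):
    """D_KL( p(theta|y) || pi(theta) ) in nats, for prior N(0,1), likelihood N(theta, sigma^2)."""
    post_var = 1.0 / (1.0 + 1.0 / sigma**2)
    post_mean = post_var * y / sigma**2
    return 0.5 * (post_var + post_mean**2 - 1.0 - np.log(post_var))

# Average the information gain over the prior predictive distribution of y.
theta = rng.normal(size=200_000)
y = rng.normal(theta, sigma)
expected_kl = kl_posterior_prior(y, sigma).mean()

print("E[D_KL] in nats:", expected_kl)
print("exp(E[D_KL]), the candidate 'effective N':", np.exp(expected_kl))
```

The idea would be to compare $\exp(\mathbb{E}[D_{KL}])$ against the simulated identification accuracy from the first sketch as a function of $N$; whether that is the right comparison is part of what I am asking.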
Another way to go about this might be to look at the distributions of the evidences given the right datapoint versus a wrong one, but I am also unsure how to go about that.
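For example (again in my toy Gaussian model, with placeholder numbers), I could compare the distribution of $\log\mathcal{L}(y\mid\theta)$ under the $\theta$ that actually generated $y$ with the distribution under an unrelated draw from the prior, along these lines:

```python
import numpy as np

rng = np.random.default_rng(2)

# Same placeholder model: compare log L(y | theta) for the theta that generated y
# against log L(y | theta') for an unrelated prior draw theta'.
sigma, n_trials = 0.5, 50_000

theta_true = rng.normal(size=n_trials)
theta_wrong = rng.normal(size=n_trials)
y = rng.normal(theta_true, sigma)

def log_like(y, theta, sigma):
    return -0.5 * ((y - theta) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)

log_like_right = log_like(y, theta_true, sigma)
log_like_wrong = log_like(y, theta_wrong, sigma)

# Probability that the true sample beats a single wrong competitor.
# (Beating all N-1 competitors is not simply this probability to the power N-1,
#  because the comparisons all share the same y.)
print("P(log L_right > log L_wrong):", np.mean(log_like_right > log_like_wrong))
```

But how to turn these two distributions into a statement about accuracy versus $N$ is exactly where I get stuck.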
I feel like stuff like this must have been done before, so relevant references would also be appreciated!