
Suppose I have some prior $\pi(\theta)$, from which I draw $N$ samples, each with parameter $\theta_i$. These $\theta_i$'s are known to me. Suppose that one of these samples (unknown to me which one) generates some data $y$, following some likelihood $\mathcal{L}(y|\theta)$. I can then try to identify which $i$ generated $y$ by calculating the evidence for each $i$ having generated the data $y$. The true $i$ will then usually have the highest evidence.
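
To make the setup concrete, here is a minimal toy sketch of the procedure I have in mind (a hypothetical example of my own, with an assumed Gaussian prior $\theta \sim \mathcal{N}(0, \tau^2)$, a single observation $y \mid \theta \sim \mathcal{N}(\theta, \sigma^2)$, and arbitrary values for $\tau$ and $\sigma$). Since each $\theta_i$ is known exactly, the "evidence" for sample $i$ is just its likelihood, and identification means picking the $i$ with the largest one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (assumed, not essential): prior N(0, tau^2), likelihood N(theta, sigma^2).
def identification_accuracy(N, tau=1.0, sigma=0.3, n_rep=2000):
    """Fraction of repetitions in which the true i has the highest likelihood of y."""
    hits = 0
    for _ in range(n_rep):
        theta = rng.normal(0.0, tau, size=N)        # the N prior draws, all known to me
        true_i = rng.integers(N)                    # one of them (unknown which) generates the data
        y = rng.normal(theta[true_i], sigma)        # a single observation
        loglik = -0.5 * ((y - theta) / sigma) ** 2  # Gaussian log-likelihood, up to a constant
        hits += np.argmax(loglik) == true_i
    return hits / n_rep

for N in (2, 5, 10, 50):
    print(N, identification_accuracy(N))
```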

The question that I am interested in is: how predictive is this data $y$ for getting the right $i$? How often can I expect to get it right, and with what certainty? I am essentially asking: what is the information content that $y$ gives me for identification?

One way (possibly not a good one) of looking at this is with some information-theoretic arguments, so a possible avenue I thought of:

I could calculate the relative entropy (KL divergence) of the posterior with respect to the prior, $D_{KL}(p(\theta|y)\,||\,\pi(\theta))$. This can be converted to a factor: for example, if the information amounts to a factor of 50, I would say that I can expect to identify the right $i$ out of a sample of $N=50$. Is that right? What is the actual relation between accuracy and $N$, given some information as quantified by the relative entropy? Is there a way of quantifying this?
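
For the toy Gaussian model above the posterior is also Gaussian, so $D_{KL}(p(\theta|y)\,||\,\pi(\theta))$ has a closed form, and averaging it over the marginal distribution of $y$ gives the mutual information $I(\theta; y)$, which I think is the natural "on average" version of the quantity I mean. A rough sketch (same assumed $\tau$ and $\sigma$ as above):

```python
import numpy as np

tau, sigma = 1.0, 0.3                       # assumed prior sd and noise sd, as above
rng = np.random.default_rng(1)

def kl_gauss(mu1, var1, mu2, var2):
    """D_KL( N(mu1, var1) || N(mu2, var2) ), in nats."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def kl_posterior_vs_prior(y):
    """D_KL( p(theta|y) || pi(theta) ) for this conjugate Gaussian model."""
    var_post = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
    mu_post = var_post * y / sigma**2
    return kl_gauss(mu_post, var_post, 0.0, tau**2)

# Average over the marginal y ~ N(0, tau^2 + sigma^2): this is the mutual information I(theta; y).
ys = rng.normal(0.0, np.sqrt(tau**2 + sigma**2), size=200_000)
I = kl_posterior_vs_prior(ys).mean()
print("I(theta; y) = %.3f nats, exp(I) = %.2f" % (I, np.exp(I)))
print("analytic:     %.3f nats" % (0.5 * np.log1p(tau**2 / sigma**2)))
```

The exponential of this number (in nats) is the "factor" I am describing, but whether it really corresponds to a sample size $N$ at which identification still works is exactly what I am unsure about.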

Another way to go about this might be to look at the distributions of the evidences given a wrong datapoint and a right datapoint, but I am also unsure how to go about this.
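
For concreteness, here is a rough sketch of what I mean, again in the assumed toy Gaussian model from above: simulate the log-evidence of the true sample and of unrelated prior draws, and, since the $N-1$ wrong samples are independent draws from the prior, the chance of picking the right one is the chance that none of the wrong ones beats it:

```python
import numpy as np

tau, sigma = 1.0, 0.3                                # same assumed toy model as above
rng = np.random.default_rng(2)
M, K = 20_000, 200                                   # M simulated datasets, K wrong samples each

theta_true = rng.normal(0.0, tau, size=M)            # the true generating parameters
y = rng.normal(theta_true, sigma)                    # one observation per dataset
theta_wrong = rng.normal(0.0, tau, size=(M, K))      # competing (wrong) prior draws

ll_true = -0.5 * ((y - theta_true) / sigma) ** 2
ll_wrong = -0.5 * ((y[:, None] - theta_wrong) / sigma) ** 2

# Per dataset: probability that a single wrong sample does NOT beat the true one.
p_single = (ll_wrong < ll_true[:, None]).mean(axis=1)

# With N - 1 independent wrong competitors: P(correct) ~ E[ p_single^(N - 1) ].
for N in (2, 5, 10, 50):
    print(N, np.mean(p_single ** (N - 1)))
```

But I do not see how (or whether) this kind of argument connects to the relative-entropy picture above, which is really what I am after.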

I feel like stuff like this must have been done before, so relevant references would also be appreciated!

Ewoud
  • Interesting question! You could have a look at my answer at https://stats.stackexchange.com/questions/188903/intuition-on-the-kullback-leibler-kl-divergence/189758#189758, I think there is a connection ... – kjetil b halvorsen Dec 26 '20 at 14:35

0 Answers