A lot of machine learning models aim to approximate probability distributions. Let’s say P is the distribution of the data and Q is the distribution learned by our model. How do you measure how close Q is to P?
Please include sources/references if possible.
This question is from Chip Huyen's ML interviews book.
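To make the setup concrete, here is a minimal sketch of two measures that often come up for this kind of comparison: the Kullback-Leibler (KL) divergence and the Jensen-Shannon (JS) divergence. It assumes P and Q are discrete distributions over the same finite set of outcomes, given as lists of probabilities. The function names and the example values of `p` and `q` are made up for illustration and are not from the book.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # by convention, 0 * log(0 / q) contributes 0
        if qi == 0.0:
            return math.inf  # P puts mass where Q puts none
        total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Jensen-Shannon divergence: a symmetrized, bounded variant built from KL."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture of P and Q
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example distributions (illustrative values only).
p = [0.5, 0.3, 0.2]  # "data" distribution P
q = [0.4, 0.4, 0.2]  # "model" distribution Q

print("KL(P || Q):", kl_divergence(p, q))
print("KL(Q || P):", kl_divergence(q, p))
print("JS(P, Q):  ", js_divergence(p, q))
```

Note that KL divergence is not symmetric, so KL(P || Q) and KL(Q || P) generally differ, and it becomes infinite when Q assigns zero probability to an outcome that P considers possible. JS divergence is symmetric and, with natural logarithms, bounded above by log 2.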