A lot of machine learning models aim to approximate probability distributions. Let’s say P is the distribution of the data and Q is the distribution learned by our model. How do you measure how close Q is to P?

Please include sources/references if possible.

This question is from Chip Huyen's ML interviews book.

  • It may depend on whether you are dealing with univariate or multivariate distributions and whether the random variables are continuous or discrete. Wikipedia has a list of statistical distances – Henry Jul 22 '23 at 16:44
  • and there are answers on this site such as https://stats.stackexchange.com/questions/425040/how-to-measure-the-statistical-distance-between-two-frequency-distributions and https://stats.stackexchange.com/questions/4044/measuring-the-distance-between-two-multivariate-distributions and https://stats.stackexchange.com/questions/78405/measuring-distance-between-two-empirical-distributions – Henry Jul 22 '23 at 16:44
  • https://stats.stackexchange.com/questions/76350/goodness-of-fit-for-continuous-variables might help as well. – jbowman Jul 22 '23 at 16:44
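
To make the question concrete, here is a minimal sketch of two commonly used statistical distances for the simplest case: discrete P and Q over the same finite set of outcomes. The choice of KL divergence and total variation distance is an assumption for illustration, not something the question or the book prescribes, and the function names and example probabilities are made up. Continuous or multivariate distributions would need other estimators.

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) in nats for discrete distributions given as probability lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0.0:
            return math.inf  # P puts mass where Q puts none
        total += pi * math.log(pi / qi)
    return total

def total_variation(p, q):
    """Total variation distance: half the L1 distance between the probability vectors."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical example: P is the data distribution, Q is the model's estimate.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print("KL(P || Q):", kl_divergence(p, q))
print("KL(Q || P):", kl_divergence(q, p))  # KL is not symmetric
print("TV(P, Q):  ", total_variation(p, q))
```

Note that KL divergence is asymmetric and can be infinite when Q assigns zero probability to an outcome that P allows, while total variation is symmetric and always lies between 0 and 1.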
