Is there any advantage of fitting a distribution to the data to calculate the percentile?

Question

Method 1: Calculating a percentile, e.g. the 99th percentile of the data, is straightforward: the calculation is based only on the ordering of the data values.

Method 2: A more complicated way of calculating a percentile is to first fit a distribution to the data (e.g. if we know the data are normal, we fit a normal distribution, or we use a non-parametric kernel density estimate), and then evaluate the inverse CDF of the fitted distribution to get the 99th percentile.
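To make the two methods concrete, here is a minimal sketch using only the Python standard library. The data are simulated from a normal distribution with made-up parameters (mean 10, standard deviation 2), so the sample, the seed, and the variable names are all assumptions for illustration, not part of the original question. Note that `NormalDist.from_samples` uses the sample standard deviation rather than the maximum-likelihood estimate; for a sample of this size the difference is negligible.

```python
import random
from statistics import NormalDist, quantiles

# Hypothetical data: 500 draws from N(10, 2^2). Assumed for illustration only.
random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(500)]

# Method 1: empirical 99th percentile, based only on the ordered data values.
# quantiles(..., n=100) returns the 99 cut points; index 98 is the 99th percentile.
p99_empirical = quantiles(data, n=100)[98]

# Method 2: fit a normal distribution, then evaluate its inverse CDF at 0.99.
fitted = NormalDist.from_samples(data)
p99_fitted = fitted.inv_cdf(0.99)

print(f"Method 1 (empirical): {p99_empirical:.3f}")
print(f"Method 2 (normal fit): {p99_fitted:.3f}")
```

Both numbers estimate the same quantity; they differ in how much they rely on the few largest observations versus the assumed shape of the distribution.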

I am wondering whether there is any advantage to the latter method. My two guesses:

Inferring the percentiles from a fitted distribution may be more robust, since the Method 1 result is more sensitive to changes in the data?
With Method 2 we can also read off the probability of a value occurring from the fitted distribution, whereas with Method 1 we can't?

score 3 · Accepted Answer · answered Mar 05 '19 at 17:42


Fitting a distribution to the data first (and then getting the quantiles from the fitted distribution) is a form of smoothing, or regularization. If you are reasonably confident that your model is adequate, this will typically lead to better estimates, especially if you need estimates far out in the tail.
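The smoothing effect can be illustrated with a small simulation. The sketch below assumes the data really are normal (standard normal, sample size 200, 2000 repetitions; all of these settings are arbitrary choices for illustration) and compares the root-mean-square error of the two estimators of the 99th percentile.

```python
import random
from math import sqrt
from statistics import NormalDist, quantiles

random.seed(1)
true_q = NormalDist(0.0, 1.0).inv_cdf(0.99)  # true 99th percentile of N(0, 1)

n, reps = 200, 2000  # assumed sample size and number of repetitions
sq_err_emp = 0.0
sq_err_fit = 0.0
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    # Order-statistic estimate of the 99th percentile.
    emp = quantiles(sample, n=100)[98]
    # Model-based estimate: fit a normal, then invert its CDF.
    fit = NormalDist.from_samples(sample).inv_cdf(0.99)
    sq_err_emp += (emp - true_q) ** 2
    sq_err_fit += (fit - true_q) ** 2

rmse_emp = sqrt(sq_err_emp / reps)
rmse_fit = sqrt(sq_err_fit / reps)
print(f"RMSE, empirical percentile: {rmse_emp:.3f}")
print(f"RMSE, fitted normal:        {rmse_fit:.3f}")
```

Under these assumptions the fitted estimate has the smaller error, because it uses the whole sample to estimate two parameters rather than relying on the one or two largest observations. The advantage depends on the model being right: if the true distribution has heavier tails than the fitted one, the model-based quantile can be biased no matter how large the sample is.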


kjetil b halvorsen

77,844

Thanks - do you know of any reference I can cite that examines this question? – Realhermit Oct 13 '22 at 20:18

