Suppose that I have a list of numbers, say [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]. These numbers are realisations of a random variable $X$ whose distribution I am interested in.
Suppose I want to estimate the 5th percentile of this distribution (to form a confidence interval for $X$). Intuitively, I would take the lowest number (i.e. 1) as my estimate, since 0% of the numbers are lower and 10% are lower or equal (equivalently, 90% are higher), which in some sense 'averages out' to the desired 5%. However, when I use numpy.percentile with the default settings, it gives an estimate of 1.45. Which estimate is better, and why?
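For reference, the 1.45 can be reproduced by hand. The sketch below assumes that the default is the linear-interpolation rule described in the NumPy documentation, where the percentile sits at position $(n-1)q/100$ among the sorted values, counting from zero. The helper name `percentile_linear` is made up for illustration and is not part of NumPy.

```python
import math


def percentile_linear(data, q):
    """Percentile by linear interpolation between adjacent order statistics.

    Hypothetical helper, assumed to mirror NumPy's default "linear" rule.
    """
    xs = sorted(data)
    if not xs:
        raise ValueError("data must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must lie in [0, 100]")
    h = (len(xs) - 1) * q / 100.0   # fractional position, 0-based
    lo = math.floor(h)              # index of the order statistic just below
    hi = min(lo + 1, len(xs) - 1)   # index just above, capped at the last element
    frac = h - lo                   # how far to move from xs[lo] towards xs[hi]
    return xs[lo] + frac * (xs[hi] - xs[lo])


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
p5 = percentile_linear(data, 5)
p95 = percentile_linear(data, 95)
print(p5, p95)
```

With ten points the position for the 5th percentile is $9 \times 0.05 = 0.45$, so the result lies 45% of the way from the smallest value, 1, to the next value, 2. Under this rule the sample minimum is the answer only for the 0th percentile.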
Update: To clarify, my goal is to estimate the interval $I=[x,y]$ with $y > x$ such that $\mathbb{P}(X \in I)=0.95$. The numbers $x$ and $y$ are the 5th and 95th percentiles respectively of the distribution of $X$.
numpy.percentile in Python. (I assume that's what you're using.) There are several options for methods that it uses to calculate percentiles. numpy.org/doc/stable/reference/generated/numpy.percentile.html – Sal Mangiafico Apr 19 '23 at 11:18
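A quick way to see how much the choice of method matters on this sample is to compute the 5th percentile under a few of them. This is only a sketch: it assumes NumPy 1.22 or later, where the keyword is called `method` (older releases used `interpolation`), and the three method names are a selection from the linked documentation, not a recommendation.

```python
import numpy as np

data = np.arange(1, 11)  # the sample 1, 2, ..., 10 from the question

# Selected methods from the numpy.percentile documentation (assumes NumPy >= 1.22).
methods = ("linear", "lower", "inverted_cdf")
results = {m: float(np.percentile(data, 5, method=m)) for m in methods}

for name, value in results.items():
    print(f"{name:>12}: {value}")
```

On this sample, "linear" gives the 1.45 from the question, while "lower" and "inverted_cdf" both return the sample minimum, 1, which matches the intuitive estimate.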