I have a real-valued, unknown distribution $\mu$ and would like to find the largest threshold $t \in \mathbb{R}$ such that $\Pr_{X \sim \mu}\left[X \leq t\right] \leq q$ with high probability $1-\alpha$, where $q$ is some pre-specified value $\in [0,1]$. Importantly, $t$ is not a fixed value, but a value that should be chosen based on sampled data.

In other words, I would like to obtain a probabilistic lower bound on $F^{-1}(q)$, where $F^{-1}$ is the quantile function / inverse cdf of $\mu$.

What would be an appropriate way of obtaining such a $t$ via sampling?

I believe that "How to obtain a confidence interval for a percentile?" discusses a version of my problem, but it is concerned with two-sided confidence intervals.

My approach would be to

  1. Take $N$ samples from distribution $\mu$.
  2. Sort them in ascending order $X_1 \leq X_2 \leq X_3 \leq \dots \leq X_N$.
  3. Compute the maximum of the critical region of the one-sided binomial test, i.e. the largest $m$ such that $\mathrm{BT}(p>q, m, N) < \alpha$, where $p>q$ is the null hypothesis.
  4. Return $X_m$.
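
A minimal sketch of the steps above in Python, assuming i.i.d. samples and a continuous $\mu$. The function name `lower_quantile_bound` and the use of `scipy.stats.binom` are my own choices for illustration. The sketch relies on the identity $\Pr[X_{(m)} > F^{-1}(q)] = \Pr[\mathrm{Bin}(N, q) \leq m-1]$, so the index it picks may differ by one from step 3, depending on how the critical region of the test is defined.

```python
import numpy as np
from scipy.stats import binom


def lower_quantile_bound(samples, q, alpha):
    """One-sided (1 - alpha) lower confidence bound on the q-quantile.

    Assumes i.i.d. samples from a continuous distribution (an assumption
    of this sketch). Returns None if the sample is too small for any
    order statistic to qualify.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    # P(X_(m) > F^{-1}(q)) = P(Bin(n, q) <= m - 1); we need this to be <= alpha.
    ks = np.arange(n)  # k = m - 1 ranges over 0, ..., n - 1
    ok = binom.cdf(ks, n, q) <= alpha
    if not ok.any():
        return None  # even the smallest sample is not a valid bound
    k = ks[ok].max()  # the CDF is nondecreasing in k, so valid k form a prefix
    return x[k]  # 0-based index k is the (k + 1)-th smallest sample


# Example usage with synthetic data (standard normal chosen only for illustration).
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)
print(lower_quantile_bound(samples, q=0.1, alpha=0.05))
```

If the function returns `None`, then $(1-q)^N > \alpha$ and more samples would be needed before any order statistic can serve as the bound.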

Is that approach correct? Is there any existing literature that discusses this approach or this problem?

Thank you

  • This is called a non-parametric upper tolerance limit (not lower) and is solved in exactly the same manner as the reference you found. The solution actually is simpler, so you (and all readers) should find it straightforward to adapt that reference to your situation: just set the lower limit to $-\infty.$ The reference I gave in my answer deals with your situation, too. – whuber May 04 '22 at 17:46
  • Thank you, that answers my question :) – funky_capybara May 04 '22 at 17:53
