The key to answering your question is in this part of the article:
> The asymptotic distribution of the $p$-th sample quantile is well-known: it is asymptotically normal around the $p$-th population quantile with variance equal to
> $$ \frac{p(1-p)}{N f(x_p)^2} $$
Now what does this mean? First, let's set up an experiment. Given a probability distribution $X$ that we can draw samples from, a fixed real number $0 < p < 1$, and an integer $N$: draw $N$ samples from $X$, sort them, and take the $(N \times p)$-th sample as the $p$-th quantile. For simplicity, if the result is an interval (i.e., $Np \notin \mathbb{N}$ and the $\lfloor Np \rfloor$-th sample differs from the $\lceil Np \rceil$-th sample), let's just go with the middle of the interval, i.e., the average of those two samples. Record that value as the result of the experiment.
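Here is a minimal NumPy sketch of one run of this experiment (the helper name `experiment` and the details of the midpoint rule are mine, and it assumes $N \times p \geq 1$):

```python
import numpy as np

rng = np.random.default_rng(0)

def experiment(draw, N, p):
    """One run of the experiment: draw N samples, sort them, and read off
    the (N*p)-th sample, averaging the floor- and ceil-indexed samples
    when N*p is not an integer (the midpoint rule described above)."""
    s = np.sort(draw(N))
    k = N * p                         # 1-based target position; assumes N*p >= 1
    lo = s[int(np.floor(k)) - 1]
    hi = s[int(np.ceil(k)) - 1]
    return (lo + hi) / 2              # collapses to a single sample when k is integral

# e.g. one estimate of the median of a standard normal from N = 1000 draws
print(experiment(rng.standard_normal, 1000, 0.5))
```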
What the quote from the article states is this: if we know the $p$-th quantile $x_p$ of $X$ (again taking the middle of the interval if needed), choose a large enough $N$ (hence the term "asymptotic"), and repeat the above experiment many times, then the results of the different experiments will be distributed approximately normally around $x_p$ with the variance above, provided that the "true" $p$-th quantile $x_p$, a.k.a. the $p$-th population quantile, is a possible output of the distribution ($f(x_p) \neq 0$).
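To see the claim in action, here's a quick simulation (reusing the `experiment` helper above) for a standard normal $X$ and $p = 0.5$, where $x_p = 0$ and $f(x_p) = 1/\sqrt{2\pi}$; the empirical spread of the results should come out close to the asymptotic formula:

```python
# Repeat the experiment many times and compare the spread of the results
# with the asymptotic standard deviation sqrt(p(1-p) / (N f(x_p)^2)).
N, p, runs = 10_000, 0.5, 2_000
results = np.array([experiment(rng.standard_normal, N, p) for _ in range(runs)])

f_xp = 1 / np.sqrt(2 * np.pi)                  # normal density at the median
predicted_sd = np.sqrt(p * (1 - p) / (N * f_xp**2))
print(f"empirical sd:  {results.std():.5f}")   # should nearly match...
print(f"asymptotic sd: {predicted_sd:.5f}")    # ...the theoretical value
```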
This means that there are three possible cases:
- If the population $p$-th quantile $x_p$ has a finite, non-zero probability density in the underlying distribution of the population ($0 < f(x_p) < \infty$), the variance shrinks like $1/N$, so the results of the experiment get closer to $x_p$ as we increase the sample size.
- If $f(x_p)$ is infinite, as at an atom of a discrete distribution (e.g., the median of the discrete uniform distribution over $\{ 1, 2, 3 \}$), the asymptotic variance is zero! This means that for a large enough $N$, almost all experiments will return exactly $x_p$.
- If $f(x_p) = 0$, however, as in your example with the discrete distribution over $\{ 1, 2, 3, 4 \}$, the statement offers nothing! Looking closer at the problem, we can see that in this case we would only land on $x_p$, the middle of the interval, if our sample happened to contain (almost) exactly the same number of members from either side of the interval. In nearly every other case, the result of the experiment is one of the endpoints of the interval, so the results of our experiments indeed do not get closer to $x_p$ as we increase $N$ (the sketch after this list demonstrates it). Note that whatever strategy we had chosen for handling the intervals, we would have faced a similar problem in this case.
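A small simulation contrasting the last two cases, again reusing the `experiment` helper (the die-drawing lambdas are just illustrative):

```python
# Median (p = 0.5) of the two discrete uniform examples. Over {1, 2, 3}
# the estimates pile up on x_p = 2; over {1, 2, 3, 4} they keep landing
# on the endpoints 2 and 3 almost every run, however large N gets.
N, p, runs = 10_001, 0.5, 1_000
die3 = lambda n: rng.integers(1, 4, n)    # uniform over {1, 2, 3}
die4 = lambda n: rng.integers(1, 5, n)    # uniform over {1, 2, 3, 4}

for name, draw in [("{1,2,3}", die3), ("{1,2,3,4}", die4)]:
    estimates = [experiment(draw, N, p) for _ in range(runs)]
    values, counts = np.unique(estimates, return_counts=True)
    print(name, dict(zip(values.tolist(), counts.tolist())))
```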
Regarding the nine methods mentioned in the article: it's worth noting that in the asymptotic case, i.e., a constant $p$ (say $0.01$) and a very large $N$ (say millions), a floor/ceiling operator and/or a $+\frac{3}{8}$ correction don't really matter, as their effects pale in comparison to the magnitude of the $N \times p$ term. In other words, all the methods boil down to the same thing if $N$ is large enough.
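As a quick check of that claim, assuming the article's nine methods are the Hyndman-Fan ones (NumPy ≥ 1.22 exposes them in `np.quantile` under the names below), the nine estimates of the $0.01$-quantile of a standard normal agree to several decimal places for $N$ in the millions:

```python
# The nine Hyndman-Fan estimators, as named by np.quantile's `method` keyword.
methods = ["inverted_cdf", "averaged_inverted_cdf", "closest_observation",
           "interpolated_inverted_cdf", "hazen", "weibull", "linear",
           "median_unbiased", "normal_unbiased"]
x = rng.standard_normal(2_000_000)        # N in the millions, p = 0.01
for m in methods:
    print(f"{m:>26}: {np.quantile(x, 0.01, method=m):.6f}")
```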
Everything discussed in this answer pertains to the asymptotic case, where $N$ is very large. For a small number of samples from an arbitrary distribution, a similar analysis would be more complicated and, I believe, less applicable to "interesting" real-life use cases.
I hope this helps :)