
I have the 10th, 25th, 50th, 75th, and 90th percentiles of a population that is NOT normally distributed. I also have a sample from that population, and I want to test whether its percentiles are equal to the population percentiles:

H0: nth percentile of sample = nth percentile of the population

I know I can use a one-sample Wilcoxon Rank-Sum Test to test the median, but I was wondering whether it can also be used to test the other percentiles.

Hossein
  • Since you have more than two groups (e.g., you describe 5), and the rank sum test is strictly for 2 groups, I wonder if you would be better served by an omnibus test, like a Kruskal-Wallis (e.g., with Dunn's test or the Conover-Iman test for post hoc pairwise comparisons)? However, the K-W assumes group independence and I wonder if groups defined by percentiles may have some dependency? Also: Welcome to CV, Hossein Ayouqi! – Alexis Dec 15 '22 at 19:00
  • Some thoughts: The Wilcoxon signed rank test is not normally a test of the median, so it may not really be an option to modify it to address other percentiles. I think one option would be to use a one-sample sign test, modified to address other percentiles. I think this would work simply, but I'd have to think about it to be sure. Another option is to set up a permutation test for the desired percentile. – Sal Mangiafico Dec 15 '22 at 19:42
  • Assuming the reference distribution is continuous at these five percentiles, they partition the population into six nonoverlapping bins. A sufficient statistic for this test therefore is the vector of the six bin counts for the sample. The null hypothesis is that this is multinomial with probability vector $(0.10,0.15,0.25,0.25,0.15,0.10).$ The chi-squared statistic is one good (and simple) way to test this against all alternatives. – whuber Dec 15 '22 at 19:54
  • @whuber, that looks interesting and doable, as my data is discrete. Are there any caveats to using it? – Hossein Dec 15 '22 at 21:26
  • Yes: if there are ties at the percentiles, you need additional information, because now the bins overlap. Another caveat, but not much of one, is that if your sample size is less than 50 or so, you need to be careful about using the chi-squared distribution to compute p-values (because some of the expected bin populations will be less than 5). But that's fairly easy to overcome by simulating the null distribution. – whuber Dec 15 '22 at 21:47
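
The approach described in the comments above (six bin counts, a multinomial null, a chi-squared statistic, and a simulated null distribution for small samples) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `bin_counts`, `chi2_stat`, and `percentile_gof_test` are hypothetical, the example data are made up, and a sample value that ties with a percentile is arbitrarily put in the lower bin, which does not resolve the tie caveat raised above.

```python
import numpy as np
from scipy import stats

# Null bin probabilities implied by the 10th, 25th, 50th, 75th and 90th percentiles.
NULL_PROBS = np.array([0.10, 0.15, 0.25, 0.25, 0.15, 0.10])


def bin_counts(sample, pop_percentiles):
    """Count sample values in the six bins cut by the five population percentiles.

    Assumption: a value exactly equal to a cut point goes into the lower bin.
    """
    cuts = np.asarray(pop_percentiles, dtype=float)
    idx = np.searchsorted(cuts, np.asarray(sample, dtype=float), side="left")
    return np.bincount(idx, minlength=len(cuts) + 1)


def chi2_stat(counts, probs=NULL_PROBS):
    """Pearson chi-squared statistic; works on one count vector or a 2-D batch."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum(axis=-1, keepdims=True) * probs
    return ((counts - expected) ** 2 / expected).sum(axis=-1)


def percentile_gof_test(sample, pop_percentiles, n_sim=10_000, seed=0):
    """Return (statistic, asymptotic p-value, Monte Carlo p-value)."""
    counts = bin_counts(sample, pop_percentiles)
    n = int(counts.sum())
    stat = float(chi2_stat(counts))

    # Asymptotic p-value: chi-squared with (6 bins - 1) = 5 degrees of freedom.
    p_asymptotic = float(stats.chi2.sf(stat, df=len(NULL_PROBS) - 1))

    # Simulated null distribution: draw multinomial counts under H0.
    rng = np.random.default_rng(seed)
    sims = rng.multinomial(n, NULL_PROBS, size=n_sim)
    sim_stats = chi2_stat(sims)
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    p_mc = (1 + np.sum(sim_stats >= stat - 1e-9)) / (n_sim + 1)
    return stat, p_asymptotic, float(p_mc)


if __name__ == "__main__":
    # Hypothetical population percentiles and a made-up sample of size 40.
    pop_percentiles = [10, 25, 50, 75, 90]
    sample = np.random.default_rng(1).integers(0, 101, size=40)
    stat, p_asym, p_mc = percentile_gof_test(sample, pop_percentiles)
    print(f"chi2 = {stat:.3f}, asymptotic p = {p_asym:.3f}, simulated p = {p_mc:.3f}")
```

With a sample this small, some expected bin counts are below 5 (for example 40 × 0.10 = 4), so the simulated p-value is probably the safer of the two to report.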

0 Answers