0

I have a table of ~600 numbers and percentiles like below:

Data 40th 50th 60th 70th 80th 90th
71 53 55 57 60 63 68
101 78 81 83 88 92 102
121 89 94 98 105 110 123
209 146 153 163 171 176 189
68 71 74 77 81 85 92
... ... ... ... ... ... ...

What I want to know is what would be the best way to determine which percentile the 'Data' column falls closest to. On first glance, it looks to be the 90th percentile.

I calculated the percent difference between each Data number and each percentile in their own columns, then I took the average of each of those columns (only considering percent differences in the range -100% <= x <= 100%) to see which average percent difference was the smallest. I again think that the 90th percentile is what closest aligns since I ended up with 2% average % difference.

40th % Diff 50th % Diff 60th % Diff 70th % Diff 80th % Diff 90th % Diff
-25% -23% -20% -15% -11% -4%
-23% -20% -18% -13% -9% 1%
-26% -22% -19% -13% -9% 2%
-30% -27% -22% -18% -16% -10%
4% 9% 13% 19% 25% 35%
... ... ... ... ... ...
Percentile Average % Diffs
40th -21%
50th -18%
60th -14%
70th -10%
80th -6%
90th 2%

Or is my methodology off? Is there a better way of trying to determine which percentile is closest to the 'Data' column?

  • It all comes down to how you measure "closeness." It could be the absolute difference, a relative difference, a difference in percentiles, and many other possibilities. Could you tell us what measure is suitable for your application? – whuber Dec 21 '22 at 19:09
  • @whuber Not entirely sure, was given this dataset and told to "determine which percentile the data column closely aligns with" and my first guess was to look at the differences between percentiles. Perhaps is there a way to calculate the exact percentile of the 'Data' column, like maybe it's actually the 89th or 91st percentile? – Buracku Dec 21 '22 at 19:37
  • I suggest returning to your client for clarification about what they mean. – whuber Dec 21 '22 at 20:03

0 Answers0