It is trivial to create a boxplot in R from a full dataset. However, I have limited access to the data and only know five values: the minimum, the 25%, 50%, and 75% quantiles, and the maximum. Is there an easy way to reproduce the boxplot from only these five values?
1 Answer
It's still pretty trivial. You can't faithfully reproduce the whiskers of a default boxplot if the minimum and maximum values fall outside Tukey's fences, but the box itself should remain essentially unaltered. E.g., with `x=rnorm(9999)`, compare `boxplot(x)` vs. `boxplot(quantile(x))`:
$\leftarrow$ full dataset vs. your five values $\rightarrow$
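If you want the whiskers to reach the known minimum and maximum, rather than having those points drawn as outliers, here is a minimal sketch. The five values in `five` are made up for illustration; substitute your own. It relies on the `range` argument of `boxplot()`, where a value of zero extends the whiskers to the data extremes.

```r
# Hypothetical five-number summary (min, Q1, median, Q3, max);
# these values are illustrative only, replace them with your own.
five <- c(-3.9, -0.7, 0.0, 0.7, 3.8)

# With the default range = 1.5, the min and max here lie beyond the fences,
# so they would be plotted as outlier points. range = 0 extends the
# whiskers to the extremes instead.
boxplot(five, range = 0, main = "Boxplot from five summary values")

# The statistics that boxplot() draws can be inspected without plotting;
# coef = 0 in boxplot.stats() corresponds to range = 0 in boxplot().
stats <- boxplot.stats(five, coef = 0)$stats
default_out <- boxplot.stats(five)$out  # points flagged under the default fences
```

Whether whiskers at the extremes are appropriate depends on what you want to show; the original plot's outlier points cannot be recovered from five values alone.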
Nick Stauner
`boxplot(fivenum(x))` is a lot shorter than `boxplot(c(min(x),quantile(x,c(.25,.5,.75)),max(x)))` (though if the quartiles don't match the definition of hinge in `fivenum` that might not be suitable) – Glen_b Apr 02 '14 at 05:20
See the edit to my comment; your original has the advantage of being able to use any of the 9 definitions of quantiles. Then again, `boxplot(quantile(x))` would work in place of `fivenum` and probably matches the original post better; I'm just used to associating boxplots with `fivenum`. – Glen_b Apr 02 '14 at 05:25
Yep. In large samples like that you can't see any difference between them, of course – Glen_b Apr 02 '14 at 05:47
boxplot(your.five.data.points)? – Nick Stauner Apr 01 '14 at 23:44