This thread offers a solution for random multisampling with predefined probabilities. Here's an adaptation:

set.seed(1)
table(samp <- sample(1:3, size = nrow(mtcars), prob = c(0.3, 0.3, 0.4), replace = TRUE))
#> 
#>  1  2  3 
#>  9  8 15

nrow(mtcars) # 32
#> [1] 32
set1 <- mtcars[samp == 1, ]
set2 <- mtcars[samp == 2, ]
set3 <- mtcars[samp == 3, ]

# disp means
mean(set1$disp)
#> [1] 263.9889
mean(set2$disp)
#> [1] 232.1625
mean(set3$disp)
#> [1] 209.9933

Created on 2022-04-29 by the reprex package (v2.0.1)

I need to draw these samples in a way that makes the mean of disp in each set as close as possible to the others.
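
One way this could be approached, as a rough sketch rather than a definitive solution: draw many candidate splits with the same probabilities and keep the one whose disp means are closest together. The names `n_tries`, `best_samp` and `best_spread` are illustrative, and using the range of the group means as the closeness measure is an assumption. Because the best of many draws is selected, the resulting split is no longer a purely random one.

```r
# Sketch: brute-force search over random splits, keeping the split whose
# group means of disp have the smallest range (max mean - min mean).
set.seed(1)
probs   <- c(0.3, 0.3, 0.4)  # same probabilities as in the reprex above
n_tries <- 5000              # illustrative number of candidate splits

best_samp   <- NULL
best_spread <- Inf

for (i in seq_len(n_tries)) {
  cand <- sample(1:3, size = nrow(mtcars), prob = probs, replace = TRUE)

  # Skip the (unlikely) candidates in which one of the sets is empty
  if (length(unique(cand)) < 3) next

  means  <- tapply(mtcars$disp, cand, mean)
  spread <- max(means) - min(means)

  # Keep this candidate if its means are closer together than the best so far
  if (spread < best_spread) {
    best_spread <- spread
    best_samp   <- cand
  }
}

table(best_samp)                        # set sizes of the chosen split
tapply(mtcars$disp, best_samp, mean)    # disp mean per set
best_spread                             # range of those means
```

The chosen split can then be used exactly like `samp` above, for example `mtcars[best_samp == 1, ]`.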

Alberson Miranda
