
How do I select a random row from the database based on the probability (chance) assigned to each row?
Example:

Make        Chance  Value
ALFA ROMEO  0.0024  20000
AUDI        0.0338  35000
BMW         0.0376  40000
CHEVROLET   0.0087  15000
CITROEN     0.016   15000
........

How do I select a random make name and its value based on the probability that it should be chosen?

Would a combination of rand() and ORDER BY work? If so, what is the best way to do this?

Drew
Dharman

1 Answer


You can do this by combining rand() with a cumulative sum. Assuming the chances add up to 1 (100%):

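-- Accumulate chance into a running total (cumep) and keep the row
-- whose interval [cumep - chance, cumep] contains the random number @r.
select t.*
from (select t.*, (@cumep := @cumep + chance) as cumep
      from t cross join
           (select @cumep := 0, @r := rand()) params  -- initialized once
     ) t
where @r between cumep - chance and cumep
limit 1;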
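
To make the cumulative-sum idea concrete outside SQL, here is a minimal Python sketch using the five sample rows from the question. The function name pick_by_cumulative and the fixed value of r are illustrative assumptions, not part of the original answer. The sketch uses a half-open interval (r < cumulative), a small departure from the inclusive between in the query, so a boundary tie cannot occur. Because the sample table is truncated, these five chances do not sum to 1, and any r above their total returns None.

```python
def pick_by_cumulative(rows, r):
    """Return (make, value) for the first row whose running total exceeds r.

    rows is an iterable of (make, chance, value) tuples; r is a number
    in [0, 1). Returns None if r is beyond the sum of all chances.
    """
    cumulative = 0.0
    for make, chance, value in rows:
        cumulative += chance
        if r < cumulative:
            # Stop at the first matching row instead of scanning the rest.
            return make, value
    return None


# Sample rows from the question (the full table is elided there).
rows = [
    ("ALFA ROMEO", 0.0024, 20000),
    ("AUDI", 0.0338, 35000),
    ("BMW", 0.0376, 40000),
    ("CHEVROLET", 0.0087, 15000),
    ("CITROEN", 0.016, 15000),
]

# r = 0.03 falls between 0.0024 and 0.0362, which is AUDI's interval.
print(pick_by_cumulative(rows, 0.03))  # ('AUDI', 35000)
```

In real use, r would come from a single random draw per selection, mirroring the single rand() call in the query.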

Notes:

  • rand() is called once, in a subquery that initializes a variable. Calling rand() for every row would compare each row against a different random number, which is not desirable.
  • There is a remote chance that the random number will fall exactly on the boundary between two rows, so both match. The limit 1 arbitrarily keeps one of them.
  • This could be made more efficient by stopping the subquery when cumep > @r.
  • The values do not have to be in any particular order.
  • This can be modified to handle chances whose sum is not equal to 1, but that would be another question.
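
One common way to handle chances that do not sum to 1 is to scale the random number by the total of all chances before comparing. The following Python sketch shows that idea under stated assumptions; it is not the answer's own method. The name pick_scaled, the u parameter, and the two-row weights list are hypothetical and exist only for illustration.

```python
import random


def pick_scaled(rows, u=None):
    """Pick a name with probability proportional to its weight.

    rows is a list of (name, weight) pairs with non-negative weights.
    u is a number in [0, 1); if omitted, one random draw is made.
    """
    total = sum(weight for _, weight in rows)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    if u is None:
        u = random.random()  # a single draw, reused for every comparison
    target = u * total  # scale the draw to the actual sum of weights
    cumulative = 0.0
    for name, weight in rows:
        cumulative += weight
        if target < cumulative:
            return name
    # Guard against floating-point rounding at the very top of the range.
    return rows[-1][0]


# Hypothetical weights summing to 4, so "A" should win 1 time in 4.
weights = [("A", 1.0), ("B", 3.0)]
print(pick_scaled(weights, 0.2))  # A
print(pick_scaled(weights, 0.3))  # B
```

In application code, Python's standard random.choices(population, weights=...) performs the same kind of weighted pick without a hand-written loop.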
Gordon Linoff