RNGCryptoServiceProvider - generate number in a range faster and retain distribution?

Question

I'm using the RNG crypto provider to generate numbers in a range the truly naive way:

byte[] bytes = new byte[4];
int result = 0;
while(result < min || result > max)
{
   RNG.GetBytes(bytes);
   result = BitConverter.ToInt32(bytes);
}

This is great when the range is wide enough such that there is a decent chance of getting a result, but earlier today I hit a scenario where the range is sufficiently small (within 10,000 numbers) that it can take an age.

So I've been trying to think of a better way that will achieve a decent distribution but will be faster. But now I'm getting into deeper maths and statistics that I simply didn't do at school, or at least if I did I have forgotten it all!

My idea is:

get the highest set bit positions of the min and max, e.g. for 4 it would be 3 and for 17 it would be 5
select a number of bytes from the prng that could contain at least the high bits, e.g.1 in this case for 8 bits
see if any of the upper bits in the allowed range (3-5) are set
if yes, turn that into a number up to and including the high bit
if that number is between min and max, return.
if any of the previous tests fail, start again.

Like I say, this could be exceedingly naive, but I am sure it will return a match in a narrow range faster than the current implementation. I'm not in front of a computer at the moment so can't test, will be doing that tomorrow morning UK time.

But of course speed isn't my only concern, otherwise I would just use Random (needs a couple of tick marks there to format correctly if someone would be kind enough - they're not on the Android keyboard!).

The biggest concern I have with the above approach is that I am always throwing away up to 7 bits that were generated by the prng, which seems bad. I thought of ways to factor them in (e.g. a simple addition) but they seem terribly unscientific hacks!

I know about the mod trick, where you only have to generate one sequence, but I also know about its weaknesses.

Is this a dead end? Ultimately if the best solution is going to be to stick with the current implementation I will, I just feel that there must be a better way!

`while(result < min && result > max) { } ` I guess there was meant to be `||` operator instead of `&&` between those two conditional operands. — AlexMelw, Sep 17 '17 at 20:02
yeah - have just edited. can't believe that wasn't picked up earlier. — Andras Zoltan, Nov 06 '17 at 21:01

score 65 · Accepted Answer · edited Nov 06 '17 at 20:59

Stephen Toub and Shawn Farkas has co-written an excellent article on MSDN called Tales From The CryptoRandom that you should definitely read if you're experimenting with RNGCryptoServiceProviders

In it they provide an implementation that inherits from System.Random (which contains the nice range-random method that you're looking for) but instead of using pseudo random numbers their implementation uses the RNGCryptoServiceProvider.

The way he has implemented the Next(min, max) method is as follows:

public override Int32 Next(Int32 minValue, Int32 maxValue)
{
    if (minValue > maxValue) 
        throw new ArgumentOutOfRangeException("minValue");
    if (minValue == maxValue) return minValue;
    Int64 diff = maxValue - minValue;
    while (true)
    {
        _rng.GetBytes(_uint32Buffer);
        UInt32 rand = BitConverter.ToUInt32(_uint32Buffer, 0);

        Int64 max = (1 + (Int64)UInt32.MaxValue);
        Int64 remainder = max % diff;
        if (rand < max - remainder)
        {
            return (Int32)(minValue + (rand % diff));
        }
    }
}

The reasoning for the choice of implementation as well as a detailed analysis about loss of randomness and what steps they are taking to produce high-quality random numbers is in their article.

Thread safe bufferred CryptoRandom

I've written an extended implementation of Stephen's class which utilized a random buffer in order to minimize any overhead of calling out to GetBytes(). My implementation also uses synchronization to provide thread safety, making it possible to share the instance between all your threads to make full use of the buffer.

I wrote this for a very specific scenario so you should of course profile whether or not is makes sense for you given the specific contention and concurrency attributes of your application. I threw the code up on github if you wan't to check it out.

Threadsafe buffered CryptoRandom based on Stephen Toub and Shawn Farkas' implementation

When I wrote it (a couple of years back) I seem to have done some profiling as well

Results produced by calling Next() 1 000 000 times on my machine (dual core 3Ghz)

System.Random completed in 20.4993 ms (avg 0 ms) (first: 0.3454 ms)
CryptoRandom with pool completed in 132.2408 ms (avg 0.0001 ms) (first: 0.025 ms)
CryptoRandom without pool completed in 2 sec 587.708 ms (avg 0.0025 ms) (first: 1.4142 ms)

|---------------------|------------------------------------|
| Implementation      | Slowdown compared to System.Random |
|---------------------|------------------------------------|
| System.Random       | 0                                  |
| CryptoRand w pool   | 6,6x                               |
| CryptoRand w/o pool | 19,5x                              |
|---------------------|------------------------------------|

Please note that theese measurements only profile a very specific non-real-world scenario and should only be used for guidance, measure your scenario for proper results.

Truly awesome, much better than my attempt! I can't believe Google-fu didn't bring up that article but, more importantly, a thread safe version will be of great help too. I'll download and have a play with it, but based on those profiles I can already see it is what I have been looking for. a very generous answer, thank you :) — Andras Zoltan, Jun 10 '11 at 06:47
using this has reduced the amount of time spent looking for a random number by at least a factor of 5. Still with an even distribution as well. I'm a happy man :) — Andras Zoltan, Jun 10 '11 at 15:05
@Andreas: Awesome! Really glad I could help although the majority of praise should be directed towards the two masterminds behind the MSDN article :) I've been using my implementation in production for quite a while now without encountering any bugs but if you find some you just let me know. — Markus Olsson, Jun 10 '11 at 15:12
Note that the `max` argument has a bit of a strange name here, as `min` is inclusive and `max` exclusive. So for a dice roll you would have to ask for the range [1, 7) - i.e. `max = 7` instead of `max = 6`. — Maarten Bodewes, Aug 09 '15 at 22:55
The MSDN article can be found here: [MSDNMagazineSeptember2007en-us.chm](http://download.microsoft.com/download/3/A/7/3A7FA450-1F33-41F7-9E6D-3AA95B5A6AEA/MSDNMagazineSeptember2007en-us.chm), in CHM form. (Columns, .Net Matters: Tales from the CryptoRandom) — CB-Dan, May 18 '16 at 15:45
Very late to this, but I'm pretty sure this code in the gist is a mistake: `if (IsRandomPoolEnabled && _buffer.Length <= buffer.Length)` - it checks that the *source* buffer is smaller than the *destination* buffer, rather than vice versa. Meaning it will only fill the destination buffer from the internal buffer if it's not actually able to. It will throw an exception before trying, though, since `if (requiredBytes > _buffer.Length)` in `EnsureRandomBuffer` does the exact same check. — JimmiTh, Sep 29 '16 at 13:19
Created a nuget package that uses this Crypto Secure Random number method https://www.nuget.org/packages/SecurePasswordGenerator/1.0.2. Packages uses the random number to generate random passwords — Paul Durbin, Jun 18 '19 at 20:09
Where does _uint32Buffer come from? I found this snippet incomplete and does not compile — Hoppe, Aug 05 '19 at 14:03

AlexMelw · Answer 2 · 2021-04-14T13:50:18.767

If you use a while loop, this is going to be slow and is based on an unknown number of iterations.

You could compute it on the first try using the modulo operator (%).

But, if we squeeze results with modulo, we immediately create an imbalance in the probability distributions.

This means that this approach could be applied if we care only about the speed, not probabilistic randomness of the generated number.

Here is a RNG utility that could suit your needs:

using System;
using System.Security.Cryptography;

static class RNGUtil
{
    /// <exception cref="ArgumentOutOfRangeException"><paramref name="min" /> is greater than <paramref name="max" />.</exception>
    public static int Next(int min, int max)
    {
        if (min > max) throw new ArgumentOutOfRangeException(nameof(min));
        if (min == max) return min;

        using (var rng = new RNGCryptoServiceProvider())
        {
            var data = new byte[4];
            rng.GetBytes(data);

            int generatedValue = Math.Abs(BitConverter.ToInt32(data, startIndex: 0));

            int diff = max - min;
            int mod = generatedValue % diff;
            int normalizedNumber = min + mod;

            return normalizedNumber;
        }
    }
}

In this case RNGUtil.Next(-5, 20) would fetch an arbitrary number within range -5..19

A small test:

var list = new LinkedList<int>();

for (int i = 0; i < 10000; i++)
{
    int next = RNGUtil.Next(-5, 20);
    list.AddLast(next);
}

bool firstNumber = true;
foreach (int x in list.Distinct().OrderBy(x => x))
{
    if (!firstNumber) Console.Out.Write(", ");
    Console.Out.Write(x);
    firstNumber = false;
}

Output: -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19

This works perfectly in my tests (an adaption of), so I would like to know why this isn't the correct answer? Why would the other solutions go to such effort of having the while loop with multiple iterations? I'll feel better about this solution if I hear some responses to this. Bottom line is: This solution with modulus is exactly how a random string generation was made elsewhere (by @codesinchaos, see here: https://stackoverflow.com/a/19068116/264031), where every time an index into an array of a set size was needed (thus really: a min-max). So why isn't this sufficient? — Nicholas Petersen, Nov 06 '17 at 18:36
Since the article that @MarkusOlsson linked to is now dead you have to use the time machine: https://web.archive.org/web/20090304194122/http://msdn.microsoft.com:80/en-us/magazine/cc163367.aspx That should explain why the `while` loop is required over the implementation you've given here. The important thing is not just that we get *fast* results, but that each integer has an equal chance of appearing. If you squeeze results with modulo, you immediately create an imbalance in the probability distributions. — Andras Zoltan, Nov 06 '17 at 20:56
@AndrasZoltan the page that opens in place of the article has a link to a doanloadable .chm version of the article. matter of fact, they have a link to probably almost every article. its organized by year/month. just have to scroll down to see it. http://download.microsoft.com/download/3/A/7/3A7FA450-1F33-41F7-9E6D-3AA95B5A6AEA/MSDNMagazineSeptember2007en-us.chm — Heriberto Lugo, Jun 19 '19 at 20:11

score 3 · Answer 3 · answered Jun 09 '11 at 21:28

You can generate many more bytes at once for a very small overhead. The main overhead with the RNGCrptoService is the call itself to fill in the bytes.

While you might throw away unused bytes, I'd give it a shot as I've gotten very good speeds from this and the modulo method (that you aren't using).

int vSize = 20*4;
byte[] vBytes = new byte[vSize];
RNG.GetBytes(vBytes);
int vResult = 0;
int vLocation = 0;
while(vResult < min || vResult > max)
{
    vLocation += 4;
    vLocation = vLocation % vSize;
    if(vLocation == 0)
        RNG.GetBytes(vBytes);
    vResult = BitConverter.ToInt32(vBytes, vLocation);
}

Another thing you can do is the comparison you where thinking of bitwise. However, I'd focus on if the range fits in a byte, a short, an int, or a long. Then you can modulo the int result by the max of that type (giving you the lower order bits).

//We want a short, so we change the location increment and we modulo the result.
int vSize = 20*4;
byte[] vBytes = new byte[vSize];
RNG.GetBytes(vBytes);
int vResult = 0;
int vLocation = 0;
while(vResult < min || vResult > max)
{
    vLocation += 2;
    vLocation = vLocation % vSize;
    if(vLocation == 0)
        RNG.GetBytes(vBytes);
    vResult = BitConverter.ToInt32(vBytes, vLocation) % 32768;
}

Nicholas Petersen · Answer 4 · 2017-11-06T19:02:53.567

The following is an adaptation of @Andrey-WD's answer above, but with the difference that you simply send in a random number you already have generated (a ulong in this case, could be changed to uint). Where this is very efficient is when you need multiple random numbers in a range, you can simply generate an array of such numbers via RNGCryptoServiceProvider (or whatever, even with Random if that fit your needs). This I'm sure will be far more performant when needing to generate multiple random numbers within a range. All you need is the stash of random numbs to feed the function. See my note above on @Andrey-WD's answer, I'm curious why others aren't doing this simpler modulus route which doesn't require multiple iterations. If there really is a required reason for the multiple iterations route, I would be glad to hear it.

    public static int GetRandomNumber(int min, int max, ulong randomNum)
    {
        if (min > max) throw new ArgumentOutOfRangeException(nameof(min));
        if (min == max) return min;

        //var rng = new RNGCryptoServiceProvider();
        //byte[] data = new byte[4];
        //rng.GetBytes(data);
        //int generatedValue = Math.Abs(BitConverter.ToInt32(data, startIndex: 0));

        int diff = max - min;
        int mod = (int)(randomNum % (ulong)diff); // generatedValue % diff;
        int normalizedNumber = min + mod;

        return normalizedNumber;
    }

Here's how you can effectively get a clean array of random numbers. I like how this cleanly encapsulates getting the random numbers, the code that uses this then doesn't have to be cluttered with byte conversion on every iteration to get the int or long with BitConverter. I also assume this gains performance by a singular conversion of bytes to the array type.

    public static ulong[] GetRandomLongArray(int length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        ulong[] arr = new ulong[length];
        if (length > 0) { // if they want 0, why 'throw' a fit, just give it to them ;)
            byte[] rndByteArr = new byte[length * sizeof(ulong)];
            var rnd = new RNGCryptoServiceProvider();
            rnd.GetBytes(rndByteArr);
            Buffer.BlockCopy(rndByteArr, 0, arr, 0, rndByteArr.Length);
        }
        return arr;
    }

Usage:

        ulong[] randomNums = GetRandomLongArray(100);
        for (int i = 0; i < 20; i++) {
            ulong randNum = randomNums[i];
            int val = GetRandomNumber(10, 30, randNum); // get a rand num between 10 - 30
            WriteLine(val);
        }

Since the article that @MarkusOlsson linked to is now dead you have to use the time machine: https://web.archive.org/web/20090304194122/http://msdn.microsoft.com:80/en-us/magazine/cc163367.aspx That should explain why the `while` loop is required over the implementation you've given here. The important thing is not just that we get *fast* results, but that each integer has an equal chance of appearing. If you squeeze results with modulo, you immediately create an imbalance in the probability distributions. — Andras Zoltan, Nov 06 '17 at 20:58

Peter O. · Answer 5 · 2021-06-02T23:43:45.397

Let me talk about random integer generating algorithms that are "optimal" in terms of the number of random bits it uses on average. In the rest of this post, we will assume we have a "true" random generator that can produce unbiased and independent random bits. (Here, a random "byte" will be a block of 8 random bits.)

In 1976, D. E. Knuth and A. C. Yao showed that any algorithm that produces random integers with a given probability, using only random bits, can be represented as a binary tree, where random bits indicate which way to traverse the tree and each leaf (endpoint) corresponds to an outcome. Knuth and Yao showed that any optimal binary tree algorithm for generating integers in [0, n) uniformly, will need at least log2(n) and at most log2(n) + 2 bits on average. (Thus, even an optimal algorithm has a chance of "wasting" bits.) See below for examples of optimal algorithms.

However, any optimal integer generator that is also unbiased will, in general, run forever in the worst case, as also shown by Knuth and Yao. Going back to the binary tree, each one of the n outcomes labels leaves in the binary tree so that each integer in [0, n) can occur with probability 1/n. But if 1/n has a non-terminating binary expansion (which will be the case if n is not a power of 2), this binary tree will necessarily either—

Have an "infinite" depth, or
include "rejection" leaves at the end of the tree,

And in either case, the algorithm will run forever in the worst case, even if it uses very few random bits on average. (On the other hand, when n is a power of 2, the optimal binary tree will have no rejection nodes and require exactly n bits before returning an outcome, so that no bits will be "wasted".) The Fast Dice Roller is an example of an algorithm that uses "rejection" events to ensure it's unbiased; see the comment in the code below.

Thus, in general, a random integer generator can be either unbiased or constant-time (or even neither), but not both. And the binary tree concept shows that there is no way in general to "fix" the worst case of an indefinite running time without introducing bias. For instance, modulo reductions (e.g., rand() % n) are equivalent to a binary tree in which rejection leaves are replaced with labeled outcomes — but since there are more possible outcomes than rejection leaves, only some of the outcomes can take the place of the rejection leaves, introducing bias. A similar kind of binary tree — and a similar kind of bias — results if you stop rejecting after a set number of iterations. (However, this bias may be negligible depending on the application. There are also security aspects to random integer generation, which are too complicated to discuss in this answer.)

Fast Dice Roller Implementation

There are many examples of optimal algorithms in the sense given earlier. One of them is the Fast Dice Roller by J. Lumbroso (2013) (implemented below), and perhaps other examples are the algorithm given as an answer to a similar Stack Overflow question and the algorithm given in the Math Forum in 2004. On the other hand, all the algorithms surveyed by M. O'Neill are not optimal, since they rely on generating blocks of random bits at a time. See also my note on integer generating algorithms.

The following is a JavaScript implementation of the Fast Dice Roller. Note that it uses rejection events and a loop to ensure it's unbiased. nextBit() is a method that produces an independent unbiased random bit (e.g., Math.random()<0.5 ? 1 : 0, which isn't necessarily efficient in terms of random bits ultimately relied on in JavaScript).

function randomInt(minInclusive, maxExclusive) {
 var maxInclusive = (maxExclusive - minInclusive) - 1
 var x = 1
 var y = 0
 while(true) {
    x = x * 2
    var randomBit = nextBit()
    y = y * 2 + randomBit
    if(x > maxInclusive) {
      if (y <= maxInclusive) { return y + minInclusive }
      // Rejection
      x = x - maxInclusive - 1
      y = y - maxInclusive - 1
    }
 }
}

Reducing Bit Waste

Recall that "optimal" integer generators, such as the Fast Dice Roller above, use on average at least log2(n) bits (the lower bound), or come within 2 bits of this lower bound on average. There are various techniques that can be used to bring an algorithm (even a less than optimal one) closer to this theoretical lower bound, including batching and randomness extraction. These are discussed in:

The Fast Dice Roller paper itself, see section 3.1 (batching).
The paper "Random variate generation using only finitely many unbiased, independently and identically distributed random bits" by Devroye and Gravel, section 2.3 (randomness extraction).
The Math Forum page given above (recycling).

The following are examples of "batching": to generate four random digits from 0 through 9, simply generate a random integer in [0, 9999], and break the resulting number into digits. Generating eight random digits instead would involve the interval [0, 99999999].

score 0 · Answer 6 · answered Dec 26 '21 at 17:30

0

public int RandomNumber(int min = 1, int max = int.MaxValue)
{
    using (var rng = new RNGCryptoServiceProvider())
    {
       byte[] buffer = new byte[4];
       rng.GetBytes(buffer);
       return (int)(BitConverter.ToUInt32(buffer, 0) >> 1) % ((max - min) + 1);
    }
}

answered Dec 26 '21 at 17:30

Charles W

1

1

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 26 '21 at 20:14

RNGCryptoServiceProvider - generate number in a range faster and retain distribution?

6 Answers6

Fast Dice Roller Implementation

Reducing Bit Waste

Linked