Friday, July 18, 2014

Implement rand7() using rand5()

Problem

Write a method to generate a random number between 1 and 7, given a method that generates a random number between 1 and 5. (i.e., implement rand7() using rand5()).

Solution

This appear to be one of those probabilistic analysis questions. You should be familiar with the concept of expected value, as it could be extremely helpful in probabilistic analysis.

This solution is based upon Rejection Sampling. The main idea is when you generate a number in the desired range, output that number immediately. If the number is out of the desired range, reject it and re-sample again. As each number in the desired range has the same probability of being chosen, a uniform distribution is produced.
I like this solution as mentioned here. Obviously, we have to run rand5() function at least twice, as there are not enough numbers in the range of 1 to 7. By running rand5() twice, we can get integers from 1 to 25 uniformly. Why?

int rand7()
{
    int vals[5][5] = {
        { 1, 2, 3, 4, 5 },
        { 6, 7, 1, 2, 3 },
        { 4, 5, 6, 7, 1 },
        { 2, 3, 4, 5, 6 },
        { 7, 0, 0, 0, 0 }
    };

    int result = 0;
    while (result == 0)
    {
        int i = rand5();
        int j = rand5();
        result = vals[i-1][j-1];
    }// result is now uniformly random between 1 and 21
    return result;
}

In the above array, if we generate index upto 21 elements i.e. rows 0 to 3 and columns 0 to 5 and for row 4 only column 1.  To put it differently we can also do following:
int rand7()
{
    int vals[5][5] = {
        { 1, 2, 3, 4, 5 },
        { 6, 7, 1, 2, 3 },
        { 4, 5, 6, 7, 1 },
        { 2, 3, 4, 5, 6 },
        { 7, 0, 0, 0, 0 }
    };

    int result = 0;
    while (result > 21)
    {
        i = rand5();
        j = rand5();
        result = j + (i-1)*5;

    }// result is now uniformly random between 1 and 21
    return result % 7 + 1; ;
}

Since 25 is not a multiple of sevens, we have to use rejection sampling. Our desired range is integers from 1 to 21, which we can return the answer immediately. If not (the integer falls between 22 to 25), we reject it and repeat the whole process again.

Now let’s get our hands dirty to calculate the expected value for the number of calls to rand5() function.
E(# calls to rand5) = 2 * (21/25) + 
                      3 * (4/25) * (14/20) + 
                      4 * (4/25)2 * (6/20) + 
                      ...

                      
                    =  2k * (9/49)k-1 * (40/49)
                      k=1

                    = (80/49) / (1 - 9/49)2
                    = 2.21
How does it work? Think of it like this: imagine printing out this double-dimension array on paper, tacking it up to a dart board and randomly throwing darts at it. If you hit a non-zero value, it's a statistically random value between 1 and 7, since there are an equal number of non-zero values to choose from. If you hit a zero, just keep throwing the dart until you hit a non-zero. That's what this code is doing: the i and j indexes randomly select a location on the dart board, and if we don't get a good result, we keep throwing darts.

In worst case, it may run infinitely. 

Also, read this good answer on stackoverflow. 

References

0 comments:

Post a Comment