Finding bias in PRNG

Published: Updated:

Biased scaling

First things first. How to find a bias in random number generator? But not that bias that all statistical tests search for in a stream of 32 bit numbers, but the bias after scaling.

This article, Bounded rands is the answer. It also will explain how to avoid biased scaling.

It's based on this paper, Fast Random Integer Generation in an Interval by Daniel Lemire which also has great insights.

Let's check the Isaac RNG implemented in Javascript (original repo, npm package repo). The JavaScript implementation and the original C implementation from the official website looks similar. Also it produces the same sequence when they start from the same seed.

function random(){
  return 0.5 + this.rand() * 2.3283064365386963e-10; // 2^-32
}

Here this.rand() returns a random integer value in range [0, 2^32), then we divide it to convert it into a double (See paragraph FP multiply in the article). 0.5 is a fix for uint representation in JavaScript.

And this method probably in every browser and MDN warns you:

If extremely large bounds are chosen (2^53 or higher), it's possible in extremely rare cases to reach the usually-excluded upper bound.

And let's assume that you are going to generate a random value in a range by this code

// return random value from [a, b) range
const random = function(a, b) {
  return a + Math.floor(isaac.random() * (b - a));
};

Surprisingly, this is very common way to do it in the JavaScript world.

So here is the way to properly use the isaac library implementing the OpenBSD algorithm

:::note The code for OpenBSD's arc4random_uniform (which is also used on OS X and iOS) adopts this strategy. :::

const max_uint = 0x100000000;
const rand_uint = () => {
  const r = isaac.rand();
  return r < 0 ? r + 0x100000000 : r; // fix uint representation in JavaScript
}
const rand_in_range = (range) => {
  let result = rand_uint();
  const working_range = max_uint - (max_uint % range);
  while (result >= working_range) {
    result = rand_uint();
  }
  return result % range;
}
const random = function(a, b) {
  return a + rand_in_range(b - a);
};

Reference

Plot the error

Anatomy of a pseudorandom number generator – visualising Cryptocat’s buggy PRNG

plot "chi-square.300.fixed.txt" w linesp, "chi-square.300.txt" w linesp

Test suites

TestU01 / BigCrush

Dieharder

isaac_test.js

isaac.seed([0]);

function delay(time) {
  return new Promise(resolve => setTimeout(resolve, time));
}

while (true) {
  const r = isaac.rand();
  const v = r < 0 ? r + 0x100000000 : r; // fix uint representation in JavaScript
  process.stdout.write(v + '\n');
  await delay(1);
}

numbers-to-stream.cpp

#include <cstdint>
#include <iostream>
#include <limits>

constexpr auto max_size = std::numeric_limits<std::streamsize>::max();

int main(int argc, char const *argv[])
{
    uint32_t input_number = 0;
    while (true) {
        std::cin >> input_number;
        if (std::cin.eof() || std::cin.bad()) {
            break;
        } else if (std::cin.fail()) {
            std::cin.clear(); // unset failbit
            std::cin.ignore(max_size, '\n'); // skip bad input
        } else {
            char c = (input_number & 0xFF000000) >> 24;
            std::cout << c;
            c = (input_number & 0x00FF0000) >> 16;
            std::cout << c;
            c = (input_number & 0x0000FF00) >> 8;
            std::cout << c;
            c = (input_number & 0x000000FF);
            std::cout << c;
        }
    }
    return 0;
}
node isaac_test.js | ./numbers-to-stream | dieharder -a -g 200

Here's the result with ISAAC seed = 0. One test is marked as WEAK for this seed, but it passes on another seed.

#=============================================================================#
#            dieharder version 3.31.2 Copyright 2003 Robert G. Brown          #
#=============================================================================#
   rng_name    |rands/second|   Seed   |
stdin_input_raw|  6.37e+05  |2933234618|
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.08296096|  PASSED
      diehard_operm5|   0|   1000000|     100|0.90920381|  PASSED
  diehard_rank_32x32|   0|     40000|     100|0.15277816|  PASSED
    diehard_rank_6x8|   0|    100000|     100|0.98528270|  PASSED
   diehard_bitstream|   0|   2097152|     100|0.06634944|  PASSED
        diehard_opso|   0|   2097152|     100|0.22866042|  PASSED
        diehard_oqso|   0|   2097152|     100|0.04849016|  PASSED
         diehard_dna|   0|   2097152|     100|0.64789513|  PASSED
diehard_count_1s_str|   0|    256000|     100|0.95809155|  PASSED
diehard_count_1s_byt|   0|    256000|     100|0.43858498|  PASSED
 diehard_parking_lot|   0|     12000|     100|0.51526020|  PASSED
    diehard_2dsphere|   2|      8000|     100|0.58505752|  PASSED
    diehard_3dsphere|   3|      4000|     100|0.94691424|  PASSED
     diehard_squeeze|   0|    100000|     100|0.66015259|  PASSED
        diehard_sums|   0|       100|     100|0.34256512|  PASSED
        diehard_runs|   0|    100000|     100|0.81029323|  PASSED
        diehard_runs|   0|    100000|     100|0.78271848|  PASSED
       diehard_craps|   0|    200000|     100|0.79607382|  PASSED
       diehard_craps|   0|    200000|     100|0.76686571|  PASSED
 marsaglia_tsang_gcd|   0|  10000000|     100|0.76298025|  PASSED
 marsaglia_tsang_gcd|   0|  10000000|     100|0.99609551|   WEAK
         sts_monobit|   1|    100000|     100|0.64468356|  PASSED

Compare with this homemade bitshift generator

// By Tony Swain. all rights preservered. 
//Nevermind you can have it, but you can't sue me.
// Actually just whipped it up for @Miklosan & @Mathicino
// **** Yet another random # generator. This one is designed for speed.
// compile gcc -o lrand lrand.c
// advantage; all done in inline and in register; zippidy do dah!
// No implied warranty, Based on code by Leo Schwab written a bazillion years ago.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static uint32_t tRandomSeed=0;

static inline uint32_t lrand()
{
    tRandomSeed <<= 1;
    if(tRandomSeed & 0x80000000)
        tRandomSeed ^= 0x80000000;
    else
        tRandomSeed ^= 0x1D872B41;
    return tRandomSeed;
}

int main(int argc, char* argv[])
{
   time_t lt = time(NULL);
   tRandomSeed = (uint32_t)lt; // seed

   union {
      uint32_t value;
      uint8_t stream[4];
   } v;

   while(1)
   {
      v.value = lrand();
      for (int i = 0; i < 4; ++i)
      {
         fprintf(stdout, "%c", v.stream[4-i-1]);
      }
   }
   return 0;
}

I ran "dieharder" tests on this RNG. It's not very convincing.

"dieharder" test results, part 2

ent

https://www.fourmilab.ch/random/

About PRNGs

rand()

Mersenne Twister

Mersenne Twister is very predictable. Here some steps to get it reversed.

ISAAC

The official page.

Review of weak starting points.

Statistical tests

We are working with discrete uniform distribution (not continuous!) https://en.wikipedia.org/wiki/Discrete_uniform_distribution. Kolmogorov-Smirnov test works only with continuous distributions, so we have to use Chi-Square

Chi-Square test

Kolmogorov-Smirnov test

Boost

For quantiles use boost. Include it with cmake in this way. Strip the boost later.

Rate this page