Generating Random in Parallel

The classic requirement for this a large Monte-Carlo simulation which you'd like to be able to run across many threads in production, but at the same time would always like to get exactly the same answer regardless of the number of threads that are used. In other words, the exact numerical answer should be exactly the same across multiple runs and independently of the number of threads.

This can be quite easily achieved using the TRNG library, which implements generators with convenient jump-ahead facility and works well with the OpenMP system.

Instead of a discussion, here is a program which illustrates how this may be done:

// Copyright (C) 2009 Bojan Nikolic <bojan@bnikolic.co.uk>
//
// This program is free software; you can redistribute it and/or
// modify it under the terms of the GNU General Public License in
// version 2 as published by the Free Software Foundation.
//
// This program is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
// General Public License for more details.
//
//
// Illustration of parrelel random number generation using trng
// Comments/equiries welcome at <bojan@bnikolic.co.uk>


#include <iostream>
#include <omp.h>
#include <trng/config.hpp>
#include <trng/yarn2.hpp>
#include <trng/uniform01_dist.hpp>

void example(std::ostream &os)
{
  const size_t tsamples=100000;
  const size_t chunk_size=100;
  const size_t nchunks=tsamples/chunk_size;

  size_t curr_chunk=0;
  size_t i=0;
  double tot=0;

  while (i< tsamples)
  {
#pragma omp parallel
    {

      trng::yarn2 rgen;
      trng::uniform01_dist<> u;
      size_t thischunk;

#pragma omp critical
      {
        thischunk=curr_chunk;
        rgen.jump(curr_chunk*chunk_size);
        ++curr_chunk;
      }

      double ctotal=0;
      for (size_t j=0; j<chunk_size; ++j)
        ctotal+= u(rgen);

#pragma omp critical
      {
        if (thischunk < nchunks)
        {
          tot+=ctotal;
          i+=chunk_size;
        }
        else
        {
          // got to throwaway these samples
        }
      }
    }

  }
  os<<"Average: " << tot/i << "  i: " <<i
    <<std::endl;

}


int main(int argc,
         char *argv[])
{
  example(std::cout);
}

You can try this by typing:

$ OMP_NUM_THREADS=1 ./prog
1000
Average: 0.498779  i: 100000
$ OMP_NUM_THREADS=10 ./prog
1000
Average: 0.498779  i: 100000
$ OMP_NUM_THREADS=100 ./prog
1000
Average: 0.498779  i: 100000

As you can see the result is exactly same with one or 100 threads.I'd be interested to hear of any comments.