From - Thu Jan 21 15:21:52 1999
From: George Marsaglia
Subject: Random numbers for C: End, at last?
Date: Thu, 21 Jan 1999 03:08:52 GMT
Organization: Florida State University
My offer of RNG's for C was an invitation to dance;
I did not expect the Tarantella. I hope this post will
stop the music, or at least slow it to a stately dance
for language chauvinists and software police---under
a different heading.
In response to a number of requests for good RNG's in
C, and mindful of the desirability of having a variety
of methods readily available, I offered several. They
were implemented as in-line functions using the #define
feature of C.
Numerous responses have led to improvements; the result
is the listing below, with comments describing the
generators.
I thank all the experts who contributed suggestions, either
directly to me or as part of the numerous threads.
It seems necessary to use a (circular) table in order
to get extremely long periods for some RNG's. Each new
number is some combination of the previous r numbers, kept
in the circular table. The circular table has to keep
at least the last r, but possible more than r, numbers.
For speed, an 8-bit index seems best for accessing
members of the table---at least for Fortran, where an
8-bit integer is readily available via integer*1, and
arithmetic on the index is automatically mod 256
(least-absolute-residue).
Having little experience with C, I got out my little
(but BIG) Kernighan and Ritchie book to see if there
were an 8-bit integer type. I found none, but I did
find char and unsigned char: one byte. Furthemore, K&R
said arithmetic on characters was ok. That, and a study
of the #define examples, led me to propose #define's
for in-line generators LFIB4 and SWB, with monster
periods. But it turned out that char arithmetic jumps
"out of character", other than for simple cases such as
c++ or c+=1. So, for safety, the index arithmetic
below is kept in character by the UC definition.
Another improvement on the original version takes
advantage of the comma operator, which, to my chagrin,
I had not seen in K&R. It is there, but only with an
example of (expression,expression). From the advice of
contributors, I found that the comma operator allows
(expression,...,expression,expression) with the
last expression determining the value. That makes it
much easier to create in-line functions via #define
(see SHR3, LFIB4, SWB and FIB below).
The improved #define's are listed below, with a
function to initialize the table and a main program
that calls each of the in-line functions one million
times and then compares the result to what I got with
a DOS version of gcc. That main program can serve
as a test to see if your system produces the same
results as mine.
_________________________________________
|If you run the program below, your output|
| should be seven lines, each a 0 (zero).|
-----------------------------------------
Some readers of the threads are not much interested
in the philosophical aspects of computer languages,
but want to know: what is the use of this stuff?
Here are simple examples of the use of the in-line
functions: Include the #define's in your program, with
the accompanying static variable declarations, and a
procedure, such as the example, for initializing
the static variable (seeds) and the table.
Then any one of those in-line functions, inserted
in a C expression, will provide a random 32-bit
integer, or a random float if UNI or VNI is used.
For example, KISS&255; would provide a random byte,
while 5.+2.*UNI; would provide a random real (float)
from 5 to 7. Or 1+MWC%10; would provide the
proverbial "take a number from 1 to 10",
(but with not quite, but virtually, equal
probabilities).
More generally, something such as 1+KISS%n; would
provide a practical uniform random choice from 1 to n,
if n is not too big.
A key point is: a wide variety of very fast, high-
quality, easy-to-use RNG's are available by means of
the nine in-line functions below, used individually or
in combination.
The comments after the main test program describe the
generators. These descriptions are much as in the first
post, for those who missed them. Some of the
generators (KISS, MWC, LFIB4) seem to pass all tests of
randomness, particularly the DIEHARD battery of tests,
and combining virtually any two or more of them should
provide fast, reliable, long period generators. (CONG
or FIB alone and CONG+FIB are suspect, but quite useful
in combinations.)
Serious users of random numbers may want to
run their simulations with several different
generators, to see if they get consistent results.
That should be easy to do.
Bonne chance,
George Marsaglia
The C code follows---------------------------------:
#include
#define znew (z=36969*(z&65535)+(z>>16))
#define wnew (w=18000*(w&65535)+(w>>16))
#define MWC ((znew<<16)+wnew )
#define SHR3 (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG (jcong=69069*jcong+1234567)
#define FIB ((b=a+b),(a=b-a))
#define KISS ((MWC^CONG)+SHR3)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB (c++,bro=(x2^7700 and is highly
recommended.
Subtract-with-borrow has the same local behaviour
as lagged Fibonacci using +,-,xor---the borrow
merely provides a much longer period.
SWB fails the birthday spacings test, as do all
lagged Fibonacci and other generators that merely
combine two previous values by means of =,- or xor.
Those failures are for a particular case: m=512
birthdays in a year of n=2^24 days. There are
choices of m and n for which lags >1000 will also
fail the test. A reasonable precaution is to always
combine a 2-lag Fibonacci or SWB generator with
another kind of generator, unless the generator uses
*, for which a very satisfactory sequence of odd
32-bit integers results.
The classical Fibonacci sequence mod 2^32 from FIB
fails several tests. It is not suitable for use by
itself, but is quite suitable for combining with
other generators.
The last half of the bits of CONG are too regular,
and it fails tests for which those bits play a
significant role. CONG+FIB will also have too much
regularity in trailing bits, as each does. But keep
in mind that it is a rare application for which
the trailing bits play a significant role. CONG
is one of the most widely used generators of the
last 30 years, as it was the system generator for
VAX and was incorporated in several popular
software packages, all seemingly without complaint.
Finally, because many simulations call for uniform
random variables in 0