I’ll bite! Possibly not, if the 40 different seeds are themselves chosen ‘randomly’ and the PRNG has some horrible dependence between numbers spaced exactly 1 million apart in every sequence…

Sorry, no proof – just a thought experiment. Assume the PRNG is imperfect (all deterministic ones are!) and there are chunky ‘hyperplanes’ in the pseudorandom generated space. If it so happens that there is some complex but important dependence between every millionth number, then using 40 streams that start exactly one million numbers apart is probably a really bad idea. Sure, this is unlikely, but that’s why I would guess you would be a shade more likely to get independent sequences if each sequence starts from a truly random seed (e.g. lavarand), so that the 40 sequences at least start independently.

]]>A possibly stupid question: Is there any a priori reason to think that these 40 different chunks coming from one seed will be any more independent than the results coming from 40 different seeds?

]]>A good pseudorandom number generator produces a sequence of numbers that is statistically indistinguishable from i.i.d. uniform draws on [0,1]. But there is absolutely no guarantee that two sequences starting from different seeds are independent of each other, at least for the generators I know of (not that I have encyclopedic knowledge of this). Assuming otherwise is a common mistake, especially in parallel programming, where one has to maintain a single random number stream across multiple processes.

In this case, the correct way to do this would be to generate a single big file and then split it into 40 different files, for example using the Unix command split (assuming wgsim tries to produce i.i.d. samples).

Of course, for the problem you are trying to address, your solution may be good enough for practical purposes.
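The single-stream idea from this comment can be sketched in a few lines of Python. This is only an illustration: it uses the standard library's `random.Random` rather than wgsim, and the function name `split_single_stream`, the seed, and the chunk size are all made up for the example. It shows that chunks cut from one seeded stream are simply consecutive, non-overlapping pieces of that stream; it does not by itself prove that the pieces are independent.

```python
import random

def split_single_stream(seed, n_chunks, chunk_size):
    """Draw n_chunks * chunk_size uniforms from ONE seeded generator
    and return them as consecutive, non-overlapping chunks."""
    rng = random.Random(seed)  # a single generator, seeded exactly once
    return [[rng.random() for _ in range(chunk_size)]
            for _ in range(n_chunks)]

# Hypothetical sizes: 40 chunks of 5 numbers each.
chunks = split_single_stream(seed=12345, n_chunks=40, chunk_size=5)

# Concatenating the chunks reproduces the original single stream exactly.
reference = random.Random(12345)
flat = [x for chunk in chunks for x in chunk]
assert flat == [reference.random() for _ in range(40 * 5)]
```

Whether this is really better than 40 separately seeded runs depends on the generator, which is the point debated in the comments above.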

]]>“The generation of random numbers is too important to be left to chance.”

– Robert R. Coveyou, Oak Ridge National Laboratory ]]>

To further complicate matters, one has to be extremely cautious about the type of random number generator used. There was the infamous RANDU generator from IBM in the past, but strange problems persist even to this day. For instance, it turns out that the algorithm behind the “standard” intrinsic Fortran generator “random_number” is left to the compiler vendor, so identical source code can give wildly different results under different compilers! To ensure full reproducibility, it’s also crucial to specify the precise pseudo-random number generator algorithm used.
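As a small illustration of why RANDU is infamous, here is a minimal Python sketch. It assumes the usual textbook definition of RANDU, x_{k+1} = 65539 · x_k mod 2^31 with an odd seed; the function name `randu` and the seed value are just choices for the example. Because 65539² ≡ 6 · 65539 − 9 (mod 2^31), every three consecutive outputs satisfy a fixed linear relation, which is what confines successive triples to a small number of planes.

```python
def randu(seed, n):
    """Return n successive RANDU states: x_{k+1} = 65539 * x_k mod 2**31."""
    x = seed
    out = []
    for _ in range(n):
        x = (65539 * x) % 2**31
        out.append(x)
    return out

xs = randu(seed=1, n=1000)

# Count triples that break x[k+2] = 6*x[k+1] - 9*x[k] (mod 2**31).
violations = sum(
    (xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 != 0
    for k in range(len(xs) - 2)
)
print(violations)  # prints 0: the relation holds for every triple
```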

Best wishes,

Bob

]]>https://github.com/reedacartwright/jak/blob/master/src/jak.cpp#L79-L91

]]>*In summary, random number generation should not be done randomly.*

For other variations on this line, see the ostensibly truly random Random.org, which is celebrating its 15th anniversary with an Android app. The most famous version, of course, is:

“Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

– the frequently quoted Johnny von Neumann,

*apparently* in the extremely obscure *Various techniques used in connection with random digits*, Journal of Research of the National Bureau of Standards, Appl. Math. Series (1951), 3, 36-38, which is difficult to find either digitally *or* physically. So much for Google Books revealing the primary literature!