The state of the art for PRNGs (pseudo-random number generators) has advanced considerably since ju.Random was written. Surely at some point we will refresh our APIs that produce random numbers. In fact, we have added SplittableRandom, but I think the state of the art is far enough along to consider another refresh. (When we get advanced numerics via Valhalla, I think we will probably want generic algorithms on 128-bit ints and vectorized states as well; that's just a guess.)

I bring this up not for any immediate purpose, but because every so often I google around for information about PRNGs, and I have noticed that a particular reference (since at least 2018) keeps coming to the top of the pile of interesting stuff. I'd like to share it with you all, and talk about making use of it. (FWIW, last weekend I was doing some experiments with perfect hash functions, and I wanted to roll my own PRNG to drive the search for them. A few months ago my problem was test vector generation, and again I rolled my own PRNG after looking at ju.*Random.)

The reference I'd like to give here is to Dr. Melissa O'Neill's website and articles:

- https://www.pcg-random.org
- https://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf

Representative code is posted here:

- https://github.com/imneme/pcg-c-basic

O'Neill's work in some ways parallels the earlier work of people like Bob Jenkins (whom we have used in the past), but I find it easier to use; sometimes a second mover ends up clearing up what a first mover left obscure. She makes a convincing case that many existing PRNGs are weaker than they need to be, and (quite astonishingly) she recommends a family of PRNGs that is in most ways better than the alternatives. Crucially, the family is almost laughably easy to understand and implement. (Who knew?)

The features that impress me about the PCG algorithms are:

1. Simple, easy-to-understand numeric state. (Linear congruential.)

2. Simple, easy-to-understand "output function" (derivation of output values from state). (A sketch of a representative member appears below.)

3. Easily scalable to narrow states and to wide states (128 bits), and also easy to vectorize (one root state, VLEN-at-a-time results). Works great out of the box with 64-bit states.

4. Good supporting theory explaining the statistical quality of the output.

5. Salt-able with a user-supplied parameter, yielding a family of 2^(N-1) distinct, independent streams of random numbers. (That is, you get 2^63 independent PRNGs for free with the first 64-bit one.)

6. Almost trivially easy to fast-forward and rewind (in log time), because of the undisturbed LC state. This could buy us a variation on splittable random, with full reproducibility of the sequence, independent of split points! (A jump-ahead sketch also appears below.)

7. Difficult (though not cryptographically so) to infer state from output values: the output is hard to predict if you don't know the engine parameters inside.

8. Small in memory and cycles. The state is just a single 2N-bit scalar (double the width of the N-bit output); if you want salt, it's extra space for a second scalar of whatever size you like. Getting a value requires a handful of simple 2N-bit operations. Two representative variants require MUL+ADD for the state evolution, and then the output function uses RSHIFT+XOR+RSHIFT followed by a final RSHIFT or ROT. That last shift or rotate is state-dependent, making the output hard to predict.

The main weakness is probably the 2N-bit multiply used to evolve the state, which would seem to require four single-precision (N-bit) multiplies to obtain an N-bit sample value. But in fact only two N-bit multiplies are required. (More specifically, a high-half-of-multiply instruction is useful; the third sketch below shows how the count works out.)
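For concreteness, here is a minimal Java transcription of the family member that pcg-c-basic calls pcg32 (XSH-RR: 64 bits of LCG state, 32-bit output). The class shape and seeding protocol follow the C code linked above; treat this as an illustrative sketch, not a vetted implementation.

```java
/** Sketch of PCG-XSH-RR 64/32, transcribed from pcg-c-basic's pcg32. */
public final class Pcg32 {
    private long state;      // the linear congruential state (2N = 64 bits)
    private final long inc;  // stream selector ("salt"); must be odd

    public Pcg32(long seed, long streamId) {
        this.inc = (streamId << 1) | 1;  // forced odd: 2^63 distinct streams
        this.state = 0;
        nextInt();                       // seeding protocol from pcg32_srandom_r
        this.state += seed;
        nextInt();
    }

    /** One step: MUL+ADD state evolution, then RSHIFT+XOR+RSHIFT+ROT output. */
    public int nextInt() {
        long oldState = state;
        state = oldState * 6364136223846793005L + inc;           // LCG step
        int xorShifted = (int) (((oldState >>> 18) ^ oldState) >>> 27);
        int rot = (int) (oldState >>> 59);                       // top 5 bits pick the rotation
        return Integer.rotateRight(xorShifted, rot);             // state-dependent rotate
    }
}
```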
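And here is the jump-ahead of item 6, sketched after the LCG-advance routine in O'Neill's fuller C library (the underlying trick is standard: repeatedly square the affine map, folding in one bit of the step count per iteration). The delta is treated as unsigned, so rewinding by k steps is just advancing by 2^64 - k.

```java
/** Sketch: advance an LCG's state by delta steps in O(log delta) time. */
static long advanceLcg64(long state, long delta, long mult, long inc) {
    long accMult = 1L;   // accumulated multiplier of the composed affine map
    long accInc = 0L;    // accumulated increment of the composed affine map
    while (delta != 0) {
        if ((delta & 1) != 0) {          // fold this power of the map into the result
            accMult *= mult;
            accInc = accInc * mult + inc;
        }
        inc = (mult + 1) * inc;          // compose the current map with itself
        mult *= mult;
        delta >>>= 1;                    // delta is unsigned; -k rewinds
    }
    return accMult * state + accInc;
}
```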
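On the two-multiply count: one way it works out is when the multiplier fits in a single machine word (O'Neill has discussed such "cheap multiplier" variants for the 128-bit generators). Then a 128-bit state step needs only two hardware multiplies, because a single MUL yields both halves of the low-word product; Java spells the high half Math.multiplyHigh (JDK 9+). A hypothetical sketch, with an illustrative constant and made-up names:

```java
// Sketch: one LCG step for a 128-bit state held in two longs, assuming a
// multiplier that fits in 64 bits.
static long stateHi, stateLo;                  // 128-bit state, hi:lo
static final long MULT = 0xda942042e4dd58b5L;  // illustrative 64-bit multiplier
static final long INC_HI = 0L, INC_LO = 1L;    // any odd 128-bit increment works

static void step128() {
    long prodLo = stateLo * MULT;                    // MUL #1, low half
    long prodHi = Math.multiplyHigh(stateLo, MULT)   // MUL #1, high half (same hardware MUL)
                + ((stateLo >> 63) & MULT)           // fix up the signed high half
                + ((MULT >> 63) & stateLo)           //   to the unsigned high half
                + stateHi * MULT;                    // MUL #2, low half only (top bits fall off)
    long newLo = prodLo + INC_LO;
    long carry = Long.compareUnsigned(newLo, prodLo) < 0 ? 1L : 0L;  // carry out of the low word
    stateHi = prodHi + INC_HI + carry;
    stateLo = newLo;
}
```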
By contrast, java.util.Random squeezes 48 bits from a 64-bit multiplication; pcg32 gets only 32 (very solid) bits from the same multiplication. Given that multiplies are cheap these days, I think this is a relatively tolerable weakness.

The algorithm which comes closest to the above set of features would be a simple state counter whose values are transformed (as an "output function") by a crypto or crypto-like transform. But that's probably slower than the carefully balanced output function O'Neill chooses, which involves a few shifts and XORs.

I should re-emphasize that PCG is a family. The degrees of freedom that I understand are:

- state size (usually 128 or 64 bits)
- salt size (usually 0 or 64 bits)
- complexity of the output function (shift, rotate, or multiply-by-small-constant for the special mix step)
- output width: full (usually 32 or 64 bits), masked, or arithmetically bounded

Vectorized variations are not considered by O'Neill AFAIK, but they seem useful if we want to speed things up. They would produce values identical to those of the scalar algorithms, but more at a time, either as a wide bank of independent PRNGs or as a single PRNG that produces several samples per cycle. The most exotic sub-operation there would be the non-constant lanewise shift or rotate, which Intel and ARM both have.

It occurs to me that we could generate efficient implementations of all these variations if we used method handle combinators to build them and factor out the degrees of freedom. Classes and OOP don't provide much leverage here; reified generics (and/or type classes) will help. (A toy sketch appears at the end of this note.)

Update 1: Brian Burkhalter and Joe Darcy immediately pointed out something I missed: JEP 356 (Enhanced Pseudo-Random Number Generators) by Guy Steele and Jim Laskey. Hopefully that can be extended with members of the PCG family, if that is worthwhile, as I think is likely.

I will note that O'Neill engages with Vigna's work on her blog, for example in [xoshiro256]. I think her work probably emerges successfully from the appropriate comparisons.

[xoshiro256]: https://www.pcg-random.org/posts/implausible-output-from-xoshiro256.html
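To make the method-handle idea concrete, here is a toy (class and method names are hypothetical): the pcg32 output function assembled from small handles, so that trying a different mix step means swapping one handle rather than writing a new class. A feasibility doodle, not a design.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class PcgCombinator {
    // Small building blocks, one per degree of freedom in the output function.
    static int xsh(long s) { return (int) (((s >>> 18) ^ s) >>> 27); } // xorshift step
    static int rot(long s) { return (int) (s >>> 59); }               // rotation count
    static int rr(int x, int k) { return Integer.rotateRight(x, k); } // final mix step

    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType intFromLong = MethodType.methodType(int.class, long.class);
        MethodType intFromInts = MethodType.methodType(int.class, int.class, int.class);
        MethodHandle xsh = lookup.findStatic(PcgCombinator.class, "xsh", intFromLong);
        MethodHandle rot = lookup.findStatic(PcgCombinator.class, "rot", intFromLong);
        MethodHandle rr  = lookup.findStatic(PcgCombinator.class, "rr",  intFromInts);

        // out(state) = rr(xsh(state), rot(state)); swapping rr for a shift
        // handle would give an XSH-RS-style member instead of XSH-RR.
        MethodHandle out = MethodHandles.filterArguments(rr, 0, xsh);   // (long,int)int
        out = MethodHandles.filterArguments(out, 1, rot);               // (long,long)int
        out = MethodHandles.permuteArguments(out, intFromLong, 0, 0);   // (long)int, state used twice

        System.out.println((int) out.invokeExact(0x0123456789abcdefL));
    }
}
```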