Lua 5.4 random number generator

classic Classic list List threaded Threaded
31 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Lua 5.4 random number generator

Gé Weijers

I've been taking a look at the random number generator in Lua 5.4.0-work1. I would like to suggest a different implementation of the algorithm 'project', for two reasons:
  • it does not fully eliminate the bias for small numbers that are not a power of two
  • more importantly: when used to flip a coin (e.g. random(0,1) == 0) it generates a short sequence of 128 values.
The xorshift128+ RNG produces a sequence whose low-order bits are an LFSR sequence. This is a known problem with most pseudo-random number generators of this type. It would be better to not use the low-order bit in the way 'project' uses them.

The algorithm I use most of the time (in C99). It removes all biases, at the cost of two divisions:


uint64_t project(uint64_t lo, uint64_t hi)
{
    assert(lo <= hi);
    // handle edge cases first:
    switch (hi - lo) {
    case 0:
        return lo; // one-element range
    case UINT64_MAX: // lo == 0 and hi == UINT64_MAX
        return xorshift128plus();
    }
    //  number of possible values
    uint64_t n = max - min + 1u;
    //  scale == 2**64 / n, calculated as (2**64 - n) / n + 1u
    //  'n * scale' is the largest multiple of 'n' <= 2**64
    uint64_t scale = (uint64_t)(-n) / n + 1u;
    for (;;) {
        uint64_t r = xorshift128plus() / scale;
        if (r < n)
            return lo + r;
    }
}


In the case of 'n' being a power of two the high-order bits will be used ('scale' will be a power of two as well)

Similarly: the 'I2d' function should probably shift a 64-bit random value right by 11 bits in stead of masking the value, this would get rid of the dodgy low order bit.

BTW: the author of the xorshift128+ algorithm now recommends replacing it with a slightly different algorithm, which is ever so slightly faster, and has better statistical properties. It has some downsides as well, the two low-order bits both produce an LFSR sequence, it's probably a wash.

It's here:


Thanks for replacing the random number generator, by the way, definitely an improvement.


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan

On Mar 21, 2018, at 8:57 PM, Gé Weijers <[hidden email]> wrote:

I've been taking a look at the random number generator in Lua 5.4.0-work1.
I would like to suggest a different implementation of the algorithm
'project', for two reasons:

 - it does not fully eliminate the bias for small numbers that are not a
 power of two
 - more importantly: when used to flip a coin (e.g. random(0,1) =3D=3D 0)=
it
 generates a short sequence of 128 values.

The xorshift128+ RNG produces a sequence whose low-order bits are an LFSR
sequence. This is a known problem with most pseudo-random number generators
of this type. It would be better to not use the low-order bit in the way
'project' uses them.

The algorithm I use most of the time (in C99). It removes all biases, at
the cost of two divisions:


uint64_t project(uint64_t lo, uint64_t hi)
{
  assert(lo <=3D hi);
  // handle edge cases first:
  switch (hi - lo) {
  case 0:
      return lo; // one-element range
  case UINT64_MAX: // lo =3D=3D 0 and hi =3D=3D UINT64_MAX
      return xorshift128plus();
  }
  //  number of possible values
  uint64_t n =3D max - min + 1u;
  //  scale =3D=3D 2**64 / n, calculated as (2**64 - n) / n + 1u
  //  'n * scale' is the largest multiple of 'n' <=3D 2**64
  uint64_t scale =3D (uint64_t)(-n) / n + 1u;
  for (;;) {
      uint64_t r =3D xorshift128plus() / scale;
      if (r < n)
          return lo + r;
  }
}


In the case of 'n' being a power of two the high-order bits will be used
('scale' will be a power of two as well)

Similarly: the 'I2d' function should probably shift a 64-bit random value
right by 11 bits in stead of masking the value, this would get rid of the
dodgy low order bit.

BTW: the author of the xorshift128+ algorithm now recommends replacing it
with a slightly different algorithm, which is ever so slightly faster, and
has better statistical properties. It has some downsides as well, the two
low-order bits both produce an LFSR sequence, it's probably a wash.

It's here:

http://xoroshiro.di.unimi.it/


I had post the problem earlier: http://lua-users.org/lists/lua-l/2018-03/msg00332.html

Below version avoided 64-bits divisions, at the cost of possibly more generator calls.
My timings suggested this is simpler and faster, even with small n

// return unbiased random number ran, 0 <= ran <= n
// it tried ran = math.random(2 ^ (bits of n)) until ran <= n
// since the range is power of 2, it avoided the expensive 64-bits divisions

static lua_Unsigned project(lua_Unsigned ran, lua_Unsigned n, RanState *state)
{
    int bits = __builtin_clzll(n);
    while ((ran >>= bits) > n) ran = I2UInt( xorshift128plus(state->s) );
    return ran;
}


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan
In reply to this post by Gé Weijers
seeding the random generator can possibly be simpler (without warm-up)

http://xoroshiro.di.unimi.it/, section "Why just sizes 128 and 1024":

For 64 bits, we suggest to use a SplitMix64 generator, but 64 bits of state are 
not enough for any serious purpose. Nonetheless, SplitMix64 is very useful to 
initialize the state of other generators starting from a 64-bit seed, as research 
has shown that initialization must be performed with a generator radically different 
in nature from the one initialized to avoid correlation on similar seeds.

---

I tried a simpler approach, using Knuth MMIX LCG:

#define LCG(x)   (6364136223846793005ULL * (x) + 1442695040888963407ULL)

static void setseed (I *state, lua_Integer n) {
  state[0] = Int2I(n);           // guaranteed not stuck in zero state,
  state[1] = LCG(Int2I(n)); // and state[0], state[1] very different 
}
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Isaac Hier
There's also this: http://www.pcg-random.org

On Mar 22, 2018 9:17 AM, "albertmcchan" <[hidden email]> wrote:
seeding the random generator can possibly be simpler (without warm-up)

http://xoroshiro.di.unimi.it/, section "Why just sizes 128 and 1024":

For 64 bits, we suggest to use a SplitMix64 generator, but 64 bits of state are 
not enough for any serious purpose. Nonetheless, SplitMix64 is very useful to 
initialize the state of other generators starting from a 64-bit seed, as research 
has shown that initialization must be performed with a generator radically different 
in nature from the one initialized to avoid correlation on similar seeds.

---

I tried a simpler approach, using Knuth MMIX LCG:

#define LCG(x)   (6364136223846793005ULL * (x) + 1442695040888963407ULL)

static void setseed (I *state, lua_Integer n) {
  state[0] = Int2I(n);           // guaranteed not stuck in zero state,
  state[1] = LCG(Int2I(n)); // and state[0], state[1] very different 
}
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Roberto Ierusalimschy
In reply to this post by Gé Weijers
Many thanks for your analysis.


>    - it does not fully eliminate the bias for small numbers that are not a
>    power of two

Yes. I guess that, if you keep generating one random number per
nanosecond, maybe you can feel this bias after two months. But we will
remove it.


> The xorshift128+ RNG produces a sequence whose low-order bits are an LFSR
> sequence. This is a known problem with most pseudo-random number generators
> of this type. It would be better to not use the low-order bit in the way
> 'project' uses them.

We will try to use the higher-bits to generate the number. But two
divisions seem too expensive.


> BTW: the author of the xorshift128+ algorithm now recommends replacing it
> with a slightly different algorithm, which is ever so slightly faster, and
> has better statistical properties. It has some downsides as well, the two
> low-order bits both produce an LFSR sequence, it's probably a wash.
>
> It's here:
>
> http://xoroshiro.di.unimi.it/

We know about this, but it uses a rotate operation, which we have to
emulate with 32-bit numbers (for the C-89 version). I am not sure it
is worth the effort. But we can change that.

Anyway, no (P)RNG will be good for everybody (fast, good statistical
properties, really random, unpredictable, repeatable when needed,
jumpable, etc.). If you need anything more specific, you should provide
your own.

(What do you mean by "wash" in your message?)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Roberto Ierusalimschy
In reply to this post by Gé Weijers
>    - more importantly: when used to flip a coin (e.g. random(0,1) == 0) it
>    generates a short sequence of 128 values.

By "a short sequence of 128 values" you mean it repeats itself after
128 values?

I generated a sequence of 1000 flips, then another sequence of 128 flips,
and searched for the second inside the first. The maximum match I found
was a subsequence of 13 values.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Gé Weijers
In reply to this post by Roberto Ierusalimschy


On Thu, Mar 22, 2018 at 8:47 AM, Roberto Ierusalimschy <[hidden email]> wrote:
Many thanks for your analysis.


>    - it does not fully eliminate the bias for small numbers that are not a
>    power of two

Yes. I guess that, if you keep generating one random number per
nanosecond, maybe you can feel this bias after two months. But we will
remove it.


Some tests are really sensitive to small biases, as I have found out... It won't take two months at 10**9 random numbers per second if the tests are any good.



> The xorshift128+ RNG produces a sequence whose low-order bits are an LFSR
> sequence. This is a known problem with most pseudo-random number generators
> of this type. It would be better to not use the low-order bit in the way
> 'project' uses them.

We will try to use the higher-bits to generate the number. But two
divisions seem too expensive.

You're currently using one modulo operation in the simple case, which on my desktop processor (older Core i5) takes longer than a division.
The extra overhead in the common case is one division or slightly less. In the worst case the code ends up doing 0 divisions or modulo operations, but 2 draws from the
random generator on average.

The big issue I had is the short sequence you get when using "random(0, 1)" for coin flips. That's beyond a "slight bias."

 


> BTW: the author of the xorshift128+ algorithm now recommends replacing it
> with a slightly different algorithm, which is ever so slightly faster, and
> has better statistical properties. It has some downsides as well, the two
> low-order bits both produce an LFSR sequence, it's probably a wash.
>
> It's here:
>
> http://xoroshiro.di.unimi.it/

We know about this, but it uses a rotate operation, which we have to
emulate with 32-bit numbers (for the C-89 version). I am not sure it
is worth the effort. But we can change that.

Anyway, no (P)RNG will be good for everybody (fast, good statistical
properties, really random, unpredictable, repeatable when needed,
jumpable, etc.). If you need anything more specific, you should provide
your own.

(What do you mean by "wash" in your message?)

I should know better than using colloquialisms like this on a mailing list with an international audience, my apologies...

Definition of "it's a wash" according to Merriam-Webster.com:

—used to say that something is equal and that one side does not have an advantage You won the first game and I won the second, so it's a wash

Thanks,



Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Roberto Ierusalimschy
> You're currently using one modulo operation in the simple case, which on my
> desktop processor (older Core i5) takes longer than a division.

I guess most processors (including i5) use the same instruction to
compute a modulo and a division, so they have exactly the same cost.


> The extra overhead in the common case is one division or slightly less. In
> the worst case the code ends up doing 0 divisions or modulo operations, but
> 2 draws from the
> random generator on average.

I removed that modulo; it is really more expensive than the general case.
(With it, I removed too its corresponding bias.)


> The big issue I had is the short sequence you get when using "random(0, 1)"
> for coin flips. That's beyond a "slight bias."

Sure it is. But I could not find this sequence (see my other message).

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Gé Weijers
In reply to this post by Roberto Ierusalimschy
On Thu, Mar 22, 2018 at 11:46 AM, Roberto Ierusalimschy <[hidden email]> wrote:
>    - more importantly: when used to flip a coin (e.g. random(0,1) == 0) it
>    generates a short sequence of 128 values.

By "a short sequence of 128 values" you mean it repeats itself after
128 values?


I should not write these things when I'm busy and/or especially tired. You're right, I managed to confuse myself.

It's a linear sequence, but not one so short, it's length is 2**128 - 1. The values generated are an exclusive-or of a subset of the last 128 bits generated.

Never mind.

 



--
--

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Gé Weijers
In reply to this post by Roberto Ierusalimschy


On Thu, Mar 22, 2018 at 11:57 AM, Roberto Ierusalimschy <[hidden email]> wrote:
> You're currently using one modulo operation in the simple case, which on my
> desktop processor (older Core i5) takes longer than a division.

I guess most processors (including i5) use the same instruction to
compute a modulo and a division, so they have exactly the same cost.



That test was on a 32-bit Linux platform, where you have a 64/32 divide instruction only. For 64-bit x86 machines it's the same instruction.

Sorry about getting confused about the sequence. Regular linear congruential generators produce an even-odd-even-odd pattern, so a 128 bit sequence sounded reasonable. xorshift128 and its variants are not nearly that bad, but there still a relatively simple linear relationship there. If you're using it for running a game it probably does not matter, if you're running a casino it definitely does.



Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Roberto Ierusalimschy
> Sorry about getting confused about the sequence. Regular linear
> congruential generators produce an even-odd-even-odd pattern, so a 128 bit
> sequence sounded reasonable. xorshift128 and its variants are not nearly
> that bad, but there still a relatively simple linear relationship there. If
> you're using it for running a game it probably does not matter, if you're
> running a casino it definitely does.

The '+' in the xorshift128+ aims to remove the linearity of the
algorithm, but it does not affect the lowest bit (it has no carry,
so the '+' performs an xor). But I must confess I have no idea why
linearity is bad.

Our goal are things like games. If you are running a real cassino, you
probably should bring your own PRNG.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan
On Mar 23, 2018, at 9:11 AM, Roberto Ierusalimschy <[hidden email]> wrote:

> The '+' in the xorshift128+ aims to remove the linearity of the
> algorithm, but it does not affect the lowest bit (it has no carry,
> so the '+' performs an xor). But I must confess I have no idea why
> linearity is bad.
>

What is linearity ?

Does linearity affect randomness (make it predictable) ?





Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Gé Weijers


On Fri, Mar 23, 2018 at 9:42 AM, albertmcchan <[hidden email]> wrote:

What is linearity ?

Does linearity affect randomness (make it predictable) ?




"Linear" here means you can describe it easily using linear algebra. A single step of this random number generator can be described as the application of a 128 x 128 boolean matrix on the state. It's a non-singular matrix, so you can invert it, and run the RNG backwards, and do all kinds of nifty tricks with it like calculating any value in the sequence quickly, without calculating any of the preceding values.

It is fairly simple to predict the next output value if you know the last few output values of this generator or generators like it. So you don't want to use a generator like this for casino games or anything else that has security requirements, unless you use measures to protect yourself. If you don't:


I somehow misread the specs of this generator and interpreted it to mean that the period of the low-order bit was 128, not 2**128-1. 128 would be truly bad. 2**128-1 is a long period, of course, but if you know the last 128 low-order bits you can predict the next one. This can be a problem in some applications. If you only use one bit of each output value you may not want to use the bit with the worst statistical properties.




Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan
On Mar 23, 2018, at 5:09 PM, Gé Weijers <[hidden email]> wrote:

I somehow misread the specs of this generator and interpreted it to mean that the period of the low-order bit was 128, not 2**128-1. 128 would be truly bad.

Period 2**128 - 1: Is it due to all zero state will never happen ?

2**128-1 is a long period, of course, but if you know the last 128 low-order bits you can predict the next one. This can be a problem in some applications.

You mean just two math.random(0) (128 bits) can predict the next one ?

If you only use one bit of each output value you may not want to use the bit with the worst statistical properties

Reading http://xoroshiro.di.unimi.it, it seems linearity is not that bad.
BUT, why Vigna say never use last 2 bits to build double ?  I am confused.

The linear dependencies causing the failure of the binary-rank test can be reduced by applying a nonlinear operation on the output: for example, multiplying by a constant, or adding as 64-bit integers (thus, in the ring Z/264Z) two pieces of the state space. This is what happens in the generators described here. With more operations it could be possible to completely eliminate the linear artifacts, but I consider this a waste of time: as we've seen, some of the most used generators around have linear dependencies in every bit, and this is not considered a problem. Note also that when generating floating point values you never use the two lowest bits.


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Coda Highland
On Fri, Mar 23, 2018 at 5:24 PM, Albert Chan <[hidden email]> wrote:
> You mean just two math.random(0) (128 bits) can predict the next one ?

No, if you see the least significant bit of 128 consecutive calls to
math.random, then you can predict the least significant bit of every
call from there on out.

> Reading http://xoroshiro.di.unimi.it, it seems linearity is not that bad.
> BUT, why Vigna say never use last 2 bits to build double ?  I am confused.

The lower-order bits are less random than the higher-order bits, and
this is true of many RNG designs. If you want a coin flip, pick a bit
out of the middle instead of taking the lowest bit. If you want to
construct a floating-point value, the lowest two bits have enough of a
correlation that you get especially bad results.

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Gé Weijers
In reply to this post by Albert Chan


On Fri, Mar 23, 2018 at 3:24 PM, Albert Chan <[hidden email]> wrote:
Period 2**128 - 1: Is it due to all zero state will never happen ?

That's correct, a maximal period linear feedback shift register will go through all possible states except the all-zero state.
 

--
--

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan
In reply to this post by Coda Highland
On Mar 23, 2018, at 6:31 PM, Coda Highland <[hidden email]> wrote:

> On Fri, Mar 23, 2018 at 5:24 PM, Albert Chan <[hidden email]> wrote:
>> You mean just two math.random(0) (128 bits) can predict the next one ?
>
> No, if you see the least significant bit of 128 consecutive calls to
> math.random, then you can predict the least significant bit of every
> call from there on out.
>

I misunderstood what Ge were saying, but I somehow got it right !

--> two (at most 3) math.random(0) (128 bits) can predict the next one

For math.random(), 3 calls can definitely "crack" xorshift128+

https://blog.securityevaluators.com/hacking-the-javascript-lottery-80cc437e3b7f

Too bad it is not a real lottery :-(


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Russell Haley
In reply to this post by Gé Weijers


On Fri, Mar 23, 2018 at 2:09 PM, Gé Weijers <[hidden email]> wrote:


On Fri, Mar 23, 2018 at 9:42 AM, albertmcchan <[hidden email]> wrote:

What is linearity ?

Does linearity affect randomness (make it predictable) ?




"Linear" here means you can describe it easily using linear algebra. A single step of this random number generator can be described as the application of a 128 x 128 boolean matrix on the state. It's a non-singular matrix, so you can invert it, and run the RNG backwards, and do all kinds of nifty tricks with it like calculating any value in the sequence quickly, without calculating any of the preceding values.

It is fairly simple to predict the next output value if you know the last few output values of this generator or generators like it. So you don't want to use a generator like this for casino games or anything else that has security requirements, unless you use measures to protect yourself. If you don't:


It's funny that using math to create institutionalized profit of 7.xx percent is a valid, uphold-able law, but using the same math for individual profit is considered cheating and punishable as fraud.

Can someone please stop the planet. I'd like to get off now.

Russ

I somehow misread the specs of this generator and interpreted it to mean that the period of the low-order bit was 128, not 2**128-1. 128 would be truly bad. 2**128-1 is a long period, of course, but if you know the last 128 low-order bits you can predict the next one. This can be a problem in some applications. If you only use one bit of each output value you may not want to use the bit with the worst statistical properties.





Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Albert Chan
In reply to this post by Coda Highland

> On Mar 23, 2018, at 6:31 PM, Coda Highland <[hidden email]> wrote:
>
> The lower-order bits are less random than the higher-order bits, and
> this is true of many RNG designs. If you want a coin flip, pick a bit
> out of the middle instead of taking the lowest bit. If you want to
> construct a floating-point value, the lowest two bits have enough of a
> correlation that you get especially bad results.
>
> /s/ Adam
>

Do you mean top bits is not as good as the middle bits ?




Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4 random number generator

Coda Highland
On Sat, Mar 24, 2018 at 2:25 PM, Albert Chan <[hidden email]> wrote:

>
>> On Mar 23, 2018, at 6:31 PM, Coda Highland <[hidden email]> wrote:
>>
>> The lower-order bits are less random than the higher-order bits, and
>> this is true of many RNG designs. If you want a coin flip, pick a bit
>> out of the middle instead of taking the lowest bit. If you want to
>> construct a floating-point value, the lowest two bits have enough of a
>> correlation that you get especially bad results.
>>
>> /s/ Adam
>>
>
> Do you mean top bits is not as good as the middle bits ?

In some RNGs the bottom bits are weak. In some RNGs the top bits are
weak. The middle is usually safe, so if you don't know the behavior of
the RNG in advance then you should either pull bits out of the middle
or combine bits from both ends. (Obviously if you DO know the RNG's
behavior, pick out what's good.)

/s/ Adam

12