what is the role of the `seed` field in the Lua global state?

Norman Ramsey
I've been poking at the implementation of Lua, and I've noticed that
in Lua 5.2, Lua's global state acquired a `seed` field.  This field
is initialized at startup and never changes, and as far as I can
tell, it affects only the hash values of strings.

What does this seed accomplish?  What problem does it solve?


Norman


Re: what is the role of the `seed` field in the Lua global state?

Doug Currie
On Tue, Mar 10, 2020 at 6:24 PM Norman Ramsey <[hidden email]> wrote:
> [...]
> What does this seed accomplish?  What problem does it solve?

I suspect it is related to this:
https://stackoverflow.com/questions/9241230/what-is-murmurhash3-seed-parameter/9241429#9241429

> Attackers may use the properties of a hash function to construct a denial-of-service attack. They could do this by providing strings to your hash function that all hash to the same value, destroying the performance of your hash table. But if you use a different seed for each run of your program, the set of strings the attackers must use changes.
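
To make the quoted argument concrete, here is a minimal Python sketch of the idea, not Lua's actual code. It assumes a keyed BLAKE2 hash as a stand-in for the string hash and a hypothetical 64-bucket table; `seeded_bucket`, `find_colliding_keys`, and both seed values are names invented for the illustration. Keys brute-forced to collide under one seed should scatter under another.

```python
import hashlib
from itertools import count

NUM_BUCKETS = 64  # hypothetical table size, chosen only for the illustration


def seeded_bucket(key, seed):
    """Bucket index from a keyed hash; BLAKE2 is a stand-in, not Lua's hash."""
    digest = hashlib.blake2b(key, digest_size=8, key=seed).digest()
    return int.from_bytes(digest, "little") % NUM_BUCKETS


def find_colliding_keys(seed, wanted, target_bucket=0):
    """Offline attack: brute-force keys that share one bucket for a KNOWN seed."""
    found = []
    for i in count():
        key = b"key%d" % i
        if seeded_bucket(key, seed) == target_bucket:
            found.append(key)
            if len(found) == wanted:
                return found


# The attacker prepares the input against the seed of one particular run...
attack_keys = find_colliding_keys(b"seed-of-run-1", wanted=32)
buckets_run1 = {seeded_bucket(k, b"seed-of-run-1") for k in attack_keys}
# ...but the next run uses a different seed, so the same keys spread out.
buckets_run2 = {seeded_bucket(k, b"seed-of-run-2") for k in attack_keys}
print(len(buckets_run1), len(buckets_run2))
```

The first count is 1 by construction; the second should be much larger, which is the whole point of changing the seed between runs.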


 

Re: what is the role of the `seed` field in the Lua global state?

Philippe Verdy-2
In reply to this post by Norman Ramsey
The fact that this field is initialized once and then never changes is
still a security issue: programs can run for very long periods of time.
The field could, however, be reset from time to time to generate new sets
of hashes, and its value could be read from a global store that is fed
asynchronously (at randomized intervals, or on a manual administrative
order) from entropy sources. Those sources would be
implementation-dependent and unknown to attackers, because they would
not be exposed directly.

Hashes of the same value do not have to be equal everywhere; they only
have to be consistent within the set of data hashed into the same index.
Nothing prevents reindexing a table at any time with a new hashing
function, or with the same hashing function and a new seed generated
from the entropy source. Reindexing a table by rehashing its contents
is sometimes desirable as a maintenance tool anyway, notably when the
cardinality of the set has changed significantly, when indexes over
stable data need to be compacted to save memory, or when they need to
grow to avoid too-frequent reallocations with unstable data.

On Tue, Mar 10, 2020 at 23:24, Norman Ramsey <[hidden email]> wrote:

>
> I've been poking at the implementation of Lua, and I've noticed that
> in Lua 5.2, Lua's global state acquired a `seed` field.  This field
> is initialized at startup and never changes, and as far as I can
> tell, it affects only the hash values of strings.
>
> What does this seed accomplish?  What problem does it solve?
>
>
> Norman
>


Re: what is the role of the `seed` field in the Lua global state?

dyngeccetor8
On 11/03/2020 04.16, Philippe Verdy wrote:
> Nothing
> prevents also reindexing a table at any time using a new hashing
> function (or the same hashing function but with a new seed generated
> from the entropy source;

Moreover, the Lua runtime could theoretically do it itself,
in a manner similar to garbage collection: say, when the number of
detected collisions passes some threshold.
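
A rough Python sketch of that idea, under stated assumptions: this is a toy chained table, not how Lua's tables work, and `ReseedingTable`, `max_chain`, and the keyed BLAKE2 hash are all invented for the illustration. When one chain grows past the threshold, the table draws a fresh seed and redistributes its entries.

```python
import hashlib
import os


class ReseedingTable:
    """Toy chained hash table that reseeds itself when one chain gets too long."""

    def __init__(self, num_buckets=64, max_chain=8, seed=None):
        self.num_buckets = num_buckets
        self.max_chain = max_chain
        self.seed = seed if seed is not None else os.urandom(16)
        self.buckets = [[] for _ in range(num_buckets)]
        self.reseed_count = 0

    def _index(self, key, seed):
        # Keyed BLAKE2 stands in for a seeded string hash.
        digest = hashlib.blake2b(key, digest_size=8, key=seed).digest()
        return int.from_bytes(digest, "little") % self.num_buckets

    def _rehash(self):
        """Draw a new random seed and redistribute every entry under it."""
        self.seed = os.urandom(16)
        entries = [pair for chain in self.buckets for pair in chain]
        self.buckets = [[] for _ in range(self.num_buckets)]
        for key, value in entries:
            self.buckets[self._index(key, self.seed)].append((key, value))
        self.reseed_count += 1

    def put(self, key, value):
        chain = self.buckets[self._index(key, self.seed)]
        for i, (existing, _) in enumerate(chain):
            if existing == key:
                chain[i] = (key, value)  # overwrite an existing entry
                return
        chain.append((key, value))
        if len(chain) > self.max_chain:  # collision threshold passed
            self._rehash()

    def get(self, key):
        for existing, value in self.buckets[self._index(key, self.seed)]:
            if existing == key:
                return value
        raise KeyError(key)
```

The trade-off is that a rehash costs time proportional to the table size, so the threshold has to be high enough that ordinary workloads never trigger it.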

-- Martin


Re: what is the role of the `seed` field in the Lua global state?

Roberto Ierusalimschy
In reply to this post by Doug Currie
> > What does this seed accomplish?  What problem does it solve?
> >
>
> I suspect it is related to this:
> https://stackoverflow.com/questions/9241230/what-is-murmurhash3-seed-parameter/9241429#9241429
>
> attackers may use the properties of a hash function to construct a denial
> > of service attack. They could do this by providing strings to your hash
> > function that all hash to the same value destroying the performance of your
> > hash table. But if you use a different seed for each run of your program,
> > the set of strings the attackers must use changes.

Exactly.

-- Roberto


Re: what is the role of the `seed` field in the Lua global state?

Norman Ramsey
 > > > What does this seed accomplish?  What problem does it solve?
 > >
 > > attackers may use the properties of a hash function to construct a denial
 > > of service attack ... by providing strings to your hash
 > > function that all hash to the same value destroying the performance of
 > > your hash table.
 >
 > Exactly.

Thanks!  

One followup question: is the hash algorithm not vulnerable to exactly
such an attack via strings of lengths between 32 and 40?
It appears that every such string is "short", and therefore interned,
but that the `step` size is 2, so only every other byte contributes to
the hash.  So important strings like "__newindex" are protected
(because the seed makes it impossible to predict what they hash to),
but the hash table in general is not.  Correct?
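
The concern can be checked with a small Python model. This is a sketch based on my recollection of `luaS_hash` in Lua 5.3's lstring.c (hash limit of 5, 32-bit unsigned arithmetic), so treat the constants and the loop shape as assumptions rather than a verified copy; `lua53_hash` and `unsampled_variants` are names made up here.

```python
HASH_LIMIT = 5        # assumed value of LUAI_HASHLIMIT in Lua 5.3
MASK32 = 0xFFFFFFFF   # emulate wraparound of a 32-bit unsigned int


def lua53_hash(s, seed):
    """Model of the seeded string hash: samples every `step`-th byte from the end."""
    length = len(s)
    h = (seed ^ length) & MASK32
    step = (length >> HASH_LIMIT) + 1   # 1 below 32 bytes, 2 for 32..63 bytes
    while length >= step:
        h ^= ((h << 5) + (h >> 2) + s[length - 1]) & MASK32
        length -= step
    return h


def unsampled_variants(base):
    """Strings that differ from `base` only in bytes the hash never reads."""
    step = (len(base) >> HASH_LIMIT) + 1
    sampled = set(range(len(base) - 1, step - 2, -step))  # indices the loop reads
    variants = []
    for i in range(len(base)):
        if i not in sampled:
            changed = bytearray(base)
            changed[i] ^= 0x01          # flip one bit in an ignored byte
            variants.append(bytes(changed))
    return variants


base = b"x" * 36                        # "short" in 5.3 terms, yet step is 2
for seed in (0, 12345, 0xDEADBEEF):
    hashes = {lua53_hash(v, seed) for v in unsampled_variants(base)}
    print(seed, hashes == {lua53_hash(base, seed)})
```

If the model is faithful, every variant collides with the base string whatever the seed is, because the seed only changes the value they all share, not the fact that they share it.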



Norman


Re: what is the role of the `seed` field in the Lua global state?

Coda Highland


On Wed, Mar 11, 2020 at 10:16 AM Norman Ramsey <[hidden email]> wrote:
> > > > What does this seed accomplish?  What problem does it solve?
> > >
> > > attackers may use the properties of a hash function to construct a denial
> > > of service attack ... by providing strings to your hash
> > > function that all hash to the same value destroying the performance of
> > > your hash table.
> >
> > Exactly.
>
> Thanks!
>
> One followup question: is the hash algorithm not vulnerable to exactly
> such an attack via strings of lengths between 32 and 40?
> It appears that every such string is "short", and therefore interned,
> but that the `step` size is 2, so only every other byte contributes to
> the hash.  So important strings like "__newindex" are protected
> (because the seed makes it impossible to predict what they hash to),
> but the hash table in general is not.  Correct?
>
> Norman


The randomized seed is sufficient to protect against attacking a Lua-based program by providing a fixed malicious input that reliably works across runs. It's not exactly a "vulnerability" to be able to deny service by flooding it with more data than it can reasonably process -- the randomization prevents a more efficient, offline attack.

/s/ adam

Re: what is the role of the `seed` field in the Lua global state?

Norman Ramsey
 > The randomized seed is sufficient to protect against attacking a Lua-based
 > program by providing a fixed malicious input that reliably works across
 > runs.

Shouldn't it be possible to provide a malicious input that consists of
strings of length 32 to 40 that differ only in characters that don't
contribute to the hash?  That would work reliably across runs?

I'm trying to understand what sort of attack is being defended
against.  So far the only attack I understand is an attack against the
performance of strings that are used by the implementation, like "__index".
The seed renders such an attack impossible.

Is there any other attack that's defended against here?


Norman


Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu
In reply to this post by Roberto Ierusalimschy
On Wed, Mar 11, 2020 at 8:57 PM, Roberto Ierusalimschy <[hidden email]> wrote:

>
> > > What does this seed accomplish?  What problem does it solve?
> > >
> >
> > I suspect it is related to this:
> > https://stackoverflow.com/questions/9241230/what-is-murmurhash3-seed-parameter/9241429#9241429
> >
> > attackers may use the properties of a hash function to construct a denial
> > > of service attack. They could do this by providing strings to your hash
> > > function that all hash to the same value destroying the performance of your
> > > hash table. But if you use a different seed for each run of your program,
> > > the set of strings the attackers must use changes.
>
> Exactly.
>
> -- Roberto
>

Is there any way to specify a seed when I create a new Lua VM? It would
be very useful for reproducing bugs.
If the seed is always random, the order of table iteration is
different every time.

I hope you can add a new API to create a Lua VM with a given seed in the future.

--
http://blog.codingnow.com


Re: what is the role of the `seed` field in the Lua global state?

Roberto Ierusalimschy
In reply to this post by Norman Ramsey
>  > The randomized seed is sufficient to protect against attacking a Lua-based
>  > program by providing a fixed malicious input that reliably works across
>  > runs.
>
> Shouldn't it be possible to provide a malicious input that consists of
> strings of length 32 to 40 that differ only in characters that don't
> contribute to the hash?  That would work reliably across runs?
>
> I'm trying to understand what sort of attack is being defended
> against.  So far the only attack I understand is an attack against the
> performance of strings that are used by the implementation, like "__index".
> The seed renders such an attack impossible.
>
> Is there any other attack that's defended against here?

The problem you mentioned is not restricted to strings between 32 and
40 bytes, because often those strings go as keys to a table (e.g.,
collecting key-value pairs in a request) and can create collisions in
that table. That said, we thought about this issue at the time and
concluded it would not be a problem, but I cannot recall why :-)
(Maybe the program should prevent long keys in such a table?)

Anyway, you are right that it would be safer to avoid these collisions
in the internal table, at least. (That is, all short strings should use
all bytes for their hash.)

-- Roberto


Re: what is the role of the `seed` field in the Lua global state?

Luiz Henrique de Figueiredo
In reply to this post by 云风 Cloud Wu
> Is there any way to specify a seed when I create a new lua VM ? It's
> very useful for reproduction of bugs .
> If the seed is always random, the order of table iteration would be
> different every time.

Setting the seed is not enough to ensure reproduction of execution: if
your table contains functions or userdata as keys, then the order is
undefined because Lua uses memory addresses to compute key hashes when
keys are functions and userdata.
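
A toy Python illustration of why address-based hashing breaks reproducibility. It is not Lua's table code: `bucket_order`, the division by 8, and the addresses are all made up. The same three keys, placed at different addresses in two runs, are visited in a different order.

```python
def bucket_order(addresses, num_buckets=8):
    """Order in which a toy table would visit keys whose hash is their address."""
    # Dropping the low alignment bits before taking the modulus is purely
    # illustrative; a real implementation may mix the pointer differently.
    return [name for name, addr in
            sorted(addresses.items(),
                   key=lambda item: (item[1] // 8) % num_buckets)]


# Hypothetical addresses of the same three function objects in two runs.
run1 = {"f": 0x1000, "g": 0x1008, "h": 0x1010}
run2 = {"f": 0x2010, "g": 0x2000, "h": 0x2008}

print(bucket_order(run1))  # ['f', 'g', 'h']
print(bucket_order(run2))  # ['g', 'h', 'f']
```

A fixed string seed does nothing for these keys, since their positions depend on where the allocator happened to place them.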


Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu


> On Mar 12, 2020, at 22:39, Luiz Henrique de Figueiredo <[hidden email]> wrote:
>
> 
>>
>> Is there any way to specify a seed when I create a new lua VM ? It's
>> very useful for reproduction of bugs .
>> If the seed is always random, the order of table iteration would be
>> different every time.
>
> Setting the seed is not enough to ensure reproduction of execution: if
> your table contains functions or userdata as keys, then the order is
> undefined because Lua uses memory addresses to compute key hashes when
> keys are functions and userdata.
>

I can use my own lua_Alloc to manage the memory addresses. For example, I can map a fixed virtual address space as the memory heap.

On the other hand, it is easy to avoid using userdata or functions as keys, but strings are always used as keys.

The point is that we can find ways to reduce that uncertainty, but the seed of the Lua VM is randomly generated. It's out of our control :(

Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu
In reply to this post by Luiz Henrique de Figueiredo


> On Mar 12, 2020, at 22:39, Luiz Henrique de Figueiredo <[hidden email]> wrote:
>
> 
>>
>> Is there any way to specify a seed when I create a new lua VM ? It's
>> very useful for reproduction of bugs .
>> If the seed is always random, the order of table iteration would be
>> different every time.
>

There is another reason why I want to set the seed myself.

Our system uses thousands of Lua VMs in one process. I made a small patch to share the function protos (including strings) among these VMs, so I need these VMs to use the same seed, so that the strings' hashes can be used in any of these VMs.
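
The constraint can be sketched in Python. This is only a model of the situation described, not the patch itself: `SharedString`, `ToyVM`, `can_reuse`, and the keyed BLAKE2 hash are hypothetical stand-ins. A string object that caches its hash is only valid in VMs that would compute the same hash, which means VMs with the same seed.

```python
import hashlib


def string_hash(data, seed):
    """Seeded string hash; keyed BLAKE2 is a stand-in, not Lua's function."""
    digest = hashlib.blake2b(data, digest_size=8, key=seed).digest()
    return int.from_bytes(digest, "little")


class SharedString:
    """A string object that caches its hash, as an interned string would."""

    def __init__(self, data, seed):
        self.data = data
        self.hash = string_hash(data, seed)  # computed once, by the creating VM


class ToyVM:
    def __init__(self, seed):
        self.seed = seed

    def can_reuse(self, shared):
        """True if the cached hash equals what this VM would compute itself."""
        return shared.hash == string_hash(shared.data, self.seed)


vm_a = ToyVM(b"AAAAAAAA")
vm_b = ToyVM(b"BBBBBBBB")   # different seed
vm_c = ToyVM(b"AAAAAAAA")   # same seed as vm_a

name = SharedString(b"__index", vm_a.seed)
print(vm_a.can_reuse(name), vm_b.can_reuse(name), vm_c.can_reuse(name))
```

With a mismatched seed, the cached hash would point the VM at the wrong bucket of its own string table, so lookups of an equal string could miss.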

Re: what is the role of the `seed` field in the Lua global state?

Coda Highland


On Thu, Mar 12, 2020 at 10:14 AM 云风 <[hidden email]> wrote:


> > On Mar 12, 2020, at 22:39, Luiz Henrique de Figueiredo <[hidden email]> wrote:
> >
> > > Is there any way to specify a seed when I create a new lua VM ? It's
> > > very useful for reproduction of bugs .
> > > If the seed is always random, the order of table iteration would be
> > > different every time.
>
> There is another reason that I want to set the seed by myself.
>
> Our system use thousands lua VMs in one process. I make a small patch to share the function protos (including strings) among these VMs. So I need these VMs use the same seed, so that the strings' hash can be used in any of these VM.

If you're already patching the VM, it shouldn't be hard to just patch the seed assignment yourself.

/s/ Adam 

Re: what is the role of the `seed` field in the Lua global state?

Roberto Ierusalimschy
> If you're already patching the VM, it shouldn't be hard to just patch the
> seed assignment yourself.

Just define the macro luai_makeseed (see lstate.c).

-- Roberto


Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu


On Mar 13, 2020, at 02:03, Roberto Ierusalimschy <[hidden email]> wrote:

> > If you're already patching the VM, it shouldn't be hard to just patch the
> > seed assignment yourself.
>
> Just define the macro luai_makeseed (see lstate.c).
>
> -- Roberto

Defining luai_makeseed is not enough, because the seed is still random.

I think it would be better to let the macro luai_makeseed replace the whole makeseed function.

Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu
In reply to this post by Coda Highland


On Mar 12, 2020, at 23:58, Coda Highland <[hidden email]> wrote:

> On Thu, Mar 12, 2020 at 10:14 AM 云风 <[hidden email]> wrote:
>
> > > On Mar 12, 2020, at 22:39, Luiz Henrique de Figueiredo <[hidden email]> wrote:
> > >
> > > > Is there any way to specify a seed when I create a new lua VM ? It's
> > > > very useful for reproduction of bugs .
> > > > If the seed is always random, the order of table iteration would be
> > > > different every time.
> >
> > There is another reason that I want to set the seed by myself.
> >
> > Our system use thousands lua VMs in one process. I make a small patch to share the function protos (including strings) among these VMs. So I need these VMs use the same seed, so that the strings' hash can be used in any of these VM.
>
> If you're already patching the VM, it shouldn't be hard to just patch the seed assignment yourself.

I don't want to change the Lua API, and my patch doesn't change any Lua API, only the implementation.

And we can't change (assign) the seed after lua_newstate, so I can only use a global seed, which is ugly.


Re: what is the role of the `seed` field in the Lua global state?

Roberto Ierusalimschy
In reply to this post by 云风 Cloud Wu
> luai_makeseed is not enough, because the seed is still random.
>
> I think let the macro luai_makeseed replacing the whole makeseed function would be better.

Sorry, you are right. This is 5.3. I was looking at the code of 5.4, where it
already does that.

-- Roberto


Re: what is the role of the `seed` field in the Lua global state?

云风 Cloud Wu
In reply to this post by Coda Highland
On Thu, Mar 12, 2020 at 11:58 PM, Coda Highland <[hidden email]> wrote:
>
> If you're already patching the VM, it shouldn't be hard to just patch the seed assignment yourself.
>

I found that we don't even have to patch the VM  :)

https://gist.github.com/cloudwu/a094583029f15bb74fa6df194159d008

I wrote a small function `lua_changeseed(lua_State *L, unsigned int
seed);`  to change the seed.

--
http://blog.codingnow.com


Re: what is the role of the `seed` field in the Lua global state?

Philippe Verdy-2
That's not the best solution. It is only a temporary palliative that
probably does not even solve any practical problem.

Really, there should be a non-global seed that can be set on separate
generators. The global seed should only be used to instantiate a new
generator (and then in Lua you can provide a better generator scheme:
the current implementation is fast but very poor. It is insufficient
for security, as its hashing function is very easily attacked, too
predictable, and does not use enough bits for its internal state).

And what data will you feed to lua_changeseed()? You need
access to some other entropy source if you intend to use it for
security purposes; otherwise, all it can be used for is building
reproducible test cases with common sequences of numbers from the
PRNG, from a known constant seed used for testing purposes only (e.g.
coverage test units).


On Wed, Mar 25, 2020 at 11:38, 云风 Cloud Wu <[hidden email]> wrote:

>
> On Thu, Mar 12, 2020 at 11:58 PM, Coda Highland <[hidden email]> wrote:
> >
> > If you're already patching the VM, it shouldn't be hard to just patch the seed assignment yourself.
> >
>
> I found that we don't even have to patch the VM  :)
>
> https://gist.github.com/cloudwu/a094583029f15bb74fa6df194159d008
>
> I wrote a small function `lua_changeseed(lua_State *L, unsigned int
> seed);`  to change the seed.
>
> --
> http://blog.codingnow.com
>
