Real-World Impact of Hash DoS in Lua

classic Classic list List threaded Threaded
49 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Real-World Impact of Hash DoS in Lua

John Graham-Cumming
I've been following the discussion about the hash DoS attack that is quite serious in Lua because of the fact that strings > 32 bytes are not fully hashed and because the hashing algorithm is not randomly seeded.  I need to protect against this attack in a real world situation.  I am working with an unnamed (because I don't want to see an attack on it) very, very large Internet company that uses Lua and this attack would have serious consequences for them.  

They have taken steps at the HTTP server level to mitigate the problem, but they would like to be able to handle arbitrary strings in Lua without having to worry about this attack.  The nature of their business means that simple fixes (like reducing the number of POST parameters made to a server) won't totally fix the problem for them (because they process other arbitrary data that they don't control, and an attacker could, using Lua).

Having looked into building a patched Lua for this the solution would seem to be the following:

1. Randomize the hash seed.  In the patch I developed I generate a new unsigned int using rand() and store it in the global state and then use it to initialize the hash value instead of the string length (as is done today).

2. Don't hash large strings.  In this situation a value of 128 bytes can be counted as large.  Any strings above this can be stored in a simple GCObject** list.  The hash value of the large strings (which may be needed if the string is to be inserted into a table) could just be a random 32 bit number.  Given that eqstr assumes the same pointer for the same string a different operation would need to be defined is len(string)>128.

3. Fully hash short strings.  For strings less than 128 bytes all bytes need to be used.  This will protect against the DoS attack when combined with (1) above.

I'm willing to work on a patch for Lua if folks are likely to be interested.

John.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

David Heiko Kolf
John Graham-Cumming wrote:
> I'm willing to work on a patch for Lua if folks are likely to be interested.

I already wrote a patch, but I guess it would be interesting to see
another solution, too. (I just added a balanced tree). However, I
received no feedback to my patch, so I don't know how big the interest
is for other people on this list.

(My message was http://lua-users.org/lists/lua-l/2012-01/msg00110.html
-- unfortunately the mailing list archive merged the two text files and
now it is hard to tell which patch is for Lua 5.1 and which is for Lua
5.2...)

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

John Graham-Cumming
On Thu, Jan 19, 2012 at 14:37, David Kolf <[hidden email]> wrote:
John Graham-Cumming wrote:
> I'm willing to work on a patch for Lua if folks are likely to be interested.

I already wrote a patch, but I guess it would be interesting to see
another solution, too. (I just added a balanced tree). However, I
received no feedback to my patch, so I don't know how big the interest
is for other people on this list.

Yes, I saw that message.  It's proposing something a bit different to my proposal as it changes the behavior for all strings.  Not sure, of course, which of the two the Lua maintainers would prefer.

John.
 
Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Natanael Copa
In reply to this post by John Graham-Cumming
On Thu, Jan 19, 2012 at 2:47 PM, John Graham-Cumming <[hidden email]> wrote:
> I've been following the discussion about the hash DoS attack that is quite
> serious in Lua because of the fact that strings > 32 bytes are not fully
> hashed and because the hashing algorithm is not randomly seeded.  I need to
> protect against this attack in a real world situation.  I am working with an
> unnamed (because I don't want to see an attack on it) very, very large
> Internet company that uses Lua and this attack would have serious
> consequences for them.

FWIW, I also use Lua in networking services and would very msuch like
to see an official patch for this fairly soon.

> Having looked into building a patched Lua for this the solution would seem
> to be the following:
>
> 1. Randomize the hash seed.  In the patch I developed I generate a new
> unsigned int using rand() and store it in the global state and then use it
> to initialize the hash value instead of the string length (as is done
> today).

agree.

> 2. Don't hash large strings.  In this situation a value of 128 bytes can be
> counted as large.  Any strings above this can be stored in a simple
> GCObject** list.  The hash value of the large strings (which may be needed
> if the string is to be inserted into a table) could just be a random 32 bit
> number.  Given that eqstr assumes the same pointer for the same string a
> different operation would need to be defined is len(string)>128.
>
> 3. Fully hash short strings.  For strings less than 128 bytes all bytes need
> to be used.  This will protect against the DoS attack when combined with (1)
> above.

How about randomizing which bytes that are skipped? That would be less
intrusive but would it be good enough?

something like this:

(hash_seed and random_skip are initialized at startup from
/dev/urandom or similar)

--- ./lstring.c.orig
+++ ./lstring.c
@@ -74,10 +74,10 @@

 TString *luaS_newlstr (lua_State *L, const char *str, size_t l) {
   GCObject *o;
-  unsigned int h = cast(unsigned int, l);  /* seed */
+  unsigned int h = hash_seed;  /* seed */
   size_t step = (l>>5)+1;  /* if string is too long, don't hash all
its chars */
   size_t l1;
-  for (l1=l; l1>=step; l1-=step)  /* compute hash */
+  for (l1=l - (random_skip % step); l1>=step; l1-=step)  /* compute hash */
     h = h ^ ((h<<5)+(h>>2)+cast(unsigned char, str[l1-1]));
   for (o = G(L)->strt.hash[lmod(h, G(L)->strt.size)];
        o != NULL;


> I'm willing to work on a patch for Lua if folks are likely to be interested.
>
> John.
>


--
Natanael Copa

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

John Graham-Cumming
On Thu, Jan 19, 2012 at 15:31, Natanael Copa <[hidden email]> wrote:
> 3. Fully hash short strings.  For strings less than 128 bytes all bytes need
> to be used.  This will protect against the DoS attack when combined with (1)
> above.

How about randomizing which bytes that are skipped? That would be less
intrusive but would it be good enough?

That doesn't work because as the string gets long an attacker can flip a couple of characters and get a new string with a hash crash with high probability.  Even though they don't know which characters are not used they can get hash collisions just be doing a small number of random flips.

You can decide the probability with which you want someone to be able to get a hash crash by the percentage of the string that you use for the hash.

John.
 
Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

John Graham-Cumming
In reply to this post by John Graham-Cumming
On Thu, Jan 19, 2012 at 13:39, John Graham-Cumming <[hidden email]> wrote:
2. Don't hash large strings.  In this situation a value of 128 bytes can be counted as large.  Any strings above this can be stored in a simple GCObject** list.  The hash value of the large strings (which may be needed if the string is to be inserted into a table) could just be a random 32 bit number.  Given that eqstr assumes the same pointer for the same string a different operation would need to be defined is 
len(string)>128.

Actually the right thing to do is have a 0 hash value for these strings and then when the string is actually used in a table by a Lua programmer hash the entire string to get its value.  That way the security of the system is protected (because a randomized hash of the full string is used), but the performance is still good as the hashing is delayed until needed.

John.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Roberto Ierusalimschy
> Actually the right thing to do is have a 0 hash value for these strings and
> then when the string is actually used in a table by a Lua programmer hash
> the entire string to get its value.  That way the security of the system is
> protected (because a randomized hash of the full string is used), but the
> performance is still good as the hashing is delayed until needed.

This is more in line with what we are considering. Also remember that we
can use the variant bits to differentiate short strings from long strings.
So, short strings can be compared with pointer equality and long strings
with memcmp; short strings are never compared with long strings (as
their tags are already different).

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

William Ahern
In reply to this post by John Graham-Cumming
On Thu, Jan 19, 2012 at 01:47:16PM +0000, John Graham-Cumming wrote:
<snip>
> 1. Randomize the hash seed.  In the patch I developed I generate a new
> unsigned int using rand() and store it in the global state and then use it
> to initialize the hash value instead of the string length (as is done
> today).

rand() is anything but random. Likewise for random(). They're extremely
predictable. You're going to have to go platform specific. For OpenBSD or OS
X, for example, use arc4random(). For Linux use sysctl() and mib[] = {
CTL_KERN, KERN_RANDOM, RANDOM_UUID }.

Trying to use /dev random devices is broken for security conscious
applications that have already called chroot().


Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

John Graham-Cumming
In reply to this post by Roberto Ierusalimschy
On Thu, Jan 19, 2012 at 18:02, Roberto Ierusalimschy <[hidden email]> wrote:
> Actually the right thing to do is have a 0 hash value for these strings and
> then when the string is actually used in a table by a Lua programmer hash
> the entire string to get its value.  That way the security of the system is
> protected (because a randomized hash of the full string is used), but the
> performance is still good as the hashing is delayed until needed.

This is more in line with what we are considering. Also remember that we
can use the variant bits to differentiate short strings from long strings.
So, short strings can be compared with pointer equality and long strings
with memcmp; short strings are never compared with long strings (as
their tags are already different).

OK.  Cool.  This is a showstopper for the company I am working with for rolling out embedded Lua with nginx.  Is there anything I can do to help?

John.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Alexander Gladysh
In reply to this post by Roberto Ierusalimschy
On Thu, Jan 19, 2012 at 22:02, Roberto Ierusalimschy
<[hidden email]> wrote:
>> Actually the right thing to do is have a 0 hash value for these strings and
>> then when the string is actually used in a table by a Lua programmer hash
>> the entire string to get its value.  That way the security of the system is
>> protected (because a randomized hash of the full string is used), but the
>> performance is still good as the hashing is delayed until needed.

> This is more in line with what we are considering. Also remember that we
> can use the variant bits to differentiate short strings from long strings.
> So, short strings can be compared with pointer equality and long strings
> with memcmp; short strings are never compared with long strings (as
> their tags are already different).

1) May we hope to get a fix for this issue backported to 5.1?

2) Anyone know about Mike Pall's position on the issue? I haven't seen
his reply in any of the related threads, but, maybe, I missed
something?

Alexander.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Natanael Copa
In reply to this post by William Ahern
On Thu, Jan 19, 2012 at 7:11 PM, William Ahern
<[hidden email]> wrote:

> rand() is anything but random. Likewise for random(). They're extremely
> predictable. You're going to have to go platform specific. For OpenBSD or OS
> X, for example, use arc4random(). For Linux use sysctl() and mib[] = {
> CTL_KERN, KERN_RANDOM, RANDOM_UUID }.

Please don't assume Linux == glibc. Alpine Linux has arc4random() too.

--
Natanael Copa

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Rob Kendrick-2
In reply to this post by William Ahern
On Thu, Jan 19, 2012 at 10:11:11AM -0800, William Ahern wrote:

> On Thu, Jan 19, 2012 at 01:47:16PM +0000, John Graham-Cumming wrote:
> <snip>
> > 1. Randomize the hash seed.  In the patch I developed I generate a new
> > unsigned int using rand() and store it in the global state and then use it
> > to initialize the hash value instead of the string length (as is done
> > today).
>
> rand() is anything but random. Likewise for random(). They're extremely
> predictable. You're going to have to go platform specific. For OpenBSD or OS
> X, for example, use arc4random(). For Linux use sysctl() and mib[] = {
> CTL_KERN, KERN_RANDOM, RANDOM_UUID }.

How about use the pointer to one of Lua's symbols?  If you care about
security, you'll have address space randomisation.  If you don't have
address space randomisation, you clearly don't care about security :)

> Trying to use /dev random devices is broken for security conscious
> applications that have already called chroot().

Why?  I suppose it means that anybody who broken into the chroot can
make /dev/random block, but is that much of an issue when you also have
/dev/urandom?

B.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Jay Carlson
In reply to this post by John Graham-Cumming
On Thu, Jan 19, 2012 at 6:15 PM, John Graham-Cumming <[hidden email]> wrote:
> On Thu, Jan 19, 2012 at 18:02, Roberto Ierusalimschy
> <[hidden email]> wrote:
>> [hash collision resistance]

> OK.  Cool.  This is a showstopper for the company I am working with for
> rolling out embedded Lua with nginx.

I think showstopper might be a bit strong, given all the other DoS
attacks you are probably not defending against. Keeping other O(n^2)
small by keeping n small is a good idea. Setting a reasonably short
alarm()/itimer watchdog to mark bedtime for any particular request is
good too. (I liked the idea of turning all opcodes into RETURNs.)
Depending on the kind of time being measured, this can also help
against other kinds of resource exhaustion such as running out of
memory, IOPS, or entropy.

The reason the web works is that "500 I'm Having A Bad Day" is not the
end of the world. And if somebody is DoSing the server, the server is
being honest.

Lua should make it hard to get a generalizable technique for a big
amplification of work. (This is why I still think seeded
content-dependent sampling might help if we're limited to touching the
hash function--and you can always turn the existing big string
threshold up to MAXINT.) But I worry you're planning in terms of
prevention rather than cost.

Jay

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

William Ahern
In reply to this post by Rob Kendrick-2
On Fri, Jan 20, 2012 at 07:46:39PM +0900, Rob Kendrick wrote:

> On Thu, Jan 19, 2012 at 10:11:11AM -0800, William Ahern wrote:
> > On Thu, Jan 19, 2012 at 01:47:16PM +0000, John Graham-Cumming wrote:
> > <snip>
> > > 1. Randomize the hash seed.  In the patch I developed I generate a new
> > > unsigned int using rand() and store it in the global state and then use it
> > > to initialize the hash value instead of the string length (as is done
> > > today).
> >
> > rand() is anything but random. Likewise for random(). They're extremely
> > predictable. You're going to have to go platform specific. For OpenBSD or OS
> > X, for example, use arc4random(). For Linux use sysctl() and mib[] = {
> > CTL_KERN, KERN_RANDOM, RANDOM_UUID }.
>
> How about use the pointer to one of Lua's symbols?  If you care about
> security, you'll have address space randomisation.  If you don't have
> address space randomisation, you clearly don't care about security :)

You'd have to do it for multiple symbols, since the placement of each symbol
is going to be in a much narrower window then 32-bits.

I don't think ALSR helps with statically linked apps, does it?

> > Trying to use /dev random devices is broken for security conscious
> > applications that have already called chroot().
>
> Why?  I suppose it means that anybody who broken into the chroot can
> make /dev/random block, but is that much of an issue when you also have
> /dev/urandom?

Ideally you want to chroot to an empty directory, without write permissions,
and without the ability to use devices (e.g. nodev mount option). Those
things aren't redundant when the purpose is to reduce the kernel attack
surface.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Roberto Ierusalimschy
In reply to this post by John Graham-Cumming
> OK.  Cool.  This is a showstopper for the company I am working with for
> rolling out embedded Lua with nginx.  Is there anything I can do to help?

what it is still missing now is how to create the initial per-state
random seed. Suggestions included some address and arc4random. I am
afraid that, for the backup ANSI implementation, we cannot do much
better than something like this:

  seed = (unsigned int)time() + (unsigned int)L;

We can have better implementations for particular system. For instance,
we can use arc4random if present, but how to detect it? Are there any
other suggestions?

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

William Ahern
On Wed, Jan 25, 2012 at 08:26:04PM -0200, Roberto Ierusalimschy wrote:

> > OK.  Cool.  This is a showstopper for the company I am working with for
> > rolling out embedded Lua with nginx.  Is there anything I can do to help?
>
> what it is still missing now is how to create the initial per-state
> random seed. Suggestions included some address and arc4random. I am
> afraid that, for the backup ANSI implementation, we cannot do much
> better than something like this:
>
>   seed = (unsigned int)time() + (unsigned int)L;
>
> We can have better implementations for particular system. For instance,
> we can use arc4random if present, but how to detect it? Are there any
> other suggestions?
>

What about a global function pointer, like:

        extern unsigned int (*lua_random)(void);

My DNS library uses a complementary, compile-time constant and some ANSI
standard macro magic to make it easy to specify which method to use. Here's
a working example, slightly modified from my original scheme. It just
requires that LUA_RANDOM be defined to one of the identifiers `time',
`arc4random', or `RAND_bytes'.

        #define PASTE(a, b) a ## b
        #define XPASTE(a, b) PASTE(a, b)

        #define LUA_PRNG_time         1
        #define LUA_PRNG_arc4random   2
        #define LUA_PRNG_RAND_bytes   3 /* OpenSSL */

        #define LUA_PRNG(which) \
                (XPASTE(LUA_PRNG_, which) == XPASTE(LUA_PRNG_, LUA_RANDOM))

        #if LUA_PRNG(time)
        #include <time.h>
        #elif LUA_PRNG(arc4random)
        #include <stdlib.h>
        #elif LUA_PRNG(RAND_bytes)
        #include <openssl/rand.h>
        #endif

        unsigned int lua_random(void) {
        #if LUA_PRNG(RAND_bytes)
                unsigned r;

                assert(1 == RAND_bytes((unsigned char *)&r, sizeof r));

                return r;
        #else
                return LUA_RANDOM();
        #endif
        } /* lua_random() */

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Jerome Vuarand
In reply to this post by Roberto Ierusalimschy
2012/1/25 Roberto Ierusalimschy <[hidden email]>:

>> OK.  Cool.  This is a showstopper for the company I am working with for
>> rolling out embedded Lua with nginx.  Is there anything I can do to help?
>
> what it is still missing now is how to create the initial per-state
> random seed. Suggestions included some address and arc4random. I am
> afraid that, for the backup ANSI implementation, we cannot do much
> better than something like this:
>
>  seed = (unsigned int)time() + (unsigned int)L;
>
> We can have better implementations for particular system. For instance,
> we can use arc4random if present, but how to detect it? Are there any
> other suggestions?

Well, if these needs for "randomization for increased security" are
gaining weight, maybe a secondary lua_newstate taking a seed argument
would solve the problem, that seed being used for all (present and
future) randomizations needs. All embeddings of Lua that want to be
secure would have to find a way to provide a secure seed. And for the
stock command line interpreter, you could use your time()-based
proposal, and eventually accept a command line option to use another
seed.

As for the actual method to pass the seed to the lua_newstate
function, if adding an argument is going the wrong way (adding a new
argument to newstate every time a problem arise), you can have
newstate accept a structure with several initialization options, the
allocator and the seed being the only fields for the moment, with the
possibility to extend that structure in the future:

typedef struct {
    lua_Alloc f;
    void* ud;
    int seed;
} lua_Init;

lua_State* lua_newstate2(lua_Init* init);

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Sean Conner
In reply to this post by Roberto Ierusalimschy
It was thus said that the Great Roberto Ierusalimschy once stated:

> > OK.  Cool.  This is a showstopper for the company I am working with for
> > rolling out embedded Lua with nginx.  Is there anything I can do to help?
>
> what it is still missing now is how to create the initial per-state
> random seed. Suggestions included some address and arc4random. I am
> afraid that, for the backup ANSI implementation, we cannot do much
> better than something like this:
>
>   seed = (unsigned int)time() + (unsigned int)L;
>
> We can have better implementations for particular system. For instance,
> we can use arc4random if present, but how to detect it? Are there any
> other suggestions?

  Under Solaris and Linux (both of which I use) I read from /dev/random or
/dev/urandom.  Here's the function I use [1]:

static int math_randomseed(lua_State *const L)
{
  FILE         *fp;
  unsigned int  seed;

  assert(L != NULL);

  if (lua_toboolean(L,1))
  {
    fp = fopen("/dev/random","rb");
    if (fp == NULL)
      return luaL_error(L,"The NSA is keeping you from seeding your RNG");
  }
  else
  {
    fp = fopen("/dev/urandom","rb");
    if (fp == NULL)
      return luaL_error(L,"cannot seed RNG");
  }

  fread(&seed,sizeof(seed),1,fp);
  fclose(fp);
  srand(seed);
  lua_pushnumber(L,seed);
  return 1;
}

  -spc (And if Roberto wants to use the code, I can relicense this under the
        appropriate terms ... )

[1] code can be see at
https://github.com/spc476/lua-conmanorg/blob/master/src/math.c

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Florian Weimer
In reply to this post by Roberto Ierusalimschy
* Roberto Ierusalimschy:

>> OK.  Cool.  This is a showstopper for the company I am working with for
>> rolling out embedded Lua with nginx.  Is there anything I can do to help?
>
> what it is still missing now is how to create the initial per-state
> random seed. Suggestions included some address and arc4random. I am
> afraid that, for the backup ANSI implementation, we cannot do much
> better than something like this:
>
>   seed = (unsigned int)time() + (unsigned int)L;

Addresses of a stack variable and a public Lua function should provide
a few bits of randomness, too.

Instead of addition, you should use one of Bob Jenkin's mixing
functions.  This means that it's not important where the randomness is
inside the initial values (sometimes, it's in the lower bits,
sometimes, somewhere in the middle).

Reading from /dev/urandom might be problematic because drains entropy
from the whole system.

> We can have better implementations for particular system. For
> instance, we can use arc4random if present, but how to detect it?
> Are there any other suggestions?

Have you tried using PATRICIA and similar trie structures for
interning strings?  I'm just curious how large the performance impact
would be.  I suspect it's so large that it's not worth it.

Reply | Threaded
Open this post in threaded view
|

Re: Real-World Impact of Hash DoS in Lua

Florian Weimer
In reply to this post by John Graham-Cumming
* John Graham-Cumming:

> I'm willing to work on a patch for Lua if folks are likely to be
> interested.

For 5.1, I've got a patch that replaces the string hash function with
the de-facto industry standard (one of Bob Jenkin's functions).  I
could tidy up the patch and post it if there is enough interest.

123