Some thoughts on security

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Some thoughts on security

Samuel Groß
Hi!

I was playing around with Lua a bit lately and noticed that it supported loading precompiled bytecode at runtime through `load` [1], which caught my attention. Looking at the source code, it seemed like no validation was performed on the input (actually, I could have just read the documentation for `load` it seems :P). I was able to confirm this with a quick run of the afl fuzzer [2], which found hundreds of crashes within a few minutes. After some digging around I found some previous threads ([3], [4]), suggesting that this was a fairly well known issue. Also, for example redis has this specific functionality disabled [5]. On the other hand, quoting from [6]: "We have always considered it unacceptable for a Lua program to be able to crash the host application. Lua should be a safe language.". This seems to be clearly violated here.

The crashes mostly seem to have two distinct causes:

    1. The bytecode parser does not validate its input

    2. The Lua VM itself considers the bytecode trusted

Fixing the issues from the first category should be possible without too much overhead. I've already started doing so, see below.

The easiest way to fix the issues from the second category is probably to validate the bytecode at load time by performing some simple checks on every instruction (i.e. check that JUMPs, LOADKs and register accesses are within the respective bounds, and make sure the code ends in a RETURN). This way there is no overhead at execution time.

Due to its simplicity, the Lua interpreter in general seems well suited to be exposed to potentially untrusted input. I think it would thus be a good idea to provide a short guide on how to configure the interpreter to safely use it in such a scenario. This would obviously include disabling (some functions from) the `io` and `os` modules, and disabling bytecode loading (for now). It could also contain a word or two about sandboxing the interpreter. As an example where this currently seems to go wrong consider https://www.lua.org/cgi-bin/demo, which is clearly exposed to untrusted input, but apparently hasn't disabled bytecode loading through `load` (I've also informed the website owners in a separate email).

It should be noted that it is actually possible to compromise (and not only crash) the host application if malicious bytecode is executed by the interpreter. I've written a small proof-of-concept exploit which manages to execute arbitrary shell commands, even though `os.execute` has been disabled in the interpreter. I can provide some more details when this problem has been fixed on lua.org. Exploitation is simplified by the fact that Lua provides addresses of objects through `tostring`, thus allowing for an easy way to bypass ASLR [7].

Below are some quick patches for all bugs in the parser itself that afl has found so far (i.e. bugs that cause a crash when just doing `load` without executing the resulting chunk afterwards):


    diff --git a/src/lundump.c b/src/lundump.c
    index 4080af9..c963535 100644
    --- a/src/lundump.c
    +++ b/src/lundump.c
    @@ -194,6 +194,8 @@ static void LoadDebug (LoadState *S, Proto *f) {
         f->locvars[i].endpc = LoadInt(S);
       }
       n = LoadInt(S);
    +  if (n > f->sizeupvalues)
    +      error(S, "Number of upvalues mismatch");
       for (i = 0; i < n; i++)
         f->upvalues[i].name = LoadString(S);
     }

This fixes an out-of-bounds write if the number of upvalues in the debug section of the file exceeds the number of previously loaded upvalues:


    diff --git a/src/lundump.c b/src/lundump.c
    index c963535..d0e63e6 100644
    --- a/src/lundump.c
    +++ b/src/lundump.c
    @@ -84,6 +84,10 @@ static lua_Integer LoadInteger (LoadState *S) {
       return x;
     }

    +// Taken from lstrlib.c TODO don't duplicate code here
    +#define MAX_SIZET    ((size_t)(~(size_t)0))
    +#define MAXSIZE  \
    +    (sizeof(size_t) < sizeof(int) ? MAX_SIZET : (size_t)(INT_MAX))

     static TString *LoadString (LoadState *S) {
       size_t size = LoadByte(S);
    @@ -91,6 +95,8 @@ static TString *LoadString (LoadState *S) {
         LoadVar(S, size);
       if (size == 0)
         return NULL;
    +  if (size > MAXSIZE)
    +    error(S, "string too large");
       else if (--size <= LUAI_MAXSHORTLEN) {  /* short string? */
         char buff[LUAI_MAXSHORTLEN];
         LoadVector(S, buff, size);

This limits the size of strings in the same way str_rep does it. This is likely still incomplete since as far as I can tell (string) allocations may still fail and return NULL, in which case a segfault will follow. This seems to be a more general issue though.


    diff --git a/src/lundump.c b/src/lundump.c
    index c963535..ba19563 100644
    --- a/src/lundump.c
    +++ b/src/lundump.c
    @@ -90,7 +90,7 @@ static TString *LoadString (LoadState *S) {
       if (size == 0xFF)
         LoadVar(S, size);
       if (size == 0)
    -    return NULL;
    +    error(S, "Invalid string");
       else if (--size <= LUAI_MAXSHORTLEN) {  /* short string? */
         char buff[LUAI_MAXSHORTLEN];
         LoadVector(S, buff, size);

This prevents a crash when LoadString returns NULL. NULL strings are supported by ldump.c (I'm not quite sure why though), but without this patch they'll cause a crash in LoadConstants.

There are some more issues that are more or less parser related, for example, the interpreter will crash if the number of arguments that a function expects is negative. These issues can likely be found by fuzzing and can probably also be fixed in the parser by limiting certain size values.

Looking forward to your opinions!

Best,
Samuel Groß


Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Daurnimator
On 12 December 2016 at 10:21, Samuel Groß <[hidden email]> wrote:
> I was playing around with Lua a bit lately and noticed that it supported
> loading precompiled bytecode at runtime through `load` [1], which caught my
> attention. Looking at the source code, it seemed like no validation was
> performed on the input (actually, I could have just read the documentation
> for `load` it seems :P). I was able to confirm this with a quick run of the
> afl fuzzer [2], which found hundreds of crashes within a few minutes. After
> some digging around I found some previous threads ([3], [4]), suggesting
> that this was a fairly well known issue.

See http://lua-users.org/lists/lua-l/2009-03/msg00039.html

tl;dr: Lua *did* have a bytecode verifier, but it had issues and there
was too much required to make it 'perfect'.
So rather than pay the code slowdown on load and the maintenance
burden, it was decided to remove the bytecode verifier entirely.

Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Pierre-Yves Gérardy
In reply to this post by Samuel Groß
On Mon, Dec 12, 2016 at 12:21 AM, Samuel Groß <[hidden email]> wrote:
> [6]: "We have always considered it unacceptable for a Lua program to be able
> to crash the host application. Lua should be a safe language.". This seems
> to be clearly violated here.

Beside what Daurnimator said in a parallel thread, the sentence you
quoted applies to Lua source code, not bytecode.

—Pierre-Yves

Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Soni "They/Them" L.
In reply to this post by Samuel Groß


On 11/12/16 09:21 PM, Samuel Groß wrote:
>
> [snip]
>

Here's some food for thought: Let's say you write a Lua VM that turns
Lua source code directly into x86 assembly.

1. How do you string.dump this? The easiest way would be to just dump
the whole x86 assembly, and as far as I'm aware the Lua reference manual
allows this.
2. When loading a raw x86 assembly bytecode, how do you validate it?
Well, you can't, unless you include a whole static analysis framework in
your VM, and even then you won't be able to validate that it's Lua - the
"bytecode" could in theory use sockets or something.

Lua bytecode is not and has never been safe. Lua bytecode has always
been implementation-defined. Don't rely on it, and don't trust it.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.


Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Nagaev Boris
In reply to this post by Pierre-Yves Gérardy


On 11 Dec 2016 4:00 pm, "Pierre-Yves Gérardy" <[hidden email]> wrote:
On Mon, Dec 12, 2016 at 12:21 AM, Samuel Groß <[hidden email]> wrote:
> [6]: "We have always considered it unacceptable for a Lua program to be able
> to crash the host application. Lua should be a safe language.". This seems
> to be clearly violated here.

Beside what Daurnimator said in a parallel thread, the sentence you
quoted applies to Lua source code, not bytecode.

—Pierre-Yves


Lua program can load bytecode using load function.
Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Luiz Henrique de Figueiredo
In reply to this post by Samuel Groß
For the record, it is perfectly safe to run bytecode from trusted sources,
such as resources built into your program. Running bytecode from untrusted
sources is of course dangerous. The point is that bytecode emitted by the
Lua compiler is safe; handcrafted malicious bytecode is not.

Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Pierre-Yves Gérardy
In reply to this post by Nagaev Boris
On Mon, Dec 12, 2016 at 1:11 AM, Nagaev Boris <[hidden email]> wrote:

> On 11 Dec 2016 4:00 pm, "Pierre-Yves Gérardy" <[hidden email]> wrote:
> On Mon, Dec 12, 2016 at 12:21 AM, Samuel Groß <[hidden email]> wrote:
>> [6]: "We have always considered it unacceptable for a Lua program to be
>> able
>> to crash the host application. Lua should be a safe language.". This seems
>> to be clearly violated here.
>
> Beside what Daurnimator said in a parallel thread, the sentence you
> quoted applies to Lua source code, not bytecode.
>
> Lua program can load bytecode using load function.

And the potential crashes are acknowledged at the end of
http://www.lua.org/manual/5.3/manual.html#pdf-load :

"Lua does not check the consistency of binary chunks. Maliciously
crafted binary chunks can crash the interpreter."

Also, you can call `load` in "t" mode, it will reject bytecode (it
would be nice it it was the default behavior).

—Pierre-Yves

Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

nobody
In reply to this post by Nagaev Boris
On 2016-12-12 01:11, Nagaev Boris wrote:
>     Beside what Daurnimator said in a parallel thread, the sentence you
>     quoted applies to Lua source code, not bytecode.
>
>     —Pierre-Yves
>
> Lua program can load bytecode using load function.

and os.execute "rm -rf /*" and …, so you obviously have to sandbox
anything untrusted.  Assuming you removed the debug library (and ensured
that the untrusted code can't get it back),

do
  local old_load = load
  function     load( src, src_what, _mode, env )
    return old_load( src, src_what,   "t", env )
  end
end

hard-codes the mode, forbidding binary chunks.  Without getupvalue, you
can't get it back.

-- Marco

Reply | Threaded
Open this post in threaded view
|

Re: Some thoughts on security

Rain Gloom
In reply to this post by Luiz Henrique de Figueiredo
There is also the path the Multi Theft Auto mod[1] took: they host a compiler[2] on a server that signs code, trying to load() bytecode that was not compiled there (ie.: string.dump-ed or emitted by luac) gives you a runtime error.
Not sure if that's a good idea though, I always found it weird that servers are so protective of their mods and want to compile and obfuscate everything.

[1] https://mtasa.com/
[2] https://luac.mtasa.com/

On Mon, Dec 12, 2016 at 12:17 AM, Luiz Henrique de Figueiredo <[hidden email]> wrote:
For the record, it is perfectly safe to run bytecode from trusted sources,
such as resources built into your program. Running bytecode from untrusted
sources is of course dangerous. The point is that bytecode emitted by the
Lua compiler is safe; handcrafted malicious bytecode is not.