[BUG] crash in Lua on table insertion

[BUG] crash in Lua on table insertion

Viacheslav Usov
This code:

local t = {}
for i = 1, 0x7fffffff do
t[i] = i
end

crashes the interpreter, within the call to computesizes(), as soon as i becomes 0x40000001. Here is the relevant portion of the stack:

computesizes(unsigned int * nums, unsigned int * pna) Line 229
rehash(lua_State * L, Table * t, const lua_TValue * ek) Line 392
luaH_newkey(lua_State * L, Table * t, const lua_TValue * key) Line 462
luaV_finishset(lua_State * L, const lua_TValue * t, lua_TValue * key, lua_TValue * val, const lua_TValue * slot) Line 214
luaV_execute(lua_State * L) Line 864
luaD_call(lua_State * L, lua_TValue * func, int nResults) Line 497
luaD_callnoyield(lua_State * L, lua_TValue * func, int nResults) Line 507
f_call(lua_State * L, void * ud) Line 943
luaD_rawrunprotected(lua_State * L, void(*)(lua_State *, void *) f, void * ud) Line 142
luaD_pcall(lua_State * L, void(*)(lua_State *, void *) func, void * u, __int64 old_top, __int64 ef) Line 727
lua_pcallk(lua_State * L, int nargs, int nresults, int errfunc, __int64 ctx, int(*)(lua_State *, int, __int64) k) Line 968

The Lua version is 5.3.3, as far as I can tell no relevant changes were made in 5.3.4.

The particular configuration I have encountered and reproduced this on is Windows 8.1, 64-bit mode, but I think this is not Windows-specific.

The crash happens in line 229 of ltable.c, where it has if (nums[i] > 0).

nums is passed from rehash, where it is defined to be an array of 32 unsigned ints. At the time of the crash, i is 1848, way outside the array bounds. The first 32 elements of nums at the time of the crash are:

[0x00000000] 0x00000001 unsigned int
[0x00000001] 0x00000001 unsigned int
[0x00000002] 0x00000002 unsigned int
[0x00000003] 0x00000004 unsigned int
[0x00000004] 0x00000008 unsigned int
[0x00000005] 0x00000010 unsigned int
[0x00000006] 0x00000020 unsigned int
[0x00000007] 0x00000040 unsigned int
[0x00000008] 0x00000080 unsigned int
[0x00000009] 0x00000100 unsigned int
[0x0000000a] 0x00000200 unsigned int
[0x0000000b] 0x00000400 unsigned int
[0x0000000c] 0x00000800 unsigned int
[0x0000000d] 0x00001000 unsigned int
[0x0000000e] 0x00002000 unsigned int
[0x0000000f] 0x00004000 unsigned int
[0x00000010] 0x00008000 unsigned int
[0x00000011] 0x00010000 unsigned int
[0x00000012] 0x00020000 unsigned int
[0x00000013] 0x00040000 unsigned int
[0x00000014] 0x00080000 unsigned int
[0x00000015] 0x00100000 unsigned int
[0x00000016] 0x00200000 unsigned int
[0x00000017] 0x00400000 unsigned int
[0x00000018] 0x00800000 unsigned int
[0x00000019] 0x01000000 unsigned int
[0x0000001a] 0x02000000 unsigned int
[0x0000001b] 0x04000000 unsigned int
[0x0000001c] 0x08000000 unsigned int
[0x0000001d] 0x10000000 unsigned int
[0x0000001e] 0x20000000 unsigned int
[0x0000001f] 0x00000001 unsigned int


And to be clear, I do not assume that Lua should have infinite tables. It would be good to get a regular Lua error like "table too big" when a table becomes too big, but not a crash.

Another thing I noticed is that the maximum table length (for whatever meaning of length) is not mentioned in the documentation, nor is the behaviour when an attempt is made to exceed it.

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Rodrigo Azevedo
> This code:
>
> local t = {}
> for i = 1, 0x7fffffff do
> t[i] = i
> end
>
>
[...]
> And to be clear, I do not assume that Lua should have infinite tables. It
> would be good to get a regular Lua error like "table too big" when a table
> becomes too big, but not a crash.

I am not concerned about the "BUG" itself, but I hope that tables
will be limited only by the available virtual memory, not by an
artificial "table too big" error, even for numerical keys.

--
Rodrigo Azevedo Moreira da Silva


Re: [BUG] crash in Lua on table insertion

Luiz Henrique de Figueiredo
In reply to this post by Viacheslav Usov
> The particular configuration I have encountered and reproduced this on is
> Windows 8.1, 64-bit mode, but I think this is not Windows-specific.

It crashed Lua 5.3.4 on Mac OS X, but on Linux it gave the error "not
enough memory".


Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
In reply to this post by Viacheslav Usov
> This code:
>
> local t = {}
> for i = 1, 0x7fffffff do
> t[i] = i
> end
>
> crashes the interpreter as soon as i becomes 0x40000001. within the call
> to computesizes(). Here is the relevant portion of the stack:
>
> [...]
>
> nums is passed from rehash, where it is defined to be an array of 32
> unsigned ints. At the time of the crash, i is 1848, way outside the array
> bounds. The first 31 elements of nums at the time of crash are:
>
> [...]

My machine does not have enough memory to reproduce the bug. Can you
give me the value of '*pna' when 'computesizes' is called?

Thanks,

-- Roberto


Re: [BUG] crash in Lua on table insertion

Pierre Chapuis
In reply to this post by Luiz Henrique de Figueiredo
May 11, 2017 2:02 PM, "Luiz Henrique de Figueiredo" <[hidden email]> wrote:

>> The particular configuration I have encountered and reproduced this on is
>> Windows 8.1, 64-bit mode, but I think this is not Windows-specific.
>
> It crashed Lua 5.3.4 in Mac OS X but in Linux it gave an error "not
> enough memory".


This is probably because on most distributions the Linux kernel is configured to overcommit memory by default.

Try to run `sudo sysctl vm.overcommit_memory=2` before trying to reproduce. I suspect it will fail with "not enough memory" as well.

--
Pierre Chapuis


Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
On Thu, May 11, 2017 at 2:52 PM, Pierre Chapuis <[hidden email]> wrote:

> This is probably because on most distributions the Linux kernel is configured to overcommit memory by default.

I'd say this is because one needs in excess of 16 GiB of RAM just for that one process to reach the problematic index value. If that much is not available, the failure mode is different.

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
In reply to this post by Roberto Ierusalimschy
On Thu, May 11, 2017 at 2:49 PM, Roberto Ierusalimschy <[hidden email]> wrote:

>  My machine does not have enough memory to reproduce the bug. Can you give me the value of '*pna' when 'computesizes' is called?

*pna = 0x40000001.
twotoi = 0
a = 0xccd4bc95
na = 0xccd4bc95
optimal = 0
i = 1832

Note the value of i I gave previously was different. So the other things might also be different, except that I am pretty sure *pna is always 0x40000001.

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> On Thu, May 11, 2017 at 2:49 PM, Roberto Ierusalimschy <
> [hidden email]> wrote:
>
> >  My machine does not have enough memory to reproduce the bug. Can you
> give me the value of '*pna' when 'computesizes' is called?
>
> *pna = 0x40000001.
> twotoi = 0
> a = 0xccd4bc95
> na = 0xccd4bc95
> optimal = 0
> i = 1832
 
Many thanks.


> Note the value of i I gave previously was different. So the other things
> might also be different, except that I am pretty sure *pna is always
> 0x40000001.

'twotoi' is overflowing, and then the loop goes wild until a crash, so
the final 'i' is noise.

With 0x40000001, a size of 0x80000000 for the array part is already
justified, so the algorithm tries to go to the next size (0x100000000)
to see if it can fill that, but 0x100000000 overflows 'twotoi',
which is an unsigned integer.

-- Roberto


Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> 'twotoi' is overflowing, and then the loop goes wild until a crash, so
> the final 'i' is noise.
>
> With 0x40000001, a size of 0x80000000 for the array part is already
> justified, so the algorithm tries to go to the next size (0x100000000)
> to see if it can fill that, but 0x100000000 overflows 'twotoi',
> which is an unsigned integer.

The obvious fix seems to be limiting the loop explicitly:

-  for (i = 0, twotoi = 1; *pna > twotoi / 2; i++, twotoi *= 2) {
+  for (i = 0, twotoi = 1; i <= MAXABITS && *pna > twotoi / 2; i++, twotoi *= 2) {

Can you try this fix? If possible, could you report the initial value
of '*pna' and the final value of 'optimal' in the last 10 calls to
'computesizes', with your original test?

Thanks again,

-- Roberto


Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
On Thu, May 11, 2017 at 9:24 PM, Roberto Ierusalimschy <[hidden email]> wrote:

> The obvious fix seems to be limiting the loop explicitly:

That seems to work. The original test finishes normally with it. When I tried to insert more than 0x7fffffff values, the process quickly allocated so much memory that the OS started to page very busily, so after 10-15 minutes of waiting I interrupted it. I then tried to use the # operator, which gave me a negative number: -2147483615. Then I scanned the table in a loop looking for the maximum key value, and found it was 2147483681, which is the unsigned 32-bit interpretation of the same binary value. Given this, I suspect that # uses a signed 32-bit integer internally, so if I had much more RAM, I would probably make that overflow even further, into a small positive number.

> If possible, could you report the initial value of '*pna' and the final value of 'optimal' in the last 10 calls to 'computesizes', with your original test?

First I did this, with unmodified Lua:

Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
> gt = {} local t = gt for i = 1, (0x40000001 - 10) do t[i] = i end print(#t)
1073741815

Then I set a breakpoint in computesizes and executed the following, line by line:

> gt[1073741816] = 1073741816
> gt[1073741817] = 1073741817
> gt[1073741818] = 1073741818
> gt[1073741819] = 1073741819
> gt[1073741820] = 1073741820
> gt[1073741821] = 1073741821
> gt[1073741822] = 1073741822
> gt[1073741823] = 1073741823
> gt[1073741824] = 1073741824
> gt[1073741825] = 1073741825

For each line, except the last one, the behaviour was identical: *pna = 0, the loop is skipped, optimal = 0 in the end. There were something like 5 calls into this function for each line.

For the last line, some initial calls were also like that, then there was one with *pna = 0x40000001, and that was the end of it.

I hope I did not make any mistake there, it was a bit tedious.

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> That seems to work. The original test finishes normally with it. When I
> tried to insert more than 0x7fffffff values, the process quickly allocated
> so much memory that the OS started to page very busily, so after 10-15
> minutes of waiting I interrupted that. I then tried to use the # operator,
> which gave me a negative number: -2147483615. Then I scanned the table in a
> loop looking for the max key value, and I found it was 2147483681. This is
> the unsigned 32-bit interpretation of the same binary value. Given this, I
> suspect that # uses a signed 32-bit integer internally, so if I had much
> more RAM, I would probably make that overflow even more, into a small
> positive number.

I will try to have a look at that.


> > If possible, could you report the initial value of '*pna' and the final
> value of 'optimal' in the last 10 calls to 'computesizes', with your
> original test?
>
> First I did this, with unmodified Lua:
>
> Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
> > gt = {} local t = gt for i = 1, (0x40000001 - 10) do t[i] = i end
> print(#t)
> 1073741815
>
> Then I set a breakpoint in computesizes and executed the following, line by
> line:
>
> > gt[1073741816] = 1073741816
> > gt[1073741817] = 1073741817
> > gt[1073741818] = 1073741818
> > gt[1073741819] = 1073741819
> > gt[1073741820] = 1073741820
> > gt[1073741821] = 1073741821
> > gt[1073741822] = 1073741822
> > gt[1073741823] = 1073741823
> > gt[1073741824] = 1073741824
> > gt[1073741825] = 1073741825
>
> For each line, except the last one, the behaviour was identical: *pna = 0,
> the loop is skipped, optimal = 0 in the end. There were something like 5
> calls into this function for each line.
>
> For the last line, some initial calls were also like that, then there was
> one with *pna = 0x40000001, and that was the end of it.
>
> I hope I did not make any mistake there, it was a bit tedious.

The calls with *pna = 0 are probably tables created by the parser
being reallocated, as Lua compiles each line. As the array part grows
by powers of 2, after the insertion of key 0x20000001 the array part
was already sized 0x40000000, so only the last line triggered another
resizing in 'gt'. Everything seems normal here.

Thanks again,

-- Roberto


Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
In reply to this post by Viacheslav Usov
> That seems to work. The original test finishes normally with it. When I
> tried to insert more than 0x7fffffff values, the process quickly allocated
> so much memory that the OS started to page very busily, so after 10-15
> minutes of waiting I interrupted that. I then tried to use the # operator,
> which gave me a negative number: -2147483615. Then I scanned the table in a
> loop looking for the max key value, and I found it was 2147483681. This is
> the unsigned 32-bit interpretation of the same binary value. Given this, I
> suspect that # uses a signed 32-bit integer internally, so if I had much
> more RAM, I would probably make that overflow even more, into a small
> positive number.

The problem seems to be simply the return types of 'luaH_getn' and
'unbound_search'. (Again, I have no way to test it.) Both operate
internally with unsigned integers, but are returning signed integers.

-- Roberto


Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> > That seems to work. The original test finishes normally with it. When I
> > tried to insert more than 0x7ffffff values, the process quickly allocated
> > so much memory that the OS started to page very busily, so after 10-15
> > minutes of waiting I interrupted that. I then tried to use the # operator,
> > which gave me a negative number: -2147483615. Then I scanned the table in a
> > loop looking for the max key value, and I found it was 2147483681. This is
> > the unsigned 32-bit interpretation of the same binary value. Given this, I
> > suspect that # uses a signed 32-bit integer internally, so if I had much
> > more RAM, I would probably make that overflow even more, into a small
> > positive number.
>
> The problem seems to be simply the return types of 'luaH_getn' and
> 'unbound_search'. (Again, I have no way to test it.) Both operate
> internally with unsigned integers, but are returning signed integers.

It is not that simple. There are some other problems in 'unbound_search'
(such as an explicit limit for keys larger than ints).

-- Roberto


Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
In reply to this post by Viacheslav Usov
On Thu, May 11, 2017 at 1:15 PM, Viacheslav Usov <[hidden email]> wrote:

> And to be clear, I do not assume that Lua should have infinite tables. It would be good to get a regular Lua error like "table too big" when a table becomes too big, but not a crash.

> Another thing I noticed is that the maximum table length (for whatever meaning of length) is not mentioned in the documentation, nor is the behaviour when an attempt is made to exceed it.

I can see in the GitHub/Lua repo that the table implementation is still being changed. What I also see is that most of the code, and even the definition of the table type, uses unsigned or signed ints to deal with the array part of the table. My question is: is the current thinking that the array part must be only as big as the underlying int type may address? Has it been considered to allow greater ranges for the array part?

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> My question is, is the current thinking that the
> array part must be only as big as the underlying int type may address?

This has been the "current thinking" in Lua for a long time.


> Has it been considered to allow greater ranges for the array part?

Yes. They cannot be larger than size_t, which is equal to unsigned int
in most 32-bit machines. size_t, for its part, can be larger, equal, or
smaller than lua_Integer. We could switch to unsigned int, but we still
assume that 'long long' may not exist, which means that the platform may
have no type larger than unsigned int. Things get too messy.
And, of course, using larger numbers may have a performance impact
(mostly in the rehashing algorithm, but also in the size of a Table node
and in some other parts) that everybody will pay, even those not needing
arrays larger than 34GB. (Maybe expanding it to 'unsigned int' could be
done with little impact, if there is a demand for that.)

-- Roberto



Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
On Mon, Jan 8, 2018 at 1:27 PM, Roberto Ierusalimschy <[hidden email]> wrote:

>> Has it been considered to allow greater ranges for the array part?

> Yes. They cannot be larger than size_t, which are equal to unsigned int in most 32-bit machines. size_t, for its part, can be larger, equal, or smaller than lua_Integer.

I would say that should be whichever is smaller of lua_Integer and size_t. And, of course, it would be nice to be able to override that explicitly for some exotic situations where that choice is really bad.

So the above won't affect (sane) 32-bit platforms, and I doubt that (sane) 64-bit platforms will be affected much. Looking at struct Table, the sizearray field is followed by a pointer field, both aligned on a 64-bit boundary on a typical 64-bit platform, so making sizearray a 64-bit type probably would not change the overall size at all. But, of course, I do not know enough to predict the impact on the algorithms and their implementation.

> ... if there is a demand for that.

There was a sentiment earlier in this thread, which I share, that the table size should only be limited by the available memory. While I am talking about a different issue here, the size of the array part, the very fact that I reported this bug means it is possible to run into a situation when the array part becomes larger than afforded by an int. I have not used the current dev. version of Lua, so I cannot even say that the bug is fixed for me, much less what would happen if I tried to use really large arrays, but I have grounds to believe that at least in some scenarios there might be a sharp difference in performance when a table, which is logically an array, can no longer be represented internally as an array, even if there is plenty of memory still available.

Cheers,
V.



Re: [BUG] crash in Lua on table insertion

Roberto Ierusalimschy
> Looking at struct Table, the
> sizearray field is followed by a pointer field, which are both aligned on
> 64-bit boundary on a typical 64-bit platform, so by making sizearray a
> 64-bit type, the overall size is probably not going to change at all.

In my machine it goes from 56 to 64 bytes. See what comes before that
field.

-- Roberto


Re: [BUG] crash in Lua on table insertion

Viacheslav Usov

On Mon, Jan 8, 2018 at 2:25 PM, Roberto Ierusalimschy <[hidden email]> wrote:

> In my machine it goes from 56 to 64 bytes. See what comes before that

Ah, right you are. So the int field is strategically placed in an area that would otherwise be wasted. That makes my argument weaker for sure.

Cheers,
V.

Re: [BUG] crash in Lua on table insertion

Viacheslav Usov
In reply to this post by Viacheslav Usov
The below has been listed as a bug in 5.3.4 (https://www.lua.org/bugs.html#5.3.4-2) for over a year, and there is no official fix. What are the plans for this bug? Another customer has run into it in the meantime.

Cheers,
V.

On Thu, May 11, 2017 at 1:15 PM, Viacheslav Usov <[hidden email]> wrote:
[...]


Re: [BUG] crash in Lua on table insertion

Albert Chan
On May 16, 2018, at 11:22 AM, Viacheslav Usov <[hidden email]> wrote:

The below has been listed as a bug in 5.3.4 https://www.lua.org/bugs.html#5.3.4-2 for over a year, and there is no official fix. What are the plans for this bug? Another customer has run into this bug in the mean time...

[...]

On my laptop (Windows 7), I tried Lua 5.3.4, Lua 5.4.0-work1, and LuaJIT 1.1.8.
All were unable to fill tables beyond 32M entries (0x02000000).
I got the error "not enough memory", but no crashes.