Possible bug relating to table sizes

Andrew Scott
I was working on some code that involved copying indexed tables, and noticed that copying an indexed table T1 that was missing a [1] key (so that #T1 == 0) yielded a table T2 that would return the length as if the [1] key were in T2. Below is the simplest code that I could get to reproduce this. Removing any of the keys 2, 3, or 4 from T1 makes both tables return a size of 0, but adding additional keys after 4 to T1 simply causes T2 to reflect the new highest key.

----------------------------------------
T1={
    [2]={},
    [3]={},
    [4]={},
}

T2 = {}
for k, v in pairs(T1) do
    T2[k] = v
end

print(#T1, #T2)
assert(#T1 == #T2)  -- will fail
----------------------------------------

After playing around a bit more, I discovered that this occurs with any table that is filled by assignment rather than by a table constructor. The following is the simplest demonstration of this:

----------------------------------------
T1={}
for i = 2, 4 do
    T1[i] = {}
end
print(#T1)
----------------------------------------

However, there seems to be some strange pattern that determines the number of keys that must be in the table for the size to not equal 0, and it depends on the first key in the table (the one closest to 1). This is demonstrated below (the first number printed is the first key and the second number is the last key, e.g. 2 and 4 for the first example in this message). Raise the upper bound for initial as high as you wish to see the pattern continue. Looking at the output, I realized that when the max key is a power of 2 (256, 512, 1024, and so on), the next max key jumps up to a number much higher than the last. Also, at these powers of 2, the max key is double the min key.

----------------------------------------
for initial = 2, 100 do
    local size = 0
    while true do
        size = size + 1
        T1={}
        for key = initial, initial+size do
            T1[key] = true
        end
        
        if #T1 ~=0 then
            print(initial, initial+size)
            break
        end
    end
end
----------------------------------------

In fact, when you only look for these numbers, the results are quite pretty:

----------------------------------------
initial = 2
while true do
    if math.floor(math.log(initial)/math.log(2)) == math.log(initial)/math.log(2) then -- an exact power of 2
        local size = initial*2 - initial
        T1={}
        for key = initial, initial+size do
            T1[key] = true
        end
        
        if #T1 ~=0 then
            print(initial, initial+size)
        end
    end
    initial = initial*2
    collectgarbage("collect") -- collecting garbage allows for one more iteration before we run out of memory
end
----------------------------------------


I don’t know why any of this is happening, but it seems extremely buggy to me. Why are my tables that are missing a [1] key returning a size? ipairs() will not traverse them, but (for i=1, #table) will? This should not be.
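
For comparison, the highest integer key can be found without relying on '#' at all. The following is only a sketch: max_int_key is a hypothetical helper, not something from the standard library, and it walks the whole table with pairs(), so it costs time proportional to the table size.

```lua
-- Hypothetical helper (not part of Lua): largest positive integer key in t,
-- or 0 if there is none. Unlike '#', the result does not depend on how the
-- table was built.
local function max_int_key(t)
    local max = 0
    for k in pairs(t) do
        if type(k) == "number" and k == math.floor(k) and k > max then
            max = k
        end
    end
    return max
end

local T1 = { [2] = {}, [3] = {}, [4] = {} }

local T2 = {}
for k, v in pairs(T1) do
    T2[k] = v
end

-- Both calls report the same highest key, whatever #T1 and #T2 say.
print(max_int_key(T1), max_int_key(T2))
```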

Re: Possible bug relating to table sizes

Lorenzo Donati-2
On 29/10/2011 12.07, Andrew Scott wrote:
> I was working on some code that involved copying indexed tables, and
> noticed that copying an indexed table T1 that was missing a [1] key (so
> that #T1 == 0) yielded a table T2 that would return the length as if the
> [1] key were in T2. Below is the simplest code that I could get to
> reproduce this. Removing any of the keys 2, 3, or 4 from T1 makes both
> tables return a size of 0, but adding additional keys after 4 to T1
> simply causes T2 to reflect the new highest key.
>

[...]

> I don’t know why any of this is happening, but it seems extremely buggy
> to me. Why are my tables that are missing a [1] key returning a size?
> ipairs() will not traverse them, but (for i=1, #table) will? This should
> not be.

The length operator '#' returns the intuitively correct value only for
tables which are "arrays" ("sequences", in the new Lua 5.2 parlance),
i.e. tables whose numeric keys begin at 1 and end at some other
integer N. If there are "holes" or non-integer numeric keys, '#' returns
weird values, although its behaviour is well-defined in the manual [1]
(this behaviour was criticized and led the Lua Team to remove the
definition of '#' for non-sequences in 5.2 and clarify the concept of
"array", now formally called "sequence" in the manual [2]).

The same applies to ipairs (it will traverse an array only if *it is* an
"array") and almost all other table library functions.

[1] http://www.lua.org/manual/5.1/manual.html#2.5.5
[2] http://www.lua.org/work/doc/manual.html#3.4.6
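
The rule in [1] can be checked directly. The sketch below assumes the 5.1 wording: '#t' may be any n for which t[n] is not nil and t[n+1] is nil, and n may also be 0 when t[1] is nil. The helpers is_border and count_ipairs are hypothetical names made up for this illustration. For a table with keys 2..4 both 0 and 4 satisfy the rule, which is why either answer is permitted, while ipairs stops at the missing [1] regardless.

```lua
-- Hypothetical helper: does n satisfy the 5.1 manual's condition for #t?
local function is_border(t, n)
    if n == 0 then
        return t[1] == nil
    end
    return t[n] ~= nil and t[n + 1] == nil
end

-- Hypothetical helper: number of entries ipairs actually visits.
local function count_ipairs(t)
    local n = 0
    for _ in ipairs(t) do
        n = n + 1
    end
    return n
end

local seq = { "a", "b", "c" }   -- keys 1..3, no holes: a proper sequence

local sparse = {}               -- keys 2..4, no [1] key
for i = 2, 4 do
    sparse[i] = true
end

-- For seq there is exactly one valid answer; for sparse there are two.
print(#seq, is_border(seq, #seq), count_ipairs(seq))
print(#sparse, is_border(sparse, #sparse), count_ipairs(sparse))
```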



Re: Possible bug relating to table sizes

Dirk Laurie-2
2011/10/31 Lorenzo Donati <[hidden email]>:

> The length operator '#' returns the intuitively-correct value only for
> tables which are "arrays" ("sequences", in the new Lua 5.2 parlance), i.e.
> tables whose numeric keys begins at 1 and ends at some other integer N. If
> there are "holes" or non integer numeric keys, '#' returns weird values,
> …
> The same applies to ipairs (it will traverse an array only if *it is* an
> "array") and almost all other table library functions.
>

5.2 beta allows you to supply an __ipairs metamethod.  Overriding the
__len metamethod has no effect on the standard ipairs.

Dirk
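
The second point can be illustrated with a small sketch. The __len function here is an arbitrary override chosen for the example. What '#t' prints depends on the Lua version, since 5.1 does not consult __len for tables, so the only thing the sketch relies on is that the standard ipairs keeps walking until it reaches the first nil.

```lua
-- A three-element sequence whose metatable claims a length of 1.
local t = setmetatable({ 10, 20, 30 }, {
    __len = function() return 1 end,  -- arbitrary override for illustration
})

-- Count how many entries the standard ipairs visits.
local visited = 0
for _ in ipairs(t) do
    visited = visited + 1
end

-- '#t' may or may not honour __len depending on the version;
-- ipairs visits all three entries either way.
print(#t, visited)
```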