Lua 5.0 wasn't that bad, actually

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Lua 5.0 wasn't that bad, actually

Dirk Laurie-2
>From Lua 4.0 onwards, every version of Lua has had:

* a notion of table length
* some functions that operate on tables as lists, arrays, sequences,
call them what you will, including 'ipairs'

and those have changed every time. Yet another change is in the
offing, if not in 5.4, then surely in 6.0.

The problems that Roberto are trying to solve [1] are these:

> 1) A constructor like {x, y, z} should always create a sequence with
> three elements. A constructor like {...} should always get all arguments
> passed to a function. A constructor like {f(x)} should always get
> all results returned by the function. '#' should always work on these
> tables correctly.
>
> 2) A statement like 't[#t + 1] = x' should always add one more element
> at the end of a sequence.
>
> These are what confuse people all the time, these are what start new
> rounds of discussions around '#'.
>
> Once 'empty' and 'undef' are values, you break all those properties.
>
> I think we cannot use a value to represent the absence of a value, no
> matter how many no-value values we add. It is a contradiction. Several
> languages followed this path, always with bad results.

Ironically, the first of these problems [2] would be much easier to solve
if the current version were Lua 5.0 than 5.3. From Lua 5.1 onwards,
the dreaded border function was introduced, and the "endless rounds
of discussion" ensued.

Lua 5.0 had a function table.setn, described in the manual as follows:

> Updates the size of a table. If the table has a field "n" with
> a numerical value, that value is changed to the given n.
> Otherwise, it updates an internal state so that subsequent
> calls to table.getn(table) return n.

That internal state was set to the size of the table as initially created,
and automatically updated whenever one used table.insert  or
table.remove. It was not updated by assigning a value to tbl[n+1]
or by assigning nil to tbl[n].

Unfortunately, the initial value was not taken from the length of the
table constructor. Nils at the end of the table were elided.

But that small change, of setting the internal state to the actual size
of the constructor, would be all that is necessary to solve all three
parts of the first of the stated problems, without breaking either
of the two characteristic properties of Lua which have persisted from
Lua 1.0 right through to Lua 5.4 work2:

* "Semantically, there is no difference between a field not present
in a table or a field with value nil."

* "Tables are the sole data structuring mechanism in Lua".

[1] Quoted from Roberto's post
http://lua-users.org/lists/lua-l/2018-03/msg00239.html

[2] For which purpose table.insert(t,x) and table.remove(t) are
specifically designed.

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Tim Hill


> On Jun 26, 2018, at 6:00 AM, Dirk Laurie <[hidden email]> wrote:
>
>
> That internal state was set to the size of the table as initially created,
> and automatically updated whenever one used table.insert  or
> table.remove. It was not updated by assigning a value to tbl[n+1]
> or by assigning nil to tbl[n].
>
> Unfortunately, the initial value was not taken from the length of the
> table constructor. Nils at the end of the table were elided.
>
> But that small change, of setting the internal state to the actual size
> of the constructor, would be all that is necessary to solve all three
> parts of the first of the stated problems, without breaking either
> of the two characteristic properties of Lua which have persisted from
> Lua 1.0 right through to Lua 5.4 work2:
>
> * "Semantically, there is no difference between a field not present
> in a table or a field with value nil."
>
> * "Tables are the sole data structuring mechanism in Lua".
>
> [1] Quoted from Roberto's post
> http://lua-users.org/lists/lua-l/2018-03/msg00239.html
>
> [2] For which purpose table.insert(t,x) and table.remove(t) are
> specifically designed.
>

This has always been my feeling .. that there is a semantic difference between a “sequence” in which the length is dynamically computed (and is the root cause of all those long long threads) and an “array” (call it what you want) that has a *declared* size very much like the setn/getn model. None of the suggestions to date seem to address this difference completely (including mine), and to be honest I’m not very keen on the “undef” suggestion for 5.4 either .. which (Roberto will shoot me for saying this) still sounds like the all-hated “value that isnt a value”.

I think the only way out of this mess is to leave sequences, the # operator and ipairs() etc alone. They are what they are, and should stay unchanged to avoid breaking existing code. Then make a couple of additions (not changes) to the language/library:

— Add an internal “array length” value to each table (very much like the old getn/setn). And no, in most cases this will NOT add memory overhead given the granularity of most help allocators these days.

— Alter the language so that table constructors set this new array length, correctly handling cases such as {…} with nils (that is, including them), so the length of {nil, nil} is 2. There is a slight edge case here with functions that return multiple values, “{foo()}", but I see this as analogous to {…}.

— Add a new “array” library that has getn/setn as well as equivalents of insert/append/delete etc that respect the array length. Depending on mood, “apairs()” could be added as well, though I suspect this would be controversial.

Far from ideal, but at least it doesnt break anything. Of course, it bloats the library up, but this is easily addressed using a #define to exclude it.


—Tim
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Philipp Janda
Am 27.06.2018 um 10:08 schröbte Tim Hill:
>
> — Add an internal “array length” value to each table (very much like the old getn/setn). And no, in most cases this will NOT add memory overhead given the granularity of most help allocators these days.

I'd say the expected memory overhead for an additional 32bit array
length field is 4 bytes. It may fit into the memory block that's
allocated anyway (due to granularity), or it may just overflow the block
and cause a larger memory overhead (also due to granularity). Without
knowing the current table size and the granularity of your memory
allocator that's all you can say.

Now, the current size of an empty table is 56 bytes in Lua 5.3 on amd64.
That's a multiple of 8 but not 16, so if the memory allocator's
granularity is 16, then a 4 byte (or even 8 byte) array length field is
for free. If the granularity is 8 bytes, the array length field will
cost 8 bytes.

The dlmalloc allocator we talked about recently has an 8-byte granularity.

>
>
> —Tim
>

Philipp


p.s.: An extra array length field might still be a good option despite
the potential memory overhead.


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Tim Hill


> On Jun 27, 2018, at 10:58 AM, Philipp Janda <[hidden email]> wrote:
>
> Am 27.06.2018 um 10:08 schröbte Tim Hill:
>> — Add an internal “array length” value to each table (very much like the old getn/setn). And no, in most cases this will NOT add memory overhead given the granularity of most help allocators these days.
>
> I'd say the expected memory overhead for an additional 32bit array length field is 4 bytes. It may fit into the memory block that's allocated anyway (due to granularity), or it may just overflow the block and cause a larger memory overhead (also due to granularity). Without knowing the current table size and the granularity of your memory allocator that's all you can say.
>
> Now, the current size of an empty table is 56 bytes in Lua 5.3 on amd64. That's a multiple of 8 but not 16, so if the memory allocator's granularity is 16, then a 4 byte (or even 8 byte) array length field is for free. If the granularity is 8 bytes, the array length field will cost 8 bytes.
>

Understood. On macOS, all 64-bit Linux, and Windows, malloc() currently has 16-byte granularity (the last time I checked, about 18 months ago). Of course, your mileage will vary and in embedded systems (where memory is often at a premium) I would expect a finer granularity. But in such cases I would expect a “reduced” version of Lua anyway (e.g. jettison some libraries).

It’s interesting though, as Dirk notes, that tables USED to have this value in 5.0 and it was later dropped, so in some ways it’s a wash. Adding overhead to tables is to be avoided of course, especially when you have lots of little tables, but that is something of an anti-pattern anyway (as noted in PiL).

—Tim



Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Roberto Ierusalimschy
> It’s interesting though, as Dirk notes, that tables USED to have this value in 5.0 and it was later dropped, so in some ways it’s a wash.

Tables in Lua never had this value. Lua 5.0 used either the field 'n' (if
present) or an auxiliary weak table to keep the size of arrays.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Dirk Laurie-2
2018-06-28 13:52 GMT+02:00 Roberto Ierusalimschy <[hidden email]>:
>> It’s interesting though, as Dirk notes, that tables USED to have this value in 5.0 and it was later dropped, so in some ways it’s a wash.
>
> Tables in Lua never had this value. Lua 5.0 used either the field 'n' (if
> present) or an auxiliary weak table to keep the size of arrays.

Well, now. Roberto, as always, is careful not to give the name of the
poster, but since the quotation itself mentions me, people may easily
think that I said what the quotation claims I did.

I plead not guitly. I took care to stick the phrase "internal state"
used in the Lua 5.0 manual.

Likewise, Philipp discussed "the expected memory overhead for an
additional 32bit array length field" without actually asserting that
Lua 5.0 had that.

Who deserves the credit, who deserves the blame?
A Poster That Remains For Now "Anonymous" is his name!

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Tim Hill


> On Jul 2, 2018, at 10:37 PM, Dirk Laurie <[hidden email]> wrote:
>
> 2018-06-28 13:52 GMT+02:00 Roberto Ierusalimschy <[hidden email]>:
>>> It’s interesting though, as Dirk notes, that tables USED to have this value in 5.0 and it was later dropped, so in some ways it’s a wash.
>>
>> Tables in Lua never had this value. Lua 5.0 used either the field 'n' (if
>> present) or an auxiliary weak table to keep the size of arrays.
>
> Well, now. Roberto, as always, is careful not to give the name of the
> poster, but since the quotation itself mentions me, people may easily
> think that I said what the quotation claims I did.
>
> I plead not guitly. I took care to stick the phrase "internal state"
> used in the Lua 5.0 manual.
>
> Likewise, Philipp discussed "the expected memory overhead for an
> additional 32bit array length field" without actually asserting that
> Lua 5.0 had that.
>
> Who deserves the credit, who deserves the blame?
> A Poster That Remains For Now "Anonymous" is his name!
>

(Shrugs) I was slightly mistaken and Roberto politely corrected me, satisfied now?

I stand by my original post regarding tables/sequences/arrays. The post above seems to add nothing substantial to the discussion.

—Tim


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.0 wasn't that bad, actually

Dirk Laurie-2
2018-07-04 0:40 GMT+02:00 Tim Hill <[hidden email]>:

>
>
>> On Jul 2, 2018, at 10:37 PM, Dirk Laurie <[hidden email]> wrote:
>>
>> 2018-06-28 13:52 GMT+02:00 Roberto Ierusalimschy <[hidden email]>:
>>>> It’s interesting though, as Dirk notes, that tables USED to have this value in 5.0 and it was later dropped, so in some ways it’s a wash.
>>>
>>> Tables in Lua never had this value. Lua 5.0 used either the field 'n' (if
>>> present) or an auxiliary weak table to keep the size of arrays.
>>
>> Well, now. Roberto, as always, is careful not to give the name of the
>> poster, but since the quotation itself mentions me, people may easily
>> think that I said what the quotation claims I did.
>>
>> I plead not guitly. I took care to stick the phrase "internal state"
>> used in the Lua 5.0 manual.
>>
>> Likewise, Philipp discussed "the expected memory overhead for an
>> additional 32bit array length field" without actually asserting that
>> Lua 5.0 had that.
>>
>> Who deserves the credit, who deserves the blame?
>> A Poster That Remains For Now "Anonymous" is his name!
>>
>
> (Shrugs) I was slightly mistaken and Roberto politely corrected me, satisfied now?

:-)

> I stand by my original post regarding tables/sequences/arrays. The post above seems to add nothing substantial to the discussion.

I absolutely agree with that post. I would in fact go further and add
the concept "list" to the list of things that appear to be served by
the table library but differs semantically from the others.

-- Dirk