Thoughts on {...} and tbl[nil]

Dirk Laurie-2
At present a table literal can give you a haystack in which a needle
can't be found. I'm thinking mainly of {...} but to make things clearer
the example is an ordinary table literal.

    x = {nil,nil,nil,1,nil,nil}
    for k in ipairs(x) do print(k) end  -- nothing printed
    for k=1,#x do print(k) end  -- nothing printed

You are forced to use 'pairs' and check for the type of the index.
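
For reference, the 'pairs' workaround might look like the sketch below. The helper name max_index is made up for illustration. Note that it recovers only the position of the last non-nil value (4 here), not the six slots written in the literal, which is exactly the information that gets lost.

```lua
-- Hypothetical helper: largest positive integer key holding a non-nil value.
local function max_index(t)
  local max = 0
  for k in pairs(t) do
    -- accept only positive integer-valued numeric keys
    if type(k) == "number" and k >= 1 and k % 1 == 0 and k > max then
      max = k
    end
  end
  return max
end

local x = {nil, nil, nil, 1, nil, nil}
print(max_index(x))  --> 4
```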

The now-shelved NIL_IN_TABLE would have solved this, but
we will have to wait for Lua 6.0.

A change that might be possible in a minor release is to make
`x[nil]` mean "the length of the table literal from which `x` was
constructed, if still known".

    for k=1,x[nil] do print(k) end  --> 1 2 3 4 5 6

It remains illegal to assign to x[nil].

The implementation would require one currently unused bit in the
table structure.
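
For the {...} case specifically, the argument count can already be captured today with select("#", ...) or with table.pack (Lua 5.2 and later), which stores it in the field n. A minimal sketch of that existing workaround, not of the proposal itself; the function name collect is hypothetical:

```lua
-- table.pack exists in Lua 5.2+; the fallback covers 5.1.
local pack = table.pack or function(...)
  return { n = select("#", ...), ... }
end

-- Hypothetical helper: returns its arguments as a table that remembers
-- how many were passed, nils included.
local function collect(...)
  return pack(...)
end

local x = collect(nil, nil, nil, 1, nil, nil)
for k = 1, x.n do print(k, x[k]) end   -- visits all six slots

-- Reading x[nil] yields nil today; assigning to it raises an error.
local ok = pcall(function() x[nil] = 1 end)
print(x[nil], ok)   --> nil   false
```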


Re: Thoughts on {...} and tbl[nil]

Rodrigo Azevedo
2018-06-02 3:06 GMT-03:00 Dirk Laurie <[hidden email]>:
> At present a table literal can give you a haystack in which a needle
> can't be found. I'm thinking mainly of {...} but to make things clearer
> the example is an ordinary table literal.
>
>     x = {nil,nil,nil,1,nil,nil}
>     for k in ipairs(x) do print(k) end  -- nothing printed
>     for k=1,#x do print(k) end  -- nothing printed
>
> You are forced to use 'pairs' and check for the type of the index.
>
> The now-shelved NIL_IN_TABLE would have solved this, but
> we will have to wait for Lua 6.0.

 
NIL_IN_TABLE is good only for 'array' types, which is an absolute
reduction of the current amazing versatility of Lua tables.
 
> A change that might be possible in a minor release is to make
> `x[nil]` mean "the length of the table literal from which `x` was
> constructed, if still known".
>
>     for k=1,x[nil] do print(k) end  --> 1 2 3 4 5 6
>
> It remains illegal to assign to x[nil].
>
> The implementation would require one currently unused bit in the
> table structure.


My view of this problem:

This is precisely the redefinition (and memoization) of a border, which I have
been claiming for a while is the right way to go.

What's necessary is a way to construct a table such that the border is
defined/memoized as the

biggest non-zero positive integer key assigned

(__newindex), maybe with a 'nil' value, something like a constructor of the type

-- e.g.
t = {@ ... @}     -- @rray semantics!

and that's all. This table will keep the information (e.g. at the registry) about
its border and consequently its rawlen. This is simply a table subtype
that could be checked by (e.g.) a table.type() function if (seldom) necessary.

Thus,
1) 'numeric for' will work as expected for arrays.
2) 'pairs()' will iterate only through non-nil values, as expected.
3) 'sequences' also don't change.

The table also remains memory efficient, because only non-nil values are stored.
Moreover,

t[#t+1] = nil

also works as expected for arrays, because #t+1 is the biggest key (border) assigned
(__newindex).

The only problem I can see is the 'reduction' of the border. A simple setborder() function
can be used (e.g.)

table.setborder(t,#t-1)

but I think this will rarely be needed for real-world 'array' purposes (shrinking arrays? who? why?)

I think that this solves the problems Roberto raised concerning the semantics of array construction
in an earlier message to this list, at a minimal cost that also keeps compatibility
with the current versions of Lua. The implementation also seems to be straightforward.
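
To make the idea concrete, here is a rough emulation in plain Lua using a proxy table. It is only a sketch under assumptions: new_array stands in for the proposed {@ ... @} constructor, the returned setter stands in for table.setborder, and it relies on Lua 5.2 or later, where # honours __len on tables. A built-in version would not need the proxy.

```lua
-- Hypothetical constructor; stands in for the proposed {@ ... @} syntax.
-- Returns the array and a setter for shrinking the border.
local function new_array(...)
  local store = {}                  -- only non-nil values end up stored here
  local border = select("#", ...)   -- largest positive integer key assigned
  for i = 1, border do
    store[i] = (select(i, ...))
  end

  local function is_index(k)
    return type(k) == "number" and k >= 1 and k % 1 == 0
  end

  local array = setmetatable({}, {
    __index = store,
    -- The proxy itself stays empty, so every assignment comes through here.
    __newindex = function(_, k, v)
      if is_index(k) and k > border then border = k end
      store[k] = v
    end,
    __len = function() return border end,
    __pairs = function() return next, store, nil end,
  })
  return array, function(n) border = n end
end

local t, setborder = new_array(1, nil, 3)
t[#t + 1] = nil        -- grows the border even though the value is nil
print(#t)              --> 4
setborder(#t - 1)
print(#t)              --> 3
```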

Does that make sense?

Thanks,
Rodrigo Azevedo Moreira da Silva

Re: Thoughts on {...} and tbl[nil]

dyngeccetor8
In reply to this post by Dirk Laurie-2
On 06/02/2018 09:06 AM, Dirk Laurie wrote:
> A change that might be possible in a minor release is to make
> `x[nil]` mean "the length of the table literal from which `x` was
> constructed, if still known".

I appreciate the idea of being able to get the "syntactical" length of
a table literal.

I just don't like the (<name> "[" "nil" "]") syntax.

It's not obvious. The literal "nil" contradicts the common-sense view (in my opinion)
that "v = t[nil]" is equivalent to "k = nil; v = t[k]". It exploits
a value (nil) that cannot be used as a key.

I understand that another approach, adding a global metatable with border
management methods to every table, will introduce another set of problems,
like name clashes and some performance drop due to overhead. But I'd
prefer it anyway.

-- Martin


Re: Thoughts on {...} and tbl[nil]

Andrew Gierth
In reply to this post by Rodrigo Azevedo
>>>>> "Rodrigo" == Rodrigo Azevedo <[hidden email]> writes:

 Rodrigo> NIL_IN_TABLE is good only for 'arrays' types,

This is absolutely not true; I've been finding the lack of nils in
tables _very_ frustrating, and none of my use-cases have been anything
to do with arrays as opposed to more general tables.

--
Andrew.


Re: Thoughts on {...} and tbl[nil]

Roberto Ierusalimschy
In reply to this post by Dirk Laurie-2
> A change that might be possible in a minor release is to make
> `x[nil]` mean "the length of the table literal from which `x` was
> constructed, if still known".

With the "if still known" as part of the specification, Lua already
does that:

  a = {1, 2, 3}
  print(a[nil])  --> nil   (meaning, "original length not known anymore" :-)

That satisfies your specification, does it not?

-- Roberto


Re: Thoughts on {...} and tbl[nil]

Dirk Laurie-2
2018-06-04 15:17 GMT+02:00 Roberto Ierusalimschy <[hidden email]>:

>> A change that might be possible in a minor release is to make
>> `x[nil]` mean "the length of the table literal from which `x` was
>> constructed, if still known".
>
> With the "if still known" as part of the specification, Lua already
> does that:
>
>   a = {1, 2, 3}
>   print(a[nil])  --> nil   (meaning, "original length not known anymore" :-)
>
> That satisfies your specification, does it not?

In other words, the specification is already a valid model for what Lua does :-)

The definition of "known" is a mere implementation detail.


Re: Thoughts on {...} and tbl[nil]

nobody
In reply to this post by Andrew Gierth
On 2018-06-02 19:46, Andrew Gierth wrote:
>>>>>> "Rodrigo" == Rodrigo Azevedo <[hidden email]> writes:
>
>   Rodrigo> NIL_IN_TABLE is good only for 'arrays' types,
>
> This is absolutely not true; I've been finding the lack of nils in
> tables _very_ frustrating, and none of my use-cases have been anything
> to do with arrays as opposed to more general tables.

Can you (abstractly, briefly) describe a few of them?

(For me, `nil` is a _symbol_ for "there's nothing here", so the approach
of actually storing nils seems backwards… and so I'm looking for other
solutions.  Having a bunch of test problems would be handy.  I scanned
through the relevant threads I recalled and didn't find any good ones.)

Regards,
nobody


Re: Thoughts on {...} and tbl[nil]

Vaughan McAlley-2
On Tue, 5 Jun 2018, 07:27 nobody, <[hidden email]> wrote:
> On 2018-06-02 19:46, Andrew Gierth wrote:
> >>>>>> "Rodrigo" == Rodrigo Azevedo <[hidden email]> writes:
> >
> >   Rodrigo> NIL_IN_TABLE is good only for 'arrays' types,
> >
> > This is absolutely not true; I've been finding the lack of nils in
> > tables _very_ frustrating, and none of my use-cases have been anything
> > to do with arrays as opposed to more general tables.
>
> Can you (abstractly, briefly) describe a few of them?
>
> (For me, `nil` is a _symbol_ for "there's nothing here", so the approach
> of actually storing nils seems backwards… and so I'm looking for other
> solutions.  Having a bunch of test problems would be handy.  I scanned
> through the relevant threads I recalled and didn't find any good ones.)
>
> Regards,
> nobody


I like to think that t={} creates a table with every possible numeric, string, boolean etc key value pre-filled with nils. pairs() just ignores nil values.

Vaughan


Re: Thoughts on {...} and tbl[nil]

Egor Skriptunoff-2
In reply to this post by Rodrigo Azevedo
I think Rodrigo is correct.
We don't really need NIL_IN_DICTIONARY, we only need NIL_IN_ARRAY.
And the "array border" (or "high water mark") is the right word to solve the problem.

Here is one more (very similar) suggestion for 5.4 which doesn't require new syntax.
Every Lua table must store an 8-byte integer "high water mark" (biggest non-zero positive integer key assigned).
The user can read it with the function table.hwm(t)

local t = {nil, nil}
assert(table.hwm(t) == 2)
t[math.pi] = 42
assert(table.hwm(t) == 2)
t[5] = 42
assert(table.hwm(t) == 5)
t[5] = nil
assert(table.hwm(t) == 5)
t[100] = nil
assert(table.hwm(t) == 100)

As you see, the logic of hwm(t) is completely independent of #t.
We have both traditional length #t and array-ish length hwm(t) simultaneously,
that solves our problem with nils in arrays.

How arrays with nils should be traversed in 5.4:
for k = 1, table.hwm(array) do ... end

Pros:
1) No broken code.  All 5.3 code will run in 5.4 unmodified.
2) No new syntax, so all 5.4 code will have correct syntax from 5.3 point of view.
3) It is possible to write code which runs on both 5.3 and 5.4,
but only for nil-less arrays which were not shrunk.
Just insert the following line at the beginning:
table.hwm = table.hwm or function (t) return #t end
4) pairs() will iterate only through non-nil values, as expected.
5) Only non-nil values are stored in memory.

Cons:
1) hwm(t) is more verbose than #t
2) As before, #t is nonsensical for arrays with nils,
and t[#t+1] doesn't work for arrays with nils.  You should write t[hwm(t)+1]
3) No new array type, so arrays don't have a specific metatable assigned.
4) 8 more bytes of storage for every table.  (How many bytes does an empty table take?)

Questions:
1) Is table.sethwm(t, newhwm) needed?
2) Should ipairs() use table.hwm() internally?  Is new function hpairs() needed?
3) Is __hwm metamethod needed?
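
To illustrate question 2, an hpairs() iterator could be sketched in user code as below. This is an assumption-laden stand-in: table.hwm does not exist in Lua 5.3, so the fallback reads an explicit n field (a convention borrowed from table.pack) before falling back to #t, and the name hpairs is only the one floated above.

```lua
-- Stand-in for the proposed table.hwm, which does not exist in Lua 5.3.
-- Assumption: fall back to an explicit `n` field, then to #t.
local hwm = table.hwm or function(t) return t.n or #t end

-- Hypothetical hpairs: visits every index from 1 to hwm(t), nils included.
local function hpairs(t)
  local n, i = hwm(t), 0
  return function()
    i = i + 1
    if i <= n then return i, t[i] end
  end
end

local array = { n = 5, "a", nil, "c" }
for i, v in hpairs(array) do print(i, v) end   -- five iterations
```

Unlike ipairs, the loop does not stop at the first nil value, because the generic for only ends when the index itself comes back nil.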

-- Egor

Re: Thoughts on {...} and tbl[nil]

Sam Putman


On Wed, Jun 6, 2018 at 10:34 PM, Egor Skriptunoff <[hidden email]> wrote:
> I think Rodrigo is correct.
> We don't really need NIL_IN_DICTIONARY, we only need NIL_IN_ARRAY.
> And the "array border" (or "high water mark") is the right word to solve the problem.

The idiom I would like to see makes #tab a valid L-value.

Then we could say:

tab = {1,2,3, [50] = 50}

#tab = 50

and ipairs would yield 1,1 ; 2,2 ; 3,3 ; 50,50.

This would be more attractive than the `tab.n` idiom we find ourselves settling on.

This new over-ride of #t should behave like the old #t, in that

tab[#tab + 1] = 51 -- #tab is now 51

tab[99] = 99 -- #tab is still 51
#tab = 99     -- and so on 

Add a verb to table to do this in one call:

table.extend(tab, 99, "99")  -- String is the rvalue

I don't get tripped up on sparse tables often; this would solve all the cases I've personally run into.


Re: Thoughts on {...} and tbl[nil]

pocomane
In reply to this post by Egor Skriptunoff-2
On Thu, Jun 7, 2018 at 7:34 AM Egor Skriptunoff
<[hidden email]> wrote:
> Every Lua table must store 8-byte integer "high water mark" (biggest non-zero positive integer key assigned).
> User can read it with function table.hwm(t)

I also think this, or something like it, should be the way to go.
For me this could become the final Lua solution. A somewhat similar
proposal was made in
http://lua-users.org/lists/lua-l/2018-03/msg00395.html

I do not want to discuss the details of the syntax or
implementation here; however, I think:

- Con #3 is, for me, a pro (no new type / metamethod).
- A smart implementation can avoid the fixed 8-byte cost (e.g. keep it
associated with a special key of the hash part?). Yes, I know, smart
hacks should be avoided...
- A table operator is better than a function (i.e. table.hwm); we
would end up with two table ops, e.g. % and #, one for the max index and one for
the border (which I still find very useful).

I would really like to hear from the Lua team why this kind of solution
was discarded (I am quite sure they already examined it).


Re: Thoughts on {...} and tbl[nil]

Sam Putman


On Thu, Jun 7, 2018 at 12:49 AM, pocomane <[hidden email]> wrote:
> On Thu, Jun 7, 2018 at 7:34 AM Egor Skriptunoff
> <[hidden email]> wrote:
> > Every Lua table must store 8-byte integer "high water mark" (biggest non-zero positive integer key assigned).
> > User can read it with function table.hwm(t)
>
> I also think this, or something like this, should be the way to go.
> For me this could become the final lua solution. Somehow similar
> proposal was in http://lua-users.org/lists/lua-l/2018-03/msg00395.html
> .


Glad someone else has had the same notion!

I don't care for this

#tab = nil 

Resetting # to the default behavior.  That should actually set the return value to nil.

It's already a little odd[*] having something be an assignment on the left side and 
effectively a function call on the right.  Having the implementation silently change the
assignment to something else would be surprising. 

[*] Not so odd in Lua, where we can use __newindex to do this as well.
 

> I really would like to hear by the lua team why this kind of solution
> was discarded (I am quite sure they already examined it).


I'm also interested.

This seems like the main con:

#tab = "ahoy matey!" 

Should this line be a runtime error? Or should it propagate until the user sees
(number expected, got string)?

I say the latter, but it seems to me that there are runtime implications of this question
which might interfere with being efficient in the general, already-supported case.

Re: Thoughts on {...} and tbl[nil]

dyngeccetor8
In reply to this post by Egor Skriptunoff-2
On 06/07/2018 08:34 AM, Egor Skriptunoff wrote:

> I think Rodrigo is correct.
> We don't really need NIL_IN_DICTIONARY, we only need NIL_IN_ARRAY.
> And the "array border" (or "high water mark") is the right word to solve
> the problem.
>
> Here is one more (very similar) suggestion for 5.4 which don't require new
> syntax.
> Every Lua table must store 8-byte integer "high water mark" (biggest
> non-zero positive integer key assigned).
> User can read it with function table.hwm(t)
>
> [...]
>
> As you see, the logic of hwm(t) is completely independent of #t.
> We have both traditional length #t and array-ish length hwm(t)
> simultaneously,
> that solves our problem with nils in arrays.
>
> How arrays with nils should be traversed in 5.4:
> for k = 1, table.hwm(array) do ... end
>
> [...]
>
> -- Egor

Wow, I like this proposal!

I imagine this as a __newindex hook that sets a hidden "hwm" field to the maximum
of the current "hwm" and the integer key value. Small overhead, almost free.

> Questions:
> 1) Is table.sethwm(t, newhwm) needed?
Yes. A water mark stored this way should not be immutable.

> 2) Should ipairs() use table.hwm() internally?  Is new function hpairs()
> needed?
I think not, or it'll break existing Lua code. Introduce "hpairs()".

> 3) Is __hwm metamethod needed?
Probably yes. What parameters would it have?

-- Martin


Re: Thoughts on {...} and tbl[nil]

Gregg Reynolds-2
In reply to this post by Egor Skriptunoff-2


On Thu, Jun 7, 2018, 12:34 AM Egor Skriptunoff <[hidden email]> wrote:

I'm sure I'm not the first to point this out, but "Egor Skriptunoff" is just about the coolest programmer name I've ever seen. Right up there with "Axel Kittenburger". Please tell me it's real!

Re: Thoughts on {...} and tbl[nil]

michaelflad
In reply to this post by Egor Skriptunoff-2
> From: [hidden email] <[hidden email]> On Behalf Of Egor Skriptunoff
> Sent: Thursday, 7 June 2018 07:35
> To: Lua mailing list <[hidden email]>
> Subject: Re: Thoughts on {...} and tbl[nil]
>
> I think Rodrigo is correct.
> We don't really need NIL_IN_DICTIONARY, we only need NIL_IN_ARRAY.
> And the "array border" (or "high water mark") is the right word to solve the problem.
>
> Here is one more (very similar) suggestion for 5.4 which don't require new syntax.
> Every Lua table must store 8-byte integer "high water mark" (biggest non-zero positive integer key assigned).
> User can read it with function table.hwm(t)
>
> local t = {nil, nil}
> assert(table.hwm(t) == 2)
> t[math.pi] = 42
> assert(table.hwm(t) == 2)
> t[5] = 42
> assert(table.hwm(t) == 5)
> t[5] = nil
> assert(table.hwm(t) == 5)
> t[100] = nil
> assert(table.hwm(t) == 100)
>
> As you see, the logic of hwm(t) is completely independent of #t.
> We have both traditional length #t and array-ish length hwm(t) simultaneously,
> that solves our problem with nils in arrays.
>
> How arrays with nils should be traversed in 5.4:
> for k = 1, table.hwm(array) do ... end

I'd suggest making setting the hwm a more explicit operation, because otherwise
you're missing out on a (IMO) crucial optimization, i.e. setting up a fixed-size
array part of the table.

You can't automatically create a fitting linear array part on every assignment
to a larger positive integer key, as there are certainly many tables with very few
entries but at least one huge integer key.

Also, without this optimization, iterating over such arrays will probably become
by far the slowest way to iterate over a Lua table, as you need a hashed lookup for
each operation.

If you don't make it an explicit operation, you also miss out on the opportunity to
allocate the array part at exactly the required size,
enabling a much better memory usage pattern in code that uses lots of arrays
and has enough knowledge to actually know the required sizes (which should be
a lot of code, and even more devs will think about their data structures a bit
more if there's some real memory/performance to be gained).

Without enabling these additional optimizations somehow, all you'd get is the
advantage of not having to store the current hwm value yourself, and you're left
with a regular hashed table and an iteration over a defined size that does a hash
lookup for every single iteration, whether there is an entry at the current index
or not. You're even losing the existing optimization of ipairs or simple manual
integer iteration.



Re: Thoughts on {...} and tbl[nil]

Dirk Laurie-2
In reply to this post by Gregg Reynolds-2
2018-06-08 2:10 GMT+02:00 Gregg Reynolds <[hidden email]>:
>
> On Thu, Jun 7, 2018, 12:34 AM Egor Skriptunoff <[hidden email]>
> wrote:
>>

> I'm sure I'm not the first to point this out, but 'Egor Skriptunoff" is just
> about the coolest programmer name I've ever seen. Right up there with "Axel
> Kittenburger". Please tell me it's real!

It's real.


Re: Thoughts on {...} and tbl[nil]

Rodrigo Azevedo
In reply to this post by dyngeccetor8
On 06/07/2018 08:34 AM, Egor Skriptunoff wrote:

>
> > I think Rodrigo is correct.
> > We don't really need NIL_IN_DICTIONARY, we only need NIL_IN_ARRAY.
> > And the "array border" (or "high water mark") is the right word to solve
> > the problem.
> >
> > Here is one more (very similar) suggestion for 5.4 which don't require new
> > syntax.
> > Every Lua table must store 8-byte integer "high water mark" (biggest
> > non-zero positive integer key assigned).
> > User can read it with function table.hwm(t)
> >
> > [...]
> >
> > As you see, the logic of hwm(t) is completely independent of #t.
> > We have both traditional length #t and array-ish length hwm(t)
> > simultaneously,
> > that solves our problem with nils in arrays.
> >
> > How arrays with nils should be traversed in 5.4:
> > for k = 1, table.hwm(array) do ... end
> >
> > [...]
> >
> > -- Egor
>
> Wow, I like this proposal!
>
> I imagine this as a __newindex hook that sets hidden "hwm" field to a maximum
> of current "hwm" and integer key value. Small overhead, almost free.
>
> > Questions:
> > 1) Is table.sethwm(t, newhwm) needed?
> Yes. Stored such way water mark should not be immutable.
>
> > 2) Should ipairs() use table.hwm() internally?  Is new function hpairs()
> > needed?
> I think no. Or it'll break existing Lua code. Introduce "hpairs()"
>
> > 3) Is __hwm metamethod needed?
> Probably yes. What parameters it'll have?
>
> -- Martin
>

Let's try an even simpler model: (alternative to a specific table constructor)

Definitions: t is a table

0) a 'sequence' is a continuous set of integer keys with non-nil values
1) #t (rawlen) operator: biggest non-zero positive integer key of the sequence starting from key 1 [1]
2) t# (rawborder) operator: biggest non-zero positive integer key assigned (rawset)

Examples:

t = {1,2,3,4,5,nil,nil,8,nil} -- two sequences
#t is 5
t# is 9

t = {nil,2,3,nil,nil} -- one sequence
#t is 0
t# is 5

Thus, as expected and compatible with current versions

1) a  'proper sequence' is a table where #t == t# is true .
2) pairs() iterates non-nils, as expected.
3) ipairs() iterates integer keys (non-nil values), as expected.
4) table.(un)pack() are now 'border symmetric' [2].
5) table.insert/remove() are only meaningful for sequences starting from 1, which doesn't change.
6) 'numeric for' works as expected.

Remark:
1) tables remain memory/CPU efficient for sparse and dense objects.
2) t[t#+1] = nil, always increases the rawborder, as expected.

The objective problems Roberto is trying to solve [3]:
-------
1) A constructor like {x, y, z} should always create a sequence with
three elements. A constructor like {...} should always get all arguments
passed to a function. A constructor like {f(x)} should always get
all results returned by the function. '#' should always work on these
tables correctly.

2) A statement like 't[#t + 1] = x' should always add one more element
at the end of a sequence.

These are what confuse people all the time, these are what start new
rounds of discussions around '#'.
--------

It's all OK.

If you expect a 'proper sequence' use #t.
If 'nils' are important for you just use t# instead.
If you don't know, check #t == t# (but you usually know, OK?)

It also separates the logic concerning nils_in_arrays from the current behavior, which is good and easy to discern in the code.

I think we can use the border t# to solve these problems and keep full compatibility with current versions at the cost of a new and simple operator.

Everybody happy?

[1] This is compatible with current length heuristics for 'sequences', namely, the only case where #t is useful for tables. It can be improved if we consider only the length of a sequence starting from 1 if it exists.
[2] Avoiding the '.n' wart.
[3] http://lua-users.org/lists/lua-l/2018-03/msg00239.html

--
Rodrigo Azevedo Moreira da Silva

Re: Thoughts on {...} and tbl[nil]

Sean Conner
It was thus said that the Great Rodrigo Azevedo once stated:

>
> Let's try an even simpler model: (alternative to a specific table
> constructor)
>
> Definitions: t is a table
>
> 0) a 'sequence' is a continuous set of integer keys with non-nil values
> 1) #t (rawlen) operator: biggest non-zero positive integer key of the
> sequence starting from key 1 [1]
> 2) t# (rawborder) operator: biggest non-zero positive integer key assigned
> (rawset)

  t    = {}
  t[3] = 3
  t[1] = 1
  t[5] = 5
  t[2] = 2
  t[4] = 4

  #t == 5 -- because this is a sequence
  t# == 5 -- because 5 is the largest non-zero positive integer key

  table.remove(t)

  #t   == 4 -- because this is still a sequence
  t#   == 4 -- because we removed 5
  t[4] == 4 -- because we removed the last element

  table.remove(t,1)

  #t   == 3 -- because this is still a sequence
  t#   == 4 -- because 4 is still the largest non-zero positive integer key
  t[3] == 4 -- because we removed the first element

  t[500] = 500

  #t == 3   -- because this is still a sequence
  t# == 500 -- because 500 is now the largest non-zero positive integer key

  table.remove(t,2)

  #t     == 2   -- because this is still a sequence
  t#     == 500 -- because 500 is still the largest non-zero positive integer key
  t[2]   == 4   -- because we removed an element
  t[500] == 500 -- because this isn't part of the sequence

> Examples:
>
> t = {1,2,3,4,5,nil,nil,8,nil} -- two sequences
> #t is 5
> t# is 9

  table.remove(t)

  Which element was removed?  Where will you find '8'?  Is this an issue?
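
  Under today's rules the answer depends on which border # happens to return, since table.remove(t) with no position removes the element at index #t. The sketch below only prints what the running interpreter does; as far as I can tell both 5 and 8 are valid borders of this table, so no particular output is assumed.

```lua
local t = {1, 2, 3, 4, 5, nil, nil, 8, nil}

local n = #t                      -- some border of t (5 and 8 both qualify)
local removed = table.remove(t)   -- removes the element at index #t

-- Every value equals its own index, so `removed` tells us which slot was hit.
print(n, removed, t[n])
```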

  -spc


Re: Thoughts on {...} and tbl[nil]

Sam Putman


On Fri, Jun 8, 2018 at 12:19 PM, Sean Conner <[hidden email]> wrote:


  Which element was removed?  Where will you find '8'?  Is this an issue?

  -spc


That's the other implication of an assignable #t, which inclines me to think it isn't worth the cost.

t = {1,2,3,4, [10] = 10}; #t = 10

table.remove(t,3) -- does this change #t?

Re: Thoughts on {...} and tbl[nil]

Andrew Starks-2
In reply to this post by nobody


On Mon, Jun 4, 2018 at 4:27 PM, nobody <[hidden email]> wrote:
> On 2018-06-02 19:46, Andrew Gierth wrote:
> > "Rodrigo" == Rodrigo Azevedo <[hidden email]> writes:
> >
> >   Rodrigo> NIL_IN_TABLE is good only for 'arrays' types,
> >
> > This is absolutely not true; I've been finding the lack of nils in
> > tables _very_ frustrating, and none of my use-cases have been anything
> > to do with arrays as opposed to more general tables.
>
> Can you (abstractly, briefly) describe a few of them?
>
> (For me, `nil` is a _symbol_ for "there's nothing here", so the approach of actually storing nils seems backwards… and so I'm looking for other solutions.  Having a bunch of test problems would be handy.  I scanned through the relevant threads I recalled and didn't find any good ones.)
>
> Regards,
> nobody


Lua's use of nil is pretty clear and, taken in isolation, simple and efficient.

 - It represents mu [1]
 - It is false-y (`not not nil == false`) but is not false (`nil ~= false`)
 - It is the default value for all uninitialized variables, unused arguments and unused indexes in tables
 - It is generally how you "remove" values from a table (`t[i] = nil`) [2]
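
These properties can be checked directly in any stock Lua interpreter. A small sketch; the function name second is just for illustration:

```lua
assert((not not nil) == false)   -- nil is false-y...
assert(nil ~= false)             -- ...but it is not false

local t = { a = 1 }
assert(t.missing == nil)         -- unused keys read as nil
t.a = nil                        -- assigning nil removes the entry
assert(next(t) == nil)           -- the table is empty again

local function second(_, b) return b end
assert(second(1) == nil)         -- unused arguments are nil
```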

In my experience, this is sufficient. For example, representing a sequence where some of the values are unknown is easily accomplished in Lua:

 - false
 - a `.n` field to denote length
 - metatables and/or an API to emulate sequences with holes in a way that works for your application
 - a userdata object that represents your sequence in a way that works for your host application
 - a sentinel, such as JSON_NIL or what have you, if that works better for your application
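
Two of these options, sketched with made-up data (the names NIL, readings and row are arbitrary, not from any library):

```lua
-- A unique sentinel value; a fresh table is equal only to itself.
local NIL = setmetatable({}, { __tostring = function() return "NIL" end })

-- With a sentinel (or false) the table is a proper sequence,
-- so # and ipairs behave as usual.
local readings = { 20.5, NIL, 21.0, NIL }
print(#readings)   --> 4
for i, v in ipairs(readings) do
  print(i, v == NIL and "unknown" or v)
end

-- With an explicit count, real nils can stay in place.
local row = { n = 4, 20.5, nil, 21.0, nil }
for i = 1, row.n do print(i, row[i]) end
```

The sentinel keeps the standard library usable at the cost of a comparison wherever a value is read; the `.n` field keeps real nils but requires every consumer to loop up to `n` rather than `#`.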

So, why is there a desire to "store" `nil` in tables in such a way as to make them count towards the length of a sequence? What drives people to "solve" this "problem"?

I am interested in any reasonable answer to this question. My attempts are:

## Interoperability with external languages and libraries

Because other languages can store NIL/NUL/MU in sequences, there are APIs that may/do require some way to represent it. If the programmer is not familiar with Lua, the particular way to bridge this gap is not obvious. 

They may feel like they have to write too much of their own library code or too much meta-programming. They may pre-suppose that their solution will be slow.

How can this problem be solved? Examples in PiL with possible solutions? A white paper on handling sequences/nil in Lua when your API requires it? Does Lua necessarily need to be "fixed" in this case?

## Speed

Lua "sequences" are usually very fast in normal use cases, thanks to VM optimizations. The solutions for making transparent, bulletproof sequences (where length works with holes) inside of Lua involve some combination of indirection, function calls / metatables and the length operator. Speed gains are lost or at least diminished.

One solution is to use a dedicated userdata object. C Programmers can make sequences with their native type and then use Lua's API to provide an interface to this object.

Why isn't this desirable? Too much meta-programming? Not obvious to people unfamiliar with Lua?

## It's not Lua

The thrust of why I wanted to share this was to point out that making Lua more useful often requires the language to change not at all, but for people to understand its design better, to understand the limitations of that design, and to learn how best to use those limitations to their advantage. Or it requires a change to Lua.

If future versions of Lua feature a more complicated concept of mu, only to make it more obvious to people that were not able to see how their project could be accomplished with the simpler way, then that would be sad.



-- 
Andrew Starks
[2] but garbage collection happens on its own and so getting rid of allocated objects should still use an explicit `:close()` method.