Multi-indexed tables

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Multi-indexed tables

Dirk Laurie-2
There are many ways to implement multi-indexed tables in Lua, but
since that is not the main point of the post, I'll just pick one,
namely to wrap the indices into a tuple as described for example in
these articles on the Lua wiki:

http://lua-users.org/wiki/SimpleTuples
http://lua-users.org/wiki/FunctionalTuple

In either case, you code something like tbl[tuple(1,2)] where you
would have liked to code tbl[1,2]. And you can obviously monkey-patch
rawget and rawset to accept more arguments.

Now the real questions:

1. Is there any syntactic reason why one can't patch Lua so that
t[1,2] won't raise a compilation error, but instead invoke
__index(t,1,2) or __newindex(t,1,2,v)?
2. Has anyone done it already?

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Philippe Verdy
In fact it could as well be iumplemented as syntaxic sugar:
  t[1,2] being equivalent to t[1][2] (i.e. using tables of tables)
But what you propose is to extend the keys using table values, i.e.
  t[t.tuple(1,2)]
using a tuple(a,b) function as a member of table t (alternatively as a metatable member where __index would resolve "tuple" to a function creating a table entry in getmetable(t).tuples or computing an integer index from (a,b) if (a,b) are integers whose range is known in (1..N, 1..P), because in that case that tuple(a,b) function would just return (a*P+b) without needing an extra table for the unique keys.

Yes there are multiple ways to do that, it all depends on how indexes are composed to create a unique key: an integer or a table of tuples.

If syntaxic sugger is just  t[1,2] being equivalent to t[1][2], there's no real need of it, it just saves one character in the source (except that it is implied that t[1] will be allocated a new table automatically if it is still nil, so that t[1][2] becomes usable without raising an error (attempt to index a nil object which is not a table).

The alternative would be possible however by accepting t[1,2] if t has a metatable containing a __tuple constructor, returning a single key of any type (it does not matter where the tuple will be stored, either as an table in an table of unique tuples, or as a single integer.

Programmers have the choice and in fact different factores may decide them. So I don't clearly see why Lua would require a single choice.

If there's someting to do in Lua, in my opinion it should allow "t[1][2] = value" so that t[1] will be implicitly assigned a new table if it's still nil (the compiler automatically inserting the test and creation of a new table in t[1] where needed. As this can be costly and repetitive in may cases where tables are used this is where the syntax t[1,2] would be the most productive (but still as syntaxic sugar which can be written in Lua)


Le dim. 9 déc. 2018 à 19:12, Dirk Laurie <[hidden email]> a écrit :
There are many ways to implement multi-indexed tables in Lua, but
since that is not the main point of the post, I'll just pick one,
namely to wrap the indices into a tuple as described for example in
these articles on the Lua wiki:

http://lua-users.org/wiki/SimpleTuples
http://lua-users.org/wiki/FunctionalTuple

In either case, you code something like tbl[tuple(1,2)] where you
would have liked to code tbl[1,2]. And you can obviously monkey-patch
rawget and rawset to accept more arguments.

Now the real questions:

1. Is there any syntactic reason why one can't patch Lua so that
t[1,2] won't raise a compilation error, but instead invoke
__index(t,1,2) or __newindex(t,1,2,v)?
2. Has anyone done it already?

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Dibyendu Majumdar
In reply to this post by Dirk Laurie-2
Hi,

On Sun, 9 Dec 2018 at 18:12, Dirk Laurie <[hidden email]> wrote:
>
> 1. Is there any syntactic reason why one can't patch Lua so that
> t[1,2] won't raise a compilation error, but instead invoke
> __index(t,1,2) or __newindex(t,1,2,v)?

I think there are issues with the way the table op codes work - and
the number of registers they allow.

GETTABLE A B C   R(A) := R(B)[RK(C)]
SETTABLE A B C   R(A)[RK(B)] := RK(C)

The opcode doesn't have room for additional registers.

Lua's parsing also assumes that there is a single register for the key.

Regards
Dibyendu

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Dibyendu Majumdar
On Sun, 9 Dec 2018 at 19:46, Dibyendu Majumdar <[hidden email]> wrote:

> On Sun, 9 Dec 2018 at 18:12, Dirk Laurie <[hidden email]> wrote:
> >
> > 1. Is there any syntactic reason why one can't patch Lua so that
> > t[1,2] won't raise a compilation error, but instead invoke
> > __index(t,1,2) or __newindex(t,1,2,v)?
>
> I think there are issues with the way the table op codes work - and
> the number of registers they allow.
>
> GETTABLE A B C   R(A) := R(B)[RK(C)]
> SETTABLE A B C   R(A)[RK(B)] := RK(C)
>
> The opcode doesn't have room for additional registers.
>
> Lua's parsing also assumes that there is a single register for the key.
>

Also see:

http://lua-users.org/lists/lua-l/2018-03/msg00088.html

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Philippe Verdy
In reply to this post by Dibyendu Majumdar
op codes are not an issue at all here; there's still a sequence of opcodes possible to compute RK(C) in a register mapping the expected key from the pair.
Even the code "t[1][2]" would also generate a sequence (GETTABLE;SETTABLE) or (GETTABLE,GETTABLE) and not a single opcode...

Le dim. 9 déc. 2018 à 20:47, Dibyendu Majumdar <[hidden email]> a écrit :
Hi,

On Sun, 9 Dec 2018 at 18:12, Dirk Laurie <[hidden email]> wrote:
>
> 1. Is there any syntactic reason why one can't patch Lua so that
> t[1,2] won't raise a compilation error, but instead invoke
> __index(t,1,2) or __newindex(t,1,2,v)?

I think there are issues with the way the table op codes work - and
the number of registers they allow.

GETTABLE A B C   R(A) := R(B)[RK(C)]
SETTABLE A B C   R(A)[RK(B)] := RK(C)

The opcode doesn't have room for additional registers.

Lua's parsing also assumes that there is a single register for the key.

Regards
Dibyendu

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Philippe Verdy
As well the code t[a+2] would definitely generate several opcodes to compute a+2 before using it as key in GETTABLE R(A), R(t), R(a+2), where R(t) is the register holder the table t, and R(a+2) is the register containing the computed value of key (a+2)...
The only limit is syntaxic and the semantic we want to apply to that syntax "t[1, 2]". The opcodes in the LuaVM is definitely not an issue.

Le dim. 9 déc. 2018 à 23:15, Philippe Verdy <[hidden email]> a écrit :
op codes are not an issue at all here; there's still a sequence of opcodes possible to compute RK(C) in a register mapping the expected key from the pair.
Even the code "t[1][2]" would also generate a sequence (GETTABLE;SETTABLE) or (GETTABLE,GETTABLE) and not a single opcode...

Le dim. 9 déc. 2018 à 20:47, Dibyendu Majumdar <[hidden email]> a écrit :
Hi,

On Sun, 9 Dec 2018 at 18:12, Dirk Laurie <[hidden email]> wrote:
>
> 1. Is there any syntactic reason why one can't patch Lua so that
> t[1,2] won't raise a compilation error, but instead invoke
> __index(t,1,2) or __newindex(t,1,2,v)?

I think there are issues with the way the table op codes work - and
the number of registers they allow.

GETTABLE A B C   R(A) := R(B)[RK(C)]
SETTABLE A B C   R(A)[RK(B)] := RK(C)

The opcode doesn't have room for additional registers.

Lua's parsing also assumes that there is a single register for the key.

Regards
Dibyendu

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Sam Pagenkopf
Override __call and use table{1, 2}?

On Sun, Dec 9, 2018 at 4:20 PM Philippe Verdy <[hidden email]> wrote:
As well the code t[a+2] would definitely generate several opcodes to compute a+2 before using it as key in GETTABLE R(A), R(t), R(a+2), where R(t) is the register holder the table t, and R(a+2) is the register containing the computed value of key (a+2)...
The only limit is syntaxic and the semantic we want to apply to that syntax "t[1, 2]". The opcodes in the LuaVM is definitely not an issue.

Le dim. 9 déc. 2018 à 23:15, Philippe Verdy <[hidden email]> a écrit :
op codes are not an issue at all here; there's still a sequence of opcodes possible to compute RK(C) in a register mapping the expected key from the pair.
Even the code "t[1][2]" would also generate a sequence (GETTABLE;SETTABLE) or (GETTABLE,GETTABLE) and not a single opcode...

Le dim. 9 déc. 2018 à 20:47, Dibyendu Majumdar <[hidden email]> a écrit :
Hi,

On Sun, 9 Dec 2018 at 18:12, Dirk Laurie <[hidden email]> wrote:
>
> 1. Is there any syntactic reason why one can't patch Lua so that
> t[1,2] won't raise a compilation error, but instead invoke
> __index(t,1,2) or __newindex(t,1,2,v)?

I think there are issues with the way the table op codes work - and
the number of registers they allow.

GETTABLE A B C   R(A) := R(B)[RK(C)]
SETTABLE A B C   R(A)[RK(B)] := RK(C)

The opcode doesn't have room for additional registers.

Lua's parsing also assumes that there is a single register for the key.

Regards
Dibyendu

Reply | Threaded
Open this post in threaded view
|

Re: Multi-indexed tables

Diego Nehab-4
In reply to this post by Dirk Laurie-2
Some time ago, I had a discussion with a friend about what it would take to implement this in Lua. One of our concerns was to maintain compatibility with previous versions.

The core of the idea was that the number of indices in a multi-index table lookup would be decided during compilation. So, for example, t[f()] would always adjust the returns of f() to a single index. In the same way, t[f(), g()] would always be a 2-index access. There is precedent for this behavior for commas in Lua in loops like "for i = a,b,c". These do not work if you replace "a, b, c" with f(), even if f() returns "a, b, c". Since t[f()] would remain a 1-index access, regardless of the number of values returned by f(), and since t[f(), g()] is a syntax error today, I *think* there would be no compatibility problems. Compatibility aside, I think this would lead to fewer bugs anyway.

Access would be via __index and __newindex like today. These metamethods would receive all indices, instead of just the first. Below is what the "default" implementations would look like.

__index = function(t, i, ...)
    assert(select('#', ...) == 0, "too many indices")
    return rawget(t, i)
end

__newindex = function(t, i, v, ...)
    assert(select('#', ...) == 0, "too many indices")
    rawset(t, i, v)
end

Here it is obvious that maintaining compatibility is a bit awkward: In an ideal world, __newindex would receive the value before all indices. But since it receives the value after the index today, this change would break all existing code out there.

The behavior for __index and __newindex is preserved when the requested 1-index entry already exists in the table. The difference is that a multi-index entry is always assumed to be missing (so __newindex and __index are always called for multi-index accesses).

At any rate, user-defined __index and __newindex can do whatever they want with '...'.  If t is a userdata, it becomes trivial to perform multi-index accesses without taxing the garbage collector. If t is a Lua table, it becomes possible to linearize the multi-index to a single index and implement a matrix with a single table, without even using a proxy table.

Anyway, just my 2c.

Regards,
Diego