Lua 5.1 has a serious issue

Daniel Silverstone
Hi,

In my opinion (and the opinion of various denizens of lua-l and #lua)
there is a serious flaw in Lua 5.1's new # operator.

Specifically it is that it does not invoke the __len metamethod on
tables, only on userdata.

This results in an inability to do interesting things with the #
operator on objects implemented in pure lua. Consider a special
sparse-array as a critical example -- the default implementation of # on
a table (luaH_getn) clearly will never work reliably on a sparse array.
But if the __len metamethod were checked for on tables also, then it
would clearly allow for the implementor to sort it out.

I know this has external effects on the language and that this late in
the game this is frowned upon, but honestly I thought it was a mistake
in the beta and would have been corrected before the release candidate.

It's fairly easy to replace the chunk in lvm.c that handles OP_LEN for
tables with something like:

          case LUA_TTABLE: {
            const TValue *tm = luaT_gettmbyobj(L, rb, TM_LEN);
            if (ttisnil(tm)) {
              setnvalue(ra, cast(lua_Number, luaH_getn(hvalue(rb))));
            } else {
              Protect(callTMres(L, ra, tm, rb, &luaO_nilobject));
            }
            break;
          }

And such would save a lot of hassle and prevent us from having another
second-class metamethod to go alongside __gc.
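
To make the proposed dispatch order concrete outside of the VM, here is a minimal, self-contained sketch in plain C. It is a toy model under stated assumptions, not Lua's internals: `ToyTable`, `LenHandler`, `toy_default_len`, `toy_len` and `sparse_len` are hypothetical names, and the `len_hook` pointer merely stands in for a `__len` metamethod.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical and do not mirror Lua's
   real Table or TValue structures. */
typedef struct ToyTable ToyTable;
typedef size_t (*LenHandler)(const ToyTable *t);

struct ToyTable {
  const int *present;   /* present[i] != 0 means slot i+1 holds a value */
  size_t capacity;      /* number of slots in `present` */
  LenHandler len_hook;  /* stands in for a __len metamethod; NULL if unset */
};

/* Default length: scan for a "border", stopping at the first empty slot.
   (luaH_getn searches differently and may report another border for a
   sparse array; a linear scan keeps the model simple.) */
size_t toy_default_len(const ToyTable *t) {
  size_t n = 0;
  assert(t != NULL);
  while (n < t->capacity && t->present[n]) n++;
  return n;
}

/* Proposed order: consult the hook first, fall back to the default. */
size_t toy_len(const ToyTable *t) {
  assert(t != NULL);
  if (t->len_hook != NULL) return t->len_hook(t);
  return toy_default_len(t);
}

/* Example hook for a sparse array: report the highest occupied index. */
size_t sparse_len(const ToyTable *t) {
  size_t i = t->capacity;
  while (i > 0 && !t->present[i - 1]) i--;
  return i;
}
```

With slots 1, 2 and 5 occupied out of five, `toy_len` stops at the first hole when no hook is set, while the same data with `sparse_len` installed reports the highest occupied index instead.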

I appreciate that the docs may need tweaking as a result of this, but I
seriously recommend it be done. Otherwise I can see many distributions
applying such a patch as a "bugfix" which will make varying and
effectively incompatible versions of Lua occupy our distributions.

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: Lua 5.1 has a serious issue

Rici Lake-2

On 12-Jan-06, at 7:01 AM, Daniel Silverstone wrote:

> In my opinion (and the opinion of various denizens of lua-l and #lua)
> there is a serious flaw in Lua 5.1's new # operator.
>
> Specifically it is that it does not invoke the __len metamethod on
> tables, only on userdata.

I agree with this 100%


Re: Lua 5.1 has a serious issue

John Belmonte
In reply to this post by Daniel Silverstone
Daniel Silverstone wrote:
> I appreciate that the docs may need tweaking as a result of this, but I
> seriously recommend it be done. Otherwise I can see many distributions
> applying such a patch as a "bugfix" which will make varying and
> effectively incompatible versions of Lua occupy our distributions.

Any packager that would apply a modification like that is being
irresponsible.  I don't think your last point should be considered when
deciding this issue.

Regards,
--John

Re: Lua 5.1 has a serious issue

Daniel Silverstone
On Thu, 2006-01-12 at 07:24 -0500, John Belmonte wrote:
> > I appreciate that the docs may need tweaking as a result of this, but I
> > seriously recommend it be done. Otherwise I can see many distributions
> > applying such a patch as a "bugfix" which will make varying and
> > effectively incompatible versions of Lua occupy our distributions.
> Any packager that would apply a modification like that is being
> irresponsible.  I don't think your last point should be considered when
> deciding this issue.

That doesn't stop me from seeing Gentoo and others doing it.

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: Lua 5.1 has a serious issue

Roberto Ierusalimschy
> Otherwise I can see many distributions applying such a patch as a
> "bugfix" which will make varying and effectively incompatible versions
> of Lua occupy our distributions.

This is as much a bug as any other design decision. If packagers feel
they can redesign the packages they distribute, there is nothing we can
do to stop them.

-- Roberto

Re: Lua 5.1 has a serious issue

Asko Kauppi
In reply to this post by Daniel Silverstone

Could you make that apply to strings as well?

As a quick example, it would allow UTF-8 strings to return their
real length (in characters) rather than the number of bytes.

-asko


Daniel Silverstone wrote on 12.1.2006 at 14.01:

> Hi,
>
> In my opinion (and the opinion of various denizens of lua-l and #lua)
> there is a serious flaw in Lua 5.1's new # operator.
>
> Specifically it is that it does not invoke the __len metamethod on
> tables, only on userdata.
>
> This results in an inability to do interesting things with the #
> operator on objects implemented in pure lua. Consider a special
> sparse-array as a critical example -- the default implementation of # on
> a table (luaH_getn) clearly will never work reliably on a sparse array.
> But if the __len metamethod were checked for on tables also, then it
> would clearly allow for the implementor to sort it out.
>
> I know this has external effects on the language and that this late in
> the game this is frowned upon, but honestly I thought it was a mistake
> in the beta and would have been corrected before the release candidate.
>
> It's fairly easy to replace the chunk in lvm.c for handling OP_LEN for
> tables to be something like:
>
>           case LUA_TTABLE: {
>             const TValue *tm = luaT_gettmbyobj(L, rb, TM_LEN);
>             if (ttisnil(tm)) {
>               setnvalue(ra, cast(lua_Number, luaH_getn(hvalue(rb))));
>             } else {
>               Protect(callTMres(L, ra, tm, rb, &luaO_nilobject));
>             }
>             break;
>           }
>
> And such would save a lot of hassle and prevent us from having another
> second-class metamethod to go alongside __gc.
>
> I appreciate that the docs may need tweaking as a result of this, but I
> seriously recommend it be done. Otherwise I can see many distributions
> applying such a patch as a "bugfix" which will make varying and
> effectively incompatible versions of Lua occupy our distributions.
>
> D.
>
> --
> Daniel Silverstone                         http://www.digital-scurf.org/
> PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895
>
>


Re: Lua 5.1 has a serious issue

Daniel Silverstone
On Thu, 2006-01-12 at 18:42 +0200, Asko Kauppi wrote:
> Make that apply to strings as well?
> As a quick example, would allow UTF-8 strings to be returning the  
> real length (in characters); not the number of bytes.

That would (unless I'm wrong) require strings to carry around metatables
themselves which is a very big change.

Or did you simply mean check against the string metatable?

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: Lua 5.1 has a serious issue

Daniel Silverstone
In reply to this post by Roberto Ierusalimschy
On Thu, 2006-01-12 at 14:34 -0200, Roberto Ierusalimschy wrote:
> This is as much a bug as any other design decision. If packagers feel
> they can redesign the packages they distribute, there is nothing we can
> do to stop them.

Fair enough; but disregarding that comment entirely, there is still the
question of the orthogonality and ability to implement __len on pure-lua
objects.

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: Lua 5.1 has a serious issue

Asko Kauppi
In reply to this post by Daniel Silverstone

I meant the latter: check against the per-type metatable. I have
already reported this to the authors, but have had no feedback on this
particular issue. Basically, we'd need separate "byte size" and
"length" operators, which by default would be the same.

-asko

Daniel Silverstone wrote on 12.1.2006 at 18.47:

> On Thu, 2006-01-12 at 18:42 +0200, Asko Kauppi wrote:
>> Make that apply to strings as well?
>> As a quick example, would allow UTF-8 strings to be returning the
>> real length (in characters); not the number of bytes.
>
> That would (unless I'm wrong) require strings to carry around metatables
> themselves which is a very big change.
>
> Or did you simply mean check against the string metatable?
>
> D.
>
> --
> Daniel Silverstone                         http://www.digital-scurf.org/
> PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895
>
>


Re: Lua 5.1 has a serious issue

Roberto Ierusalimschy
In reply to this post by Daniel Silverstone
> there is still the question of the orthogonality and ability to
> implement __len on pure-lua objects.

As a general rule, Lua does not allow the redefinition of operators for
cases already defined in the language (e.g., add for numbers, concat for
strings, call for functions).

(On the other hand, we do understand that tables may have a different
status, as they are used to represent "all other types" in Lua.)

-- Roberto

Re: Lua 5.1 has a serious issue

Rici Lake-2

On 12-Jan-06, at 12:06 PM, Roberto Ierusalimschy wrote:

> As a general rule, Lua does not allow the redefinition of operators for
> cases already defined in the language (e.g., add for numbers, concat for
> strings, call for functions).

On the other hand, table get and set is redefinable (partially).

>
> (On the other hand, we do understand that tables may have a different
> status, as they are used to represent "all other types" in Lua.)

I think that's the essential point. #table is only useful for a subset
of tables (i.e. non-sparse arrays); there are many cases where you
might want to redefine it; for example, a pure proxy table (commonly
used to allow complete redefinition of get and set).


Re: Lua 5.1 has a serious issue

Daniel Silverstone
In reply to this post by Roberto Ierusalimschy
On Thu, 2006-01-12 at 15:06 -0200, Roberto Ierusalimschy wrote:
> > there is still the question of the orthogonality and ability to
> > implement __len on pure-lua objects.
> As a general rule, Lua does not allow the redefinition of operators for
> cases already defined in the language (e.g., add for numbers, concat for
> strings, call for functions).

Right. This is quite fair.

> (On the other hand, we do understand that tables may have a different
> status, as they are used to represent "all other types" in Lua.)

Does this mean you'll consider adding __len lookups for tables before
release?

(Please say 'yes' :-)

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: Lua 5.1 has a serious issue

Mike Pall-41
In reply to this post by Daniel Silverstone
Hi,

Daniel Silverstone wrote:
> [...] Lua 5.1's new # operator.
> Specifically it is that it does not invoke the __len metamethod on
> tables, only on userdata.

Ah. News. Well, as of May 19th 2005:
  http://lua-users.org/lists/lua-l/2005-05/msg00281.html

> It's fairly easy to replace the chunk in lvm.c for handling OP_LEN for
> tables to be something like:

It's also fairly slow. With that change, #t on a plain empty table,
measured in a tight loop minus the loop overhead, is 2x slower (!).
Not good.

A proper fix is to make __len a fast tag method. Then the
overhead is down to 2% (much less when the table has elements).
Much better.

I've had a 'nice to have, but probably too expensive' opinion
before. But in the light of this benchmark, I'd say go for it.
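
For readers unfamiliar with the term: a "fast" tag method is one whose absence is cached in a bit flag on the metatable, so the common case of "no handler" costs one bit test instead of a lookup by name. The following is a rough, self-contained C sketch of that idea only. It is a toy model, not the code in ltm.c: `ToyMeta`, `ToyEntry`, `toy_lookup_slow`, `toy_lookup_fast` and `toy_invalidate` are hypothetical names, and the `slow_lookups` counter exists purely to make the effect observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of absence caching; all names here are hypothetical. */
enum { EV_INDEX, EV_LEN, EV_COUNT };

typedef struct {
  const char *key;
  int handler_id;              /* nonzero id stands in for a handler */
} ToyEntry;

typedef struct {
  const ToyEntry *entries;
  size_t count;
  unsigned char absent_flags;  /* bit e set: event e is known to be absent */
  unsigned long slow_lookups;  /* instrumentation for this example only */
} ToyMeta;

static const char *const event_names[EV_COUNT] = {"__index", "__len"};

/* Slow path: search by name and remember a miss in the flag bits. */
int toy_lookup_slow(ToyMeta *mt, int event) {
  size_t i;
  mt->slow_lookups++;
  for (i = 0; i < mt->count; i++)
    if (strcmp(mt->entries[i].key, event_names[event]) == 0)
      return mt->entries[i].handler_id;
  mt->absent_flags |= (unsigned char)(1u << event);
  return 0;
}

/* Fast path: a single bit test answers "no handler" without searching. */
int toy_lookup_fast(ToyMeta *mt, int event) {
  assert(event >= 0 && event < EV_COUNT);
  if (mt == NULL) return 0;                        /* no metatable at all */
  if (mt->absent_flags & (1u << event)) return 0;  /* cached miss */
  return toy_lookup_slow(mt, event);
}

/* Any write to the metatable must clear the cached misses. */
void toy_invalidate(ToyMeta *mt) {
  assert(mt != NULL);
  mt->absent_flags = 0;
}
```

In this model, a metatable that defines only `__index` pays for one name search the first time `EV_LEN` is requested; every later request is answered by the flag test until `toy_invalidate` is called.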

> And such would save a lot of hassle and prevent us from having another
> second-class metamethod to go alongside __gc.

This has a very different rationale and is an inappropriate
comparison. Please google for (e.g.) Java finalizers or
reachability states to understand the problem. I severely doubt
there is an elegant generic solution.

> Otherwise I can see
> many distributions applying such a patch as a "bugfix" which
> will make varying and effectively incompatible versions of Lua
> occupy our distributions.

Yes, they can do that. Just not call it Lua anymore.

PS: The choice of the subject line and the general tone of your
    message are not helpful for your agenda. Developers are
    usually most impressed by proper research, good code and
    convincing benchmarks.

Bye,
     Mike

Re: Lua 5.1 has a serious issue

Mike Pall-41
In reply to this post by Asko Kauppi
Hi,

Asko Kauppi wrote:
> Make that apply to strings as well?
>
> As a quick example, would allow UTF-8 strings to be returning the  
> real length (in characters); not the number of bytes.

Please do not reopen this discussion again:

  http://lua-users.org/lists/lua-l/2005-08/msg00096.html

UTF-8 gives rise to multiple measures of 'length'. So you need
multiple operators or functions anyway. The most appropriate
choice for #s is the definition which is common to all strings.
And this is the length in bytes. No need to override this.

BTW: The length in characters is mostly meaningless for UTF-8.
     You probably want glyphs or screen real-estate.
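
To make the different measures concrete, here is a small C sketch comparing two of them. It assumes well-formed UTF-8 input without embedded zero bytes; `byte_length` and `codepoint_length` are hypothetical helper names, not part of any Lua API. Counting glyphs or screen width would need Unicode data tables and is not attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length in bytes: the measure that is common to all strings. */
size_t byte_length(const char *s) {
  assert(s != NULL);
  return strlen(s);
}

/* Length in code points, assuming well-formed UTF-8: count every byte
   that is not a continuation byte (continuation bytes are 10xxxxxx). */
size_t codepoint_length(const char *s) {
  size_t n = 0;
  assert(s != NULL);
  for (; *s != '\0'; s++)
    if (((unsigned char)*s & 0xC0) != 0x80) n++;
  return n;
}
```

For instance, "e" followed by the combining acute accent U+0301 (bytes 65 CC 81) is three bytes and two code points, yet it is drawn as a single glyph, so neither count says how much room the text takes on screen.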

Bye,
     Mike

Re: Lua 5.1 has a serious issue

Mark Hamburg-4
In reply to this post by Asko Kauppi
I'm not taking sides on the __len semantics issue, but most of what would be
needed to implement pure Lua objects would be to provide a function taking a
metatable and a regular table and returning a userdata with that metatable
and the regular table as its environment. It makes the code implementing the
object a little more complicated in that it needs to go retrieve the
environment table, but it means that you get all of the behaviors associated
with userdata values. Furthermore, one could use it to implement opaque
objects by making the environment hidden to standard Lua calls and instead
using a private weak table to retrieve it.

Mark


Re: Lua 5.1 has a serious issue

Wim Couwenberg-2
In reply to this post by Mike Pall-41
> PS: The choice of the subject line and the general tone of your
>     message is not helpful for your agenda.

I find the Lua list refreshingly interesting, not least because
of people like Daniel and many others.  Currently I can think of only
a couple of people on the list to whom your remark might actually
apply from my own perspective.

--
Wim



Re: Lua 5.1 has a serious issue

David Burgess
In reply to this post by Mike Pall-41
Oh, and Mike brings back to my memory the desirable
t[] =
append syntax.

DB

On 1/13/06, Mike Pall <[hidden email]> wrote:

> Hi,
>
> Daniel Silverstone wrote:
> > [...] Lua 5.1's new # operator.
> > Specifically it is that it does not invoke the __len metamethod on
> > tables, only on userdata.
>
> Ah. News. Well, as of May 19th 2005:
>   http://lua-users.org/lists/lua-l/2005-05/msg00281.html
>
> > It's fairly easy to replace the chunk in lvm.c for handling OP_LEN for
> > tables to be something like:
>
> It's also fairly slow. Measuring #t on a plain empty table in a
> tight loop minus the loop overhead is 2x slower (!). Not good.
>
> A proper fix is to make __len a fast tag method. Then the
> overhead is down to 2% (much less when the table has elements).
> Much better.
>
> I've had a 'nice to have, but probably too expensive' opinion
> before. But in the light of this benchmark, I'd say go for it.
>
> > And such would save a lot of hassle and prevent us from having another
> > second-class metamethod to go alongside __gc.
>
> This has a very different rationale and is an inappropriate
> comparison. Please google for (e.g.) Java finalizers or
> reachability states to understand the problem. I severely doubt
> there is an elegant generic solution.
>
> > Otherwise I can see
> > many distributions applying such a patch as a "bugfix" which
> > will make varying and effectively incompatible versions of Lua
> > occupy our distributions.
>
> Yes, they can do that. Just not call it Lua anymore.
>
> PS: The choice of the subject line and the general tone of your
>     message is not helpful for your agenda. Developers are
>     usually most impressed by proper research, good code and
>     convincing benchmarks.
>
> Bye,
>      Mike
>