rethinking method calls with __mcall metamethod rather than __index/__call


rethinking method calls with __mcall metamethod rather than __index/__call

David Manura
A few Lua operations are implemented in terms of more primitive
operations.  ">" and ">=" are implemented in terms of "<" and "<="
respectively [8].  "<=" may be implemented in terms of "<" if a __le
metamethod is not given.  Moreover, method calls, a:b(c), are
implemented in terms of indexing and calling: local a = a; a["b"](a,
c).
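
As a minimal sketch of that desugaring (the `trace` and `methods` tables are illustrative names only, not part of any proposal), a logging __index shows that the index step runs before the call for both spellings:

```lua
-- Sketch: a:b(c) behaves like indexing a with "b" and then calling the
-- result with a as the first argument. All names here are illustrative.
local trace = {}
local methods = {}
function methods.b(self, c)
  trace[#trace + 1] = "call b"
  return c * 2
end

local a = setmetatable({}, {
  __index = function(_, k)
    trace[#trace + 1] = "index " .. k   -- record every index operation
    return methods[k]
  end,
})

local r1 = a:b(21)        -- method-call syntax
local r2 = a["b"](a, 21)  -- the same operation spelled out
assert(r1 == 42 and r2 == 42)
assert(table.concat(trace, ", ") == "index b, call b, index b, call b")
```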

There is a cost to that design.

Consider a "set" object with set operations:

  -- Simple Set ADT
  local set_mt = {}
  set_mt.__index = set_mt
  function set_mt:union(set2)
    for k in pairs(set2) do self[k] = true end
  end
  function set(t)
    local self = setmetatable({}, set_mt)
    for _,v in ipairs(t) do self[v] = true end
    return self
  end

  -- Example
  local s = set{'a', 'b', 'c'}
  s:union(set{'c', 'd'})
  assert(s['a'] and s['d'])
  assert(not s['union']) --> fails
  assert(not s['__index']) --> fails

As seen, if a set "s" has a method "union", this implies that
s['union'] is non-nil (and so tests true).  For this reason, Penlight [1] avoids using the
index operator for membership tests in its set and map ADTs.  Also, if
we use the common technique of storing methods in the metatable
("set_mt.__index = set_mt" above), then s['__index'] is not nil
either, a subtle potential bug or security hole.
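
One workaround available in current Lua, shown here only as a sketch and not necessarily what Penlight does, is to test membership with rawget, which skips __index.  It still breaks if an element is itself named "union", since that raw key would shadow the method:

```lua
-- Sketch: rawget bypasses __index, so it only sees real set members.
local set_mt = {}
set_mt.__index = set_mt
function set_mt:union(set2)
  for k in pairs(set2) do self[k] = true end
end
local function set(t)
  local self = setmetatable({}, set_mt)
  for _, v in ipairs(t) do self[v] = true end
  return self
end

local s = set{'a', 'b', 'c'}
assert(s['union'] ~= nil)            -- the method leaks through indexing
assert(rawget(s, 'union') == nil)    -- raw lookup ignores the metatable
assert(rawget(s, '__index') == nil)
assert(rawget(s, 'a') == true)       -- real members are still found
```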

That also has implications in the __pairs/__ipairs discussions [7].  Given

  local s = set{'a', 'b', 'c'}
  for k,v in pairs(s) do print(k, v) end

would we want to design it to print a, b, and c?  or would we want it
to print union and __index?  After all, all these values (k) satisfy
the condition that s[k] ~= nil, so it would seem consistent that pairs
should print all of them.  Most likely, we only want to print a, b,
and c.  However, in some cases we may want to iterate over method
names (reflection).  The important point is that perhaps it would be
more consistent for s[k] == nil for method names k and we could
provide some other way to iterate method names.
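
A rough sketch of keeping the two iterations apart in current Lua follows.  The helpers `members` and `method_names` are hypothetical names, and the sketch assumes methods live in a table-valued __index:

```lua
-- Sketch: iterate set members and method names separately.
local set_mt = {}
set_mt.__index = set_mt
function set_mt:union(set2)
  for k in pairs(set2) do self[k] = true end
end
local s = setmetatable({ a = true, b = true, c = true }, set_mt)

-- Raw keys only: the members actually stored in the object.
local function members(obj)
  local out = {}
  for k in next, obj do out[#out + 1] = k end
  table.sort(out)
  return out
end

-- Reflection: function-valued entries of a table-valued __index.
local function method_names(obj)
  local out = {}
  local mt = getmetatable(obj)
  local idx = mt and mt.__index
  if type(idx) == 'table' then
    for k, v in next, idx do
      if type(v) == 'function' then out[#out + 1] = k end
    end
  end
  table.sort(out)
  return out
end

assert(table.concat(members(s), ",") == "a,b,c")
assert(table.concat(method_names(s), ",") == "union")
```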

Going further, consider a proxy object that forwards method calls:

  -- Proxy
  local mt = {}
  function mt.__index(_, k)
    return function(self, ...)
      local priv = self[1]
      return priv[k](priv, ...)
    end
  end
  function proxy(o)
    return setmetatable({o}, mt)
  end

  -- Example
  local s = proxy("test")
  assert(s:sub(2,3) == "es")
  assert(s.sub(s,2,3) == "es")

Splitting the method call into index and call operations results in
the inefficiency of a temporary closure being created for each method
call (granted, these closures might be cached).  Yet the above
implementation is still too simplistic:

  local s = proxy(math)
  assert(s.pi == math.pi) --> fails
  assert(s.sqrt(4) == 2) --> fails

If an object has both methods and fields, then the __index will need
to test or guess whether priv[k] is a method that should be wrapped or
a value that should be returned as is.

Such complications occurred in MethodChainingWrapper [2].
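
To make the guessing concrete, here is one possible heuristic, given as a sketch only and not taken from MethodChainingWrapper: non-function values are returned as is, and function values are wrapped so that a first argument equal to the proxy is swapped for the private object.

```lua
-- Sketch of a "guessing" proxy. The heuristic is an assumption made
-- for illustration: treat the call as a method call only when the
-- first argument is the proxy itself.
local function proxy(o)
  local p
  local mt = {}
  function mt.__index(_, k)
    local v = o[k]
    if type(v) ~= 'function' then return v end   -- plain field
    return function(...)                         -- closure per lookup
      local first = ...
      if first == p then
        return v(o, select(2, ...))              -- s:name(...) form
      end
      return v(...)                              -- s.name(...) form
    end
  end
  p = setmetatable({}, mt)
  return p
end

local s = proxy("test")
assert(s:sub(2, 3) == "es")
local m = proxy(math)
assert(m.pi == math.pi)
assert(m.sqrt(4) == 2)
```

The guess can still be wrong (for example, when a plain function legitimately receives the proxy as its first argument), and a closure is still created on every lookup.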

Consider also the potential for error in colon vs. dot syntaxes for
method/function calls [5-6]:

  o:f() --> correct
  o.f() --> bad.  The function is called, but in the wrong way,
        --  so f will likely fail in some way.

If these were separate operations, then we could allow o.f to be nil,
and the above will fail earlier with cleaner errors (o.f is nil).
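
A small sketch of the two failure modes in current Lua (the objects `o` and `p` are illustrative; exact error messages vary between Lua versions, so none are asserted here):

```lua
-- Sketch: where the error surfaces for a dot call on a method.
local o = { name = "widget" }
function o:f() return "I am " .. self.name end

assert(o:f() == "I am widget")

-- Today: o.f is found, the call proceeds with self == nil, and the
-- failure happens somewhere inside f.
local ok_dot, err_dot = pcall(function() return o.f() end)
assert(not ok_dot)

-- If methods were not reachable by indexing, the field would be nil
-- and the failure would occur at the call site itself.
local p = {}
local ok_nil, err_nil = pcall(function() return p.f() end)
assert(not ok_nil)
```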

I agree with the thinking that o:f() and o.f are two separate
concepts.  One is message passing.  The other is indexing.  But Lua
forces the former to be defined in terms of the latter.

An alternative, proposed for consideration, is to provide a new
__mcall metamethod for method calls (a.k.a. message passing).  If
__mcall is not provided, Lua would revert to the old behavior of
consulting __index and __call instead.  The proxy example above would
be reimplemented more cleanly as follows:

  -- Proxy
  local mt = {}
  function mt:__mcall(k, ...)
    local priv = self[1]
    return priv:[k](...)   --[A]
  end
  function mt:__index(k)
    local priv = self[1]
    return priv[k]
  end
  function proxy(o)
    return setmetatable({o}, mt)
  end

  -- Example
  local s = proxy("test")
  assert(s:sub(2,3) == "es")
  local f = s:sub; assert(f(2,3) == "es")   -- [B]

Note that the above makes use of two proposed extensions: (A) using a
variable method name in the method call as proposed in [3] and (B)
using colon closure construction notation as proposed in [4].
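
Since neither __mcall nor the two syntax extensions exist in stock Lua, the proxy can only be approximated today.  In the sketch below, the helper `send(obj, name, ...)` is a hypothetical stand-in for `obj:[name](...)` that consults __mcall first and otherwise falls back to the old index-then-call behavior:

```lua
-- Sketch: emulating the proposed __mcall dispatch with a helper.
-- `send` is a hypothetical name; it is not part of Lua.
local function send(obj, name, ...)
  local mt = getmetatable(obj)
  local mcall = type(mt) == 'table' and rawget(mt, "__mcall")
  if mcall then return mcall(obj, name, ...) end
  return obj[name](obj, ...)          -- old behavior: index, then call
end

-- Proxy
local mt = {}
function mt:__mcall(k, ...)
  local priv = rawget(self, 1)
  return send(priv, k, ...)           -- forward the message, no closure
end
function mt:__index(k)
  local priv = rawget(self, 1)
  return priv[k]                      -- forward plain indexing
end
local function proxy(o)
  return setmetatable({o}, mt)
end

-- Example
local s = proxy("test")
assert(send(s, "sub", 2, 3) == "es")
local m = proxy(math)
assert(m.pi == math.pi)
assert(m.sqrt(4) == 2)
```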

[1] http://penlight.luaforge.net/index.html#T10
[2] http://lua-users.org/wiki/MethodChainingWrapper
[3] http://lua-users.org/lists/lua-l/2009-05/msg00001.html
[4] http://lua-users.org/lists/lua-l/2009-01/msg00606.html
[5] http://lua-users.org/wiki/ColonForMethodCall
[6] http://lua-users.org/lists/lua-l/2003-08/msg00248.html
[7] http://lua-users.org/wiki/GeneralizedPairsAndIpairs
[8] http://www.lua.org/manual/5.1/manual.html#2.5.2

Re: rethinking method calls with __mcall metamethod rather than __index/__call

Mark Hamburg
I strongly endorse this general concept and had been planning on  
writing something similar up. The set example is a really great  
example. The key point I'd thought about had to do with allowing one  
to construct object systems in which one could readily detect failure  
to use a colon when sending messages. It also allows for mixing  
efficient method inheritance with property accessors because __mcall  
could lead to a table of methods while __index could lead to a  
function that did the property accessor logic. Finally, it makes it  
easier to write methods that can know to what object they are being  
applied because they can't be accessed as "loose" functions. That's a  
help when writing efficient bridges to native code.

The semantics of __mcall would be something like the following (which  
essentially corresponds to the self opcode):

        function mcallprep( t, k )

                local mt = getmetatable( t )
                local mcall = mt and mt.__mcall

                if mcall then
                        if type( mcall ) == 'table' then
                                return mcall[ k ], t
                        else
                                local f, o = mcall( t, k )
                                return f, o
                        end
                end

                return t[ k ], t -- Uses standard lookup

        end

There should be discussion about whether this should first look at  
rawget( t, k ) before turning to the metamethods. I chose not to here,  
but that was essentially an arbitrary choice. The calling logic for  
the case where mcall isn't a table also potentially needs discussion  
but is designed to match with the expected behavior of the opcode it  
virtualizes. This also probably needs thought as to how one builds  
inheritance hierarchies.
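
For comparison, a sketch of the other choice, in which rawget( t, k ) is consulted first, might look like this.  The name `mcallprep_raw_first` and the example object are illustrative only:

```lua
-- Sketch: mcallprep variant that checks the object's own raw fields
-- before turning to __mcall.
local function mcallprep_raw_first(t, k)
  if type(t) == 'table' then
    local own = rawget(t, k)
    if own ~= nil then return own, t end   -- per-instance override wins
  end
  local mt = getmetatable(t)
  local mcall = type(mt) == 'table' and mt.__mcall
  if mcall then
    if type(mcall) == 'table' then
      return mcall[k], t
    end
    return mcall(t, k)                     -- expected to return f, o
  end
  return t[k], t                           -- standard lookup
end

local methods = {}
function methods:describe() return "class version" end
local obj = setmetatable({}, { __mcall = methods })

local f, o = mcallprep_raw_first(obj, "describe")
assert(f(o) == "class version")

obj.describe = function(self) return "instance version" end
f, o = mcallprep_raw_first(obj, "describe")
assert(f(o) == "instance version")
```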

Finally, one could argue that adding this feature essentially  
necessitates method currying and the obj:[ method ] extensions because  
otherwise one would lose the ability to call arbitrary methods based  
on a parameter or to cache the results of a method lookup.
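
As a sketch of what a method curry provides, written with today's index-based lookup (the function name `curry` is hypothetical and only stands in for the proposed obj:method notation):

```lua
-- Sketch: a method curry binds the receiver so that the result can be
-- cached and called later, or selected by a runtime method name.
local function curry(obj, name)
  local f = obj[name]
  if f == nil then return nil end      -- doubles as a "has method" test
  return function(...) return f(obj, ...) end
end

local sub = curry("test", "sub")       -- cached lookup
assert(sub(2, 3) == "es")

local method = "upper"                 -- method chosen by a parameter
assert(curry("test", method)() == "TEST")

assert(curry("test", "no_such_method") == nil)
```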

Mark

P.S. I sometimes class all of this stuff together under the label  
"embracing the colon operator" -- i.e., stop wishing that period did a  
message send and embrace the colon operator.


Re: rethinking method calls with __mcall metamethod rather than __index/__call

David Manura
On Sat, Jun 13, 2009 at 5:04 PM, Mark Hamburg wrote:
> It also allows for mixing efficient method inheritance
> with property accessors because __mcall could lead to a table of methods
> while __index could lead to a function that did the property accessor logic.

Which according to [1], I gather you mean

  mt.__mcall = methods  -- remains fast
  function mt.__index(t, k)  -- slower
      local p = properties[k]
      if p then return p(t, k) end
      error("Unknown property: " .. tostring(k))
  end
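
A fuller sketch of that split, runnable in current Lua, is below.  The `send` helper is a hypothetical stand-in for the proposed method-call dispatch, and the `scale` method and `area` property are made up for illustration:

```lua
-- Sketch: methods in a table reached via __mcall, properties computed
-- through a function-valued __index.
local methods = {}
function methods:scale(n)
  self.w, self.h = self.w * n, self.h * n
end

local properties = {}
function properties.area(t) return t.w * t.h end

local mt = { __mcall = methods }       -- remains a plain table lookup
function mt.__index(t, k)              -- slower path, properties only
  local p = properties[k]
  if p then return p(t, k) end
  error("Unknown property: " .. tostring(k))
end

-- Hypothetical stand-in for obj:name(...) under the proposal.
local function send(obj, name, ...)
  local m = getmetatable(obj).__mcall[name]
  return m(obj, ...)
end

local r = setmetatable({ w = 2, h = 3 }, mt)
assert(r.area == 6)
send(r, "scale", 2)
assert(r.area == 24)
-- Methods are no longer visible as fields:
assert(not pcall(function() return r.scale end))
```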

> The semantics of __mcall would be something like the following (which
> essentially corresponds to the self opcode):
>
>        function mcallprep( t, k )
>                local mt = getmetatable( t )
>                local mcall = mt and mt.__mcall
>                if mcall then
>                        if type( mcall ) == 'table' then
>                                return mcall[ k ], t
>                        else
>                                local f, o = mcall( t, k )
>                                return f, o
>                        end
>                end
>                return t[ k ], t -- Uses standard lookup
>        end

The "set" example above, in which __mcall will be a table, can be well
represented that way:

  -- Simple Set ADT
  local mt = {}
  local methods = {}
  function methods:union(set2)
    for k in pairs(set2) do self[k] = true end
  end
  mt.__mcall = methods
  function set(t)
    local self = setmetatable({}, mt)
    for _,v in ipairs(t) do self[v] = true end
    return self
  end
  -- note: optionally share mt and methods in same table,
  -- which as a side-effect exposes __mcall as a method.

However, the proxy example, in which __mcall will be a function,
involves a temporary closure again, and we prefer to be able to omit
the "o" value above too:

  -- Proxy
  local mt = {}
  function mt:__mcall(k)
    local priv = self[1]
    return priv:[k]  -- note: omit ",o"
  end
  function mt:__index(k)
    local priv = self[1]
    return priv[k]
  end
  function proxy(o)
    return setmetatable({o}, mt)
  end

The awkwardness lies in that "local f, o = mcall( t, k )" deconstructs
the method in terms of a function and its first argument, but the
proxy doesn't necessarily have access to the original f, which as you
noted is no longer loose.

> Finally, one could argue that adding this feature essentially necessitates
> method currying...

i.e. [2]

Also, in [3] you wrote:

> Finally, it would be good to have a fast way to test for method
> support. These changes would essentially force the use of obj:msg
> for any object using the new __self metamethod, but since that
> actually constructs a closure, it's overkill if all we want is a boolean.

True, but in similar fashion neither do we have an efficient way to
test for operator support (e.g. call operator) [4].  Indexing was an
incomplete solution for that anyway.  We might write obj:__call or
obj:["()"] to obtain a closure (or nil) for the call operation.

[1] http://lua-users.org/lists/lua-l/2006-04/msg00527.html
[2] http://lua-users.org/wiki/MethodCurry
[3] http://lua-users.org/lists/lua-l/2009-01/msg00612.html
[4] http://lua-users.org/lists/lua-l/2009-05/msg00479.html

Re: rethinking method calls with __mcall metamethod rather than __index/__call

Steven Johnson
In reply to this post by Mark Hamburg
> Finally, it makes it easier to write methods that can know to what object
> they are being applied because they can't be accessed as "loose" functions.
> That's a help when writing efficient bridges to native code.

With regard to "loose" functions, there's another point that's relevant to your
recent topics about key privacy, as seen from your example in [4] above:

   local cacheMethod = obj1.msg
   -- intervening code where you forget what's going on
   cacheMethod( obj2 )

One may "forget" what's going on and add an obj2 with its own __index and
/ or __newindex metamethods. Especially with little accessors that don't do
enough to fail, suddenly our keys aren't so hidden.

I use a key scheme like yours, so this has been on my mind. Maybe such a
metamethod would help shore this up.

(Per-member weak tables and stuffing each method with a type assertion are
the present solutions of which I'm aware. Are there others? I suppose if I ever
get around to using Metalua to auto-generate the keys I could have it add the
assertions too...)
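
For reference, a simplified sketch of those two present-day techniques combined.  It uses a single weak-keyed table for all private state rather than one table per member, and the `Account` class is made up for illustration:

```lua
-- Sketch: private state in a weak-keyed table plus a type assertion,
-- so a cached "loose" method cannot be applied to a foreign object.
local private = setmetatable({}, { __mode = "k" })   -- weak keys

local Account = {}
Account.__index = Account

local function new_account(balance)
  local self = setmetatable({}, Account)
  private[self] = { balance = balance }
  return self
end

function Account:balance()
  local state = private[self]
  assert(state, "balance: receiver is not an Account")
  return state.balance
end

local a = new_account(100)
assert(a:balance() == 100)

-- A cached method applied to an object with its own __index:
local cached = a.balance
local impostor = setmetatable({}, {
  __index = function(_, k) return "leak:" .. tostring(k) end,
})
assert(not pcall(cached, impostor))   -- rejected by the assertion
```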

> There should be discussion about whether this should first look at rawget(
> t, k ) before turning to the metamethods. I chose not to here, but that was
> essentially an arbitrary choice.

I do find it handy to be able to override an instance's method, sometimes only
temporarily, e.g. for getting a button's x-coordinate:

  function my_button_instance:GetX ()
     -- Center the button horizontally
     return (GetScreenWidth() - self:GetW()) / 2
  end

versus the GetX() belonging to the class itself.

Of course, while this use case favors doing the lookup first, it's probably
not a very important one, as I doubt it's much more difficult to add the
handling into the mcall() itself.

Re: rethinking method calls with __mcall metamethod rather than __index/__call

David Manura
In reply to this post by David Manura
On Sat, Jun 13, 2009 at 7:09 PM, David Manura wrote:
> However, the proxy example, in which __mcall will be a function,
> involves a temporary closure again, and we prefer to be able to omit
> the "o" value above too:

Even more so, consider the case in [5] where the proxy forwards its
messages to multiple destinations.  Under the original __mcall
semantics, we would write simply

  function mt.__mcall(self, k, ...)
    print('DEBUG-TRACE', self, k, ...)
    for i,v in ipairs(self) do self[i] = v:[k](...) end
    return self
  end

[5] http://lua-users.org/lists/lua-l/2009-05/msg00479.html

Re: rethinking method calls with __mcall metamethod rather than __index/__call

David Manura
In reply to this post by Steven Johnson
On Sat, Jun 13, 2009 at 7:26 PM, Steven Johnson wrote:
>> There should be discussion about whether this should first look at
>> rawget( t, k ) before turning to the metamethods.
> I do find it handy to be able to override an instance's method, sometimes only
> temporarily, ... Of course, while this use case favors doing the lookup first, it's
> probably not a very important one, as I doubt it's much more difficult to add
> the handling into the mcall() itself.

In the set ADT example above, storing "union" in the set would break
the union method.  rawget(o, 'k') would determine behavior of both o.k
and o:k.  The typical solution would then be for the object to instead
store its private data in another table proxied through __index.

To bring up the method-operator analogy again--note also that we
normally lack a handy way to override an individual instance's
operators, though you could monkey patch the getmetatable(obj).__mcall
table (affects all objects that use that metatable).

BTW, even with the __mcall proposal, we remain unable to define
"properties" on the set ADT above, since, for example, set.empty and
set['empty'] are equivalent, though we might choose to ignore that.

Re: rethinking method calls with __mcall metamethod rather than __index/__call

Steven Johnson
> In the set ADT example above, storing "union" in the set would break
> the union method.  rawget(o, 'k') would determine behavior of both o.k
> and o:k.  The typical solution would then be for the object to instead
> store its private data in another table proxied through __index.

Ah, there is that. And wherever there have been public keys I have turned
toward just that sort of solution. :) And as with Penlight, another means
than __index for access.

Peaceful coexistence would be handy indeed.

> To bring up the method-operator analogy again--note also that we
> normally lack a handy way to override an individual instance's
> operators, though you could monkey patch the getmetatable(obj).__mcall
> table (affects all objects that use that metatable).

__mcall as a table might be messy, but with the function version you of
course get passed the object. Then overrides can be held in per-method
tables with objects as weak keys, and some means provided to set and
clear them.
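
Roughly what I have in mind, as a sketch only: `send` is a hypothetical stand-in for the proposed dispatch, and `set_override` is a made-up name for the "means to set and clear" an override.

```lua
-- Sketch: function-valued __mcall that consults per-method override
-- tables keyed weakly by object before the class methods.
local methods = {}
function methods:GetX() return 0 end

local overrides = {}   -- overrides[name] = weak-keyed table: obj -> function

local function set_override(obj, name, f)
  local per_method = overrides[name]
  if not per_method then
    per_method = setmetatable({}, { __mode = "k" })
    overrides[name] = per_method
  end
  per_method[obj] = f                 -- pass nil to clear the override
end

local mt = {}
function mt.__mcall(self, k, ...)
  local per_method = overrides[k]
  local f = (per_method and per_method[self]) or methods[k]
  return f(self, ...)
end

-- Hypothetical stand-in for obj:name(...) under the proposal.
local function send(obj, name, ...)
  return getmetatable(obj).__mcall(obj, name, ...)
end

local button = setmetatable({}, mt)
assert(send(button, "GetX") == 0)
set_override(button, "GetX", function(self) return 42 end)
assert(send(button, "GetX") == 42)
set_override(button, "GetX", nil)
assert(send(button, "GetX") == 0)
```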

Hmm, so changing course from my last post...

In my own case anyway, overrides are few and far enough between that this
wouldn't be so inconvenient. It then becomes possible to limit which ones are
even permitted; this would certainly give me some peace of mind about using
any old method call inside another public function or method. And some other
useful assumptions suddenly become valid.

Re: rethinking method calls with __mcall metamethod rather than __index/__call

Mark Hamburg
In reply to this post by David Manura
The reason I went for making mcall return the function and its first  
parameter had to do with the underlying behavior of the Lua virtual  
machine and with the needs for implementing method curries. If we  
insist that the SELF opcode always generates the object to call and  
the first parameter to the call, then everything after that is well  
positioned for reasoning about the state of the stack for a call or  
for building a method curry. The __mcall metatable entry is basically  
a way to override this behavior. We could say that it only gets to  
return the item to call, but that seems overly restrictive.

Mark


Re: rethinking method calls with __mcall metamethod rather than __index/__call

Edgar Toernig
In reply to this post by David Manura
David Manura wrote:
>[...]
> As seen, if a set "s" has a method "union", this implies that
> s['union'] is true.  For such reason, Penlight [1] avoids using the
> index operator for membership tests in its set and map ADTs.  Also, if
> we use the common technique of storing methods in the metatable
> ("set_mt.__index = set_mt" above), then s['__index'] is not nil
> either, a subtle potential bug or security hole.

And that's the reason why I changed the colon to be an operator
like '.' which looks in the object's metatable (and may trigger
__index/__newindex calls in the metatable's metatable) instead
of the object itself.

A drastic change but I'm happy with it ...

Ciao, ET.

Re: rethinking method calls with __mcall metamethod rather than __index/__call

steve donovan
In reply to this post by David Manura
On Sat, Jun 13, 2009 at 10:25 PM, David Manura wrote:
> As seen, if a set "s" has a method "union", this implies that
> s['union'] is true.  For such reason, Penlight [1] avoids using the
> index operator for membership tests in its set and map ADTs.

Yes, but I had been cheerfully assuming that m[key] would return the
goods reliably - this is indeed a Big Problem.  It's very natural that
set or map access is through [], but until we can control that
operation more fully, false matches are going to happen, in a nasty
silent way.  And hence explicit set() & get() methods.

People from a C++ background assume that [] is overrideable, but it
ain't an operator!

steve d.

Re: rethinking method calls with __mcall metamethod rather than __index/__call

Daniel Silverstone
In reply to this post by David Manura
On Sat, 2009-06-13 at 16:25 -0400, David Manura wrote:
> An alternative, proposed for consideration, is to provide a new
> __mcall metamethod for method calls (a.k.a. message passing).  If
> __mcall is not provided, Lua would revert to the old behavior of
> consulting __index and __call instead.  The proxy example above would
> be reimplemented more cleanly as follows:

Aranha[1] uses __methindex to provide as closely compatible behaviour as
possible, but still offer separation of methods and members.

__mcall rather than __methindex would offer a minor speedup, but I
wouldn't have thought it would be much. How measurable is it?

Certainly I think it'd be nice to see the core Lua distribution make the
distinction between foo:bar() and foo.bar() at the metamethod level.

D.

--
Daniel Silverstone                         http://www.digital-scurf.org/
PGP mail accepted and encouraged.            Key Id: 2BC8 4016 2068 7895



Re: rethinking method calls with __mcall metamethod rather than __index/__call

Daniel Silverstone
On Fri, 2009-06-19 at 12:15 +0100, Daniel Silverstone wrote:
> Aranha[1]

[1] http://www.digital-scurf.org/software/aranha

Sorry, I forgot the footnote before :-)

D.




Re: rethinking method calls with __mcall metamethod rather than __index/__call

David Manura
On Fri, Jun 19, 2009 at 7:15 AM, Daniel Silverstone wrote:
> Aranha[1] uses __methindex to provide as closely compatible behaviour as
> possible, but still offer separation of methods and members.

I see this was discussed before [9-10], which brings up some useful
points such as the possible need for a pcallmethod.

[9] http://lua-users.org/lists/lua-l/2004-11/msg00045.html
[10] http://lua-users.org/lists/lua-l/2004-11/msg00057.html

On Sun, Jun 14, 2009 at 1:31 AM, Mark Hamburg wrote:
> The __mcall metatable entry is basically a way to override
> this behavior. We could say that it only gets to return the
> item to call, but that seems overly restrictive.

When would returning a different first argument be useful?