Custom extensions to Lua

Lisa Parratt
Hi,

I've been developing a number of libraries for Lua (currently on 5.1w6) for the company I work for (this unfortunately means I can't go into too much detail about them or their uses).

My current main project is a glue code system for allowing Lua scripts to access C variables and functions. Rather than taking the approach of tolua, etc., and directly binding these elements into the Lua namespace, it instead takes a more dynamic, object-oriented approach.

Proxy objects (tables at their core) intercept attempts to get and set values, look up indices, look up structure/union members and call functions. These appear to the user to be native Lua data types, but instead apply operations to the underlying C equivalent. These proxies are generated on the fly through the use of the __index metamethod.

Unfortunately, I've had to add some metamethods to give them the truly native feel:

__type: Overrides the return value of the Lua function type().
__set: Overrides an attempt to set the value of a Lua value. This is used to make assignment set the value of the underlying C value, rather than replacing the proxy object with the new value. See below for some caveats.
__tonumber: Overrides an attempt to convert a Lua value to a number. This is invoked by the C function lua_tonumber(), among other methods. This, along with some modifications to the VM arithmetic operations, allows proxies to be used in mathematical operations as though they were native Lua numbers.
__tostring: Overrides an attempt to convert a Lua value to a string. This is invoked by the C function lua_tostring(), among other methods. This allows proxies to be used as though they were native Lua strings.
__ueq: Untype checked equals comparison. The normal comparison metamethods do not allow mixed types, preventing numbers from being compared against proxies. This metamethod is invoked when an attempt is made to compare different types.
__ult: Untype checked less than comparison.
__ule: Untype checked less than or equals comparison.
__not: Overrides the not operator. A proxy is really a table, meaning that "not proxy" will always return false. This allows the result to be dependent on the underlying value of the proxy.

Can anybody suggest any ways of providing similar functionality that appears seamless to the end user without having to make these modifications?

There are a couple of troubling issues:

Occasionally, intermittent errors such as "attempt to compare number with boolean" occur. I'm currently putting these down to subtle stack corruption issues, but they squirm away from beneath me when I try to instrument my code. Does anyone have any hints for debugging such issues?

Locals and the __set metamethod - essentially, for every VM cycle, the interpreter has to check whether RA represents a local. If it does, then it will check if the __set metamethod needs to be used. If this check is not done, then everything breaks horribly because the VM attempts to reuse a register and mistakenly triggers the __set metamethod. Currently, I'm walking through the call info stack to determine which registers are locals and which aren't at the start of each cycle, but this is horribly inefficient. I've tried adding a second stack of flags which indicates which registers are locals and which aren't, and this is maintained when the stack is resized, when a Lua function is called, when a Lua function returns, and when a Lua function tail returns. However, this doesn't work - the flags do not mirror the results of the call info stack walk. I suspect this may be related to upvalues and other similar issues. Can anybody shed any light on this issue?

Personally, I don't like having had to make these changes - they make upgrading to new versions of Lua more difficult and slow Lua down - but needs must.

Any assistance people can provide to help me remove my custom metamethods and restore the performance and reliability of the VM would be appreciated. If you need any clarifications, don't hesitate to ask; I'm well aware that I can blabber on a bit :)

Cheers!
--
Lisa
http://www.thecommune.org.uk/~lisa/

Re: Custom extensions to Lua

Uli Kusterer
On Aug 10, 2005, at 17:11:26, Lisa Parratt wrote:
> Occasionally, intermittent errors such as "attempt to compare number with boolean" occur. I'm currently putting these down to subtle stack corruption issues, but they squirm away from beneath me when I try to instrument my code. Does anyone have any hints for debugging such issues?

I found the example C code in the Lua book that shows how to dump the stack very helpful in debugging my missing pop()s.

Cheers,
-- M. Uli Kusterer
http://www.zathras.de



Re: Custom extensions to Lua

Rici Lake-2
In reply to this post by Lisa Parratt
Lisa,

I think what you are trying to do is semantically incoherent, so it is not surprising that it is difficult. It's semantically incoherent because it confuses values with references to values ("boxes").

A local in Lua is just like a locally declared variable in C. Consider the following, which I think is a mirror of what you're trying to do:

typedef struct IntList {
  struct IntList *next;
  int val;
} IntList;

IntList *set_and_advance (IntList *ints, int newval) {
  int i = ints->val;
  i = newval;
  return ints->next;
}

Obviously, this isn't going to work :) The local variable 'i' is different from ints->val and the assignment does not make it the same. If I wanted, for whatever reason, to use that style, I would have to make 'i' a pointer and explicitly dereference it:

IntList *set_and_advance (IntList *ints, int newval) {
  int *i = &ints->val;
  *i = newval;
  return ints->next;
}

Lua, however, does not have a pointer type. So there is no obvious way of translating this into Lua. On the other hand, it is not a very common C idiom either.
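
The difference between the first two versions of set_and_advance can be checked directly. The following is a minimal sketch written as C++; the two function names and the demo function are illustrative and do not come from the post itself:

```cpp
#include <cassert>

struct IntList {
    IntList *next;
    int val;
};

// First version: the value is copied into a local, so assigning to the
// local leaves the list node unchanged.
IntList *set_copy_and_advance(IntList *ints, int newval) {
    int i = ints->val;
    i = newval;   // changes only the local copy
    (void)i;      // silence "set but not used" warnings
    return ints->next;
}

// Second version: the local is a pointer to the member, so assigning
// through it changes the list node.
IntList *set_through_pointer_and_advance(IntList *ints, int newval) {
    int *i = &ints->val;
    *i = newval;
    return ints->next;
}

void demo_copy_versus_pointer() {
    IntList tail = {nullptr, 2};
    IntList head = {&tail, 1};

    IntList *after_copy = set_copy_and_advance(&head, 99);
    assert(after_copy == &tail);
    assert(head.val == 1);   // the node was not modified

    IntList *after_pointer = set_through_pointer_and_advance(&head, 99);
    assert(after_pointer == &tail);
    assert(head.val == 99);  // the node was modified through the pointer
}
```

Calling demo_copy_versus_pointer() from a main function runs both checks.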

C++, of course, provides the semantically odd reference type. Perhaps that's what you were thinking of:

IntList *set_and_advance (IntList *ints, int newval) {
  int& i = &ints->val;
  i = newval;
  return ints->next;
}

References are odd beasts, though, since they are a weird mixture of runtime data and compiler sugaring. For example, you cannot assign one reference to another reference:

int& i = &ints->val;
int& j = i;  /* ILLEGAL */
int& j = &i; /* Tell the compiler *not* to dereference i */

contrast with:

int* i = &ints->val;
int* j = i;  /* Correct */
int* j = &i; /* Wrong-o */

Of course, the natural way of writing this function (whether in C or in Lua) avoids all this mucking about with references and pointers:

IntList *set_and_advance (IntList *ints, int newval) {
  ints->val = newval;
  return ints->next;
}

So what's actually going on in this version of set_and_advance? A deceptively simple question.

I would say, the structure 'ints' is being told to alter its 'val' field. But it seems that a lot of programmers, particularly those who program in C++, have the intuition that the '=' operator is being directed at the ints->val object itself. And, indeed, C++ allows '=' to be overridden, and not '->val='.

That's not really the full story, though. In C++ the '=' message is not being sent to the value of ints->val (that is, it is not being interpreted by the value 42, say). Rather, it is being sent to the (invisible) box containing the value 42. And that box is part of 'ints'. So what is actually going on is that 'ints' is being asked to provide the box containing the value, and then the '=' message is being sent to that box.

The fact that 'ints' is not otherwise consulted in the transaction is often a weakness. For example, it precludes 'ints' from applying any sort of data-coherency checks. Consequently, it is quite common to use a member function to change structure data members, and many people would say that allowing direct access to a data member is bad C++ style. This leads to code which looks like:

  aList->setval(aList->getval() + 1);

rather than the arguably more readable:

  ++aList->val;

It's all what you're used to, I suppose.

In any event, in Lua table-assignment is semantically a message to the table (and that includes assignment to global variables, since global variables are just syntactic sugar for operations on the environment table). That's rather different from local assignment, although you might think of local assignment as a message to the Lua stack, after a compile-time translation from local name to stack index.

So, to get back to your glue code.

It's quite straightforward to create a proxy object which represents a C structure (or C++ object); the proxy object can interpret get and set messages as it sees fit. One simple way of doing this is to create two tables of getter and setter methods for each class, and have the __index/__newindex metamethods look up the key in the associated getter/setter table for the class. In this model, the getter method for an integer member would probably convert the appropriate C numeric type into a Lua number, and the setter would attempt to convert the Lua number back into the C numeric type. (Note that round-tripping an int through a double does not lose data; only 64-bit integer types -- or single-precision floating point -- cause problems.)
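
To make the getter/setter-table idea concrete without depending on the Lua C API, here is a rough C++ sketch. Everything in it is hypothetical: Point stands in for the C structure, PointProxy for the proxy object, and its get and set members play the roles that __index and __newindex would play in a real binding. Values cross the boundary as doubles, mirroring stock Lua numbers.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// The C structure being exposed to the script side (hypothetical).
struct Point {
    int x;
    int y;
};

using Getter = std::function<double(const Point &)>;
using Setter = std::function<void(Point &, double)>;

// Hypothetical proxy: one getter table and one setter table per "class".
class PointProxy {
public:
    explicit PointProxy(Point &target) : target_(target) {}

    // Role of __index: look the key up in the getter table.
    double get(const std::string &key) const {
        auto it = getters().find(key);
        if (it == getters().end()) throw std::out_of_range("no such member: " + key);
        return it->second(target_);
    }

    // Role of __newindex: look the key up in the setter table.
    void set(const std::string &key, double value) {
        auto it = setters().find(key);
        if (it == setters().end()) throw std::out_of_range("no such member: " + key);
        it->second(target_, value);
    }

private:
    static const std::map<std::string, Getter> &getters() {
        static const std::map<std::string, Getter> table = {
            {"x", [](const Point &p) { return static_cast<double>(p.x); }},
            {"y", [](const Point &p) { return static_cast<double>(p.y); }},
        };
        return table;
    }

    static const std::map<std::string, Setter> &setters() {
        static const std::map<std::string, Setter> table = {
            {"x", [](Point &p, double v) { p.x = static_cast<int>(v); }},
            {"y", [](Point &p, double v) { p.y = static_cast<int>(v); }},
        };
        return table;
    }

    Point &target_;
};

void demo_point_proxy() {
    Point p = {1, 2};
    PointProxy proxy(p);

    proxy.set("x", 42.0);                  // writes through to p.x
    proxy.set("y", proxy.get("y") + 1.0);  // read-modify-write on p.y
    assert(p.x == 42);
    assert(p.y == 3);

    bool rejected = false;
    try {
        proxy.get("z");                    // unknown member
    } catch (const std::out_of_range &) {
        rejected = true;
    }
    assert(rejected);
}
```

The design point is that the proxy, not the value, receives the get and set messages, so it is free to convert, validate, or reject them.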

Since the environment table is just a table, a similar technique can be used to dynamically add "global bindings" to C objects; an example can be found at <http://lua-users.org/wiki/BoundScalarGlobalsOne>. That example does not export functions to dynamically add mappings, but it should be clear how to do that. (It's also written in Lua, rather than C, but I hope it's clear how it could be written in C, as well.) If you do dynamically map globals, you have to think through what the expected behaviour would be if the global is already in use, but otherwise the implementation is pretty simple; also, that example tries to not override existing metamethods implemented on the environment table, but that is probably an unnecessary complication.

Finally, it is fairly simple to patch Lua to provide for a sort of reference/pointer ("boxed") datatype, but the fact that the Lua type system operates entirely at runtime makes it tricky to do without an explicit dereference operator, similar to the C '*' operator. It depends on defining a 'mutate box' primitive, which might be written ':=' or '<-'; the semantics of 'foo <- val' would be fairly similar to the C expression '*foo = val'. A small patch to implement the mutate operator can be found on the <http://lua-users.org/wiki/LuaPowerPatches> page.

Hope that was at least interesting,

R.




Re: Custom extensions to Lua

David Given
On Wednesday 10 August 2005 18:52, Rici Lake wrote:
[...]
> C++, of course, provides the semantically odd reference type. Perhaps
> that's what you were thinking of:
[...]
> int& i = &ints->val;
> int& j = i;  /* ILLEGAL */
> int& j = &i; /* Tell the compiler *not* to dereference i */

Actually, you're not quite right there --- lines 1 and 3 are invalid, line 2 
is valid. (I think that makes you exactly wrong!)

A reference can be initialised exactly once from a variable of the same type:

int& i = j; /* i becomes an alias for j */

It can then be assigned to, but the assignment is actually performed on the 
target of the reference:

i = k; /* j's value is modified to be that of k's value */

You can't change the target of a reference once it's been initialised. (And 
very annoying it is, too.)
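
The rules described here can be verified with a few assertions. This is a small sketch, with a made-up function name, that only uses standard C++ reference semantics:

```cpp
#include <cassert>

// Shows that a reference is bound once, at initialisation, and that any
// later assignment writes to the object it is bound to.
void demo_reference_binding() {
    int j = 1;
    int k = 2;

    int &i = j;        // initialisation: i becomes an alias for j
    assert(&i == &j);  // taking the address of i yields the address of j

    i = k;             // assignment: copies k's value into j
    assert(j == 2);
    assert(&i == &j);  // i is still bound to j, not rebound to k

    k = 3;             // changing k afterwards affects neither i nor j
    assert(i == 2);

    int &r = i;        // initialising one reference from another is valid:
    r = 7;             // r is simply one more alias for j
    assert(j == 7);
}
```

The last three lines correspond to "line 2" of the quoted snippet, the one that is valid.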

A better C++ analogy is as follows:

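  class Proxy {
  public:
    int* target;

    /* assignment writes through to the target */
    Proxy& operator = (int value) { *target = value; return *this; }
    /* conversion reads the target's current value */
    operator int () const { return *target; }
  };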

  int fnord;
  Proxy fnordptr;
  fnordptr.target = &fnord;
  fnordptr = 42; /* sets fnord's value */

This is phenomenally useful if you want to, say, add locks around accesses to 
something, particularly when combined with generics. Take a look at C++'s 
smart pointers some time. This allows you to do things like:

  int value;
  locked<int> value_l = &value; /* initialisation */

  value_l = 1; /* lock, assign, unlock */
  printf("%d\n", value_l); /* lock, copy, unlock, return copy */

Very neat.
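
For readers who want to see what such a wrapper might look like, here is one possible sketch of a locked<T> template built on std::mutex. It is an assumption about the shape of the idea, not the poster's actual code. Two deliberate differences from the snippet above: the constructor is explicit, so initialisation is written locked<int> value_l(&value), and the value is converted explicitly before use, because passing a class object straight through printf's variable argument list does not trigger the conversion operator.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper: every read and write of the target takes a lock.
template <typename T>
class locked {
public:
    explicit locked(T *target) : target_(target) {}

    // Assignment: lock, write through to the target, unlock.
    locked &operator=(const T &value) {
        std::lock_guard<std::mutex> guard(mutex_);
        *target_ = value;
        return *this;
    }

    // Conversion: lock, copy the value, unlock, return the copy.
    operator T() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return *target_;
    }

private:
    T *target_;
    mutable std::mutex mutex_;  // mutable so that reads can lock too
};

void demo_locked() {
    int value = 0;
    locked<int> value_l(&value);  // initialisation

    value_l = 1;                  // lock, assign, unlock
    assert(value == 1);

    int copy = value_l;           // lock, copy, unlock, return copy
    assert(copy == 1);
}
```

This sketch only serialises individual reads and writes made through the wrapper; a read followed by a write is still two separate locked operations.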

(Why, yes, I *have* been spending far too much time recently grovelling around 
in C++'s inner workings. I have grown to dislike it immensely, too. While it 
has some nice features, they're implemented so badly as to make them largely 
useless.)

[...]
>    aList->setval(aList->getval() + 1);
>
> rather than the arguably more readable:
>
>    ++aList->val;

You're not going to like me for this, but this is possible! It's a pain in the 
arse, though, and is usually not worth the effort. The key is to make val a 
smart object which has operator++ overloaded.
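
A minimal sketch of what "a smart object which has operator++ overloaded" could mean in practice. CountedInt and Node are invented names; the extra work done on each increment is just a counter here, standing in for whatever checks the owner wants:

```cpp
#include <cassert>

// Hypothetical "smart" integer member: pre-increment is overloaded, so
// ++node.val can run extra code in addition to changing the value.
class CountedInt {
public:
    explicit CountedInt(int initial) : value_(initial), writes_(0) {}

    CountedInt &operator++() {
        ++value_;
        ++writes_;  // stand-in for validation, logging, locking, ...
        return *this;
    }

    int get() const { return value_; }
    int writes() const { return writes_; }

private:
    int value_;
    int writes_;
};

struct Node {
    CountedInt val;
    Node *next;
};

void demo_counted_member() {
    Node node = {CountedInt(41), nullptr};
    Node *aList = &node;

    ++aList->val;  // reads like a plain member increment
    assert(aList->val.get() == 42);
    assert(aList->val.writes() == 1);
}
```

Note that the member sees the increment but the enclosing Node does not, which is the limitation raised in the reply to this message.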

[...]
> It depends
> on defining a 'mutate box' primitive, which might be written ':=' or
> '<-'; the semantics of 'foo <- val' would be fairly similar to the C
> expression '*foo = val'.

*nods, although I'd write that last 'foo->assign(val)'*

C++ gets away with this because it has two types of thing; scalars and 
objects. (And references, but they're not relevant.) The = operator is 
'assign' for scalars but 'mutate' for objects. You can overload objects but 
not scalars. So this:

  Object foo;
  foo = 0;

...is a fundamentally *different* operation to this:

  Object* foo;
  foo = 0;

...despite syntactic similarities.
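
The distinction can be made visible by counting calls to an overloaded operator. In this sketch Object is a made-up class; the only assumption is that it overloads '=' for int:

```cpp
#include <cassert>

// Hypothetical class whose '=' is overloaded for int.
struct Object {
    int payload = -1;
    int assign_calls = 0;

    Object &operator=(int value) {
        payload = value;
        ++assign_calls;
        return *this;
    }
};

void demo_assign_versus_mutate() {
    Object foo;
    foo = 0;  // calls Object::operator=(int): the object is mutated
    assert(foo.payload == 0);
    assert(foo.assign_calls == 1);

    Object other;
    Object *ptr = &other;
    ptr = 0;  // plain pointer assignment: ptr now holds a null pointer
    assert(ptr == nullptr);
    assert(other.assign_calls == 0);  // the object was never consulted
    assert(other.payload == -1);
}
```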

In Lua, everything is an object (or pretends to be one), but like Smalltalk 
and Java it has no 'mutate' operator. While one would be very useful, I do 
find myself dubious about whether it's possible to implement it and 
still be able to call the result Lua!

> > http://www.thecommune.org.uk/~lisa/

Did you use to call yourself 'Communa'? Because if so, I think we've met...

-- 
"Curses! Foiled by the chilled dairy treats of righteousness!" --- Earthworm 
Jim (evil)

Re: Custom extensions to Lua

Rici Lake-2

On 10-Aug-05, at 2:06 PM, David Given wrote:

> On Wednesday 10 August 2005 18:52, Rici Lake wrote:
> [...]
> > C++, of course, provides the semantically odd reference type. Perhaps
> > that's what you were thinking of:
> [...]
> > int& i = &ints->val;
> > int& j = i;  /* ILLEGAL */
> > int& j = &i; /* Tell the compiler *not* to dereference i */
>
> Actually, you're not quite right there --- lines 1 and 3 are invalid, line 2
> is valid. (I think that makes you exactly wrong!)
>
> A reference can be initialised exactly once from a variable of the same type:
>
> int& i = j; /* i becomes an alias for j */
>
> It can then be assigned to, but the assignment is actually performed on the
> target of the reference:
>
> i = k; /* j's value is modified to be that of k's value */
>
> You can't change the target of a reference once it's been initialised. (And
> very annoying it is, too.)

I stand corrected. This is why I hardly ever use C++... I should have looked that all up before I blathered. My confusion comes from one of the weirdnesses of references: the apparent similarity of the definition and assignment syntax hides an implicit "enreference" and "dereference" operator, respectively. (And these 'operators' are inserted by the compiler.)

But the point stands: a reference is a very weird beast.

<useful example snipped>

> This is phenomenally useful if you want to, say, add locks around accesses to
> something, particularly when combined with generics. Take a look at C++'s
> smart pointers some time. This allows you to do things like:
>
>   int value;
>   locked<int> value_l = &value; /* initialisation */
>
>   value_l = 1; /* lock, assign, unlock */
>   printf("%d\n", value_l); /* lock, copy, unlock, return copy */
>
> Very neat.

Well, beauty is in the eye of the beholder :) Many people might prefer something like:

  synchronized int value;

although that doesn't give you the option of forgetting to use the locked version :)

> >    aList->setval(aList->getval() + 1);
> >
> > rather than the arguably more readable:
> >
> >    ++aList->val;
>
> You're not going to like me for this, but this is possible! It's a pain in the
> arse, though, and is usually not worth the effort. The key is to make val a
> smart object which has operator++ overloaded.

I don't see how that allows 'aList' to maintain data coherency when its 'val' member is changed, unless the 'smart object' is so smart that it knows about its container. That is, of course, possible but it's not easy to see how to generalise it.

Suppose, for example, that aList is attempting to keep itself sorted. Had the implicit mutate ("assignment") operation been directed at aList in the first place, the implementation would be quite a bit clearer.
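
As a rough illustration of directing the change at the container, here is a C++ sketch of a list that keeps itself sorted. SortedInts and its member names are invented for this example; the point is only that because every change goes through the container, the container can restore its invariant:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container that keeps its values in ascending order.
class SortedInts {
public:
    void insert(int value) {
        auto pos = std::upper_bound(values_.begin(), values_.end(), value);
        values_.insert(pos, value);
    }

    // The "mutate" message is sent to the container, not to the element,
    // so the container can re-sort as part of handling it.
    void set(std::size_t index, int value) {
        if (index >= values_.size()) throw std::out_of_range("bad index");
        values_.erase(values_.begin() + static_cast<std::ptrdiff_t>(index));
        insert(value);
    }

    int at(std::size_t index) const { return values_.at(index); }

    bool is_sorted() const {
        return std::is_sorted(values_.begin(), values_.end());
    }

private:
    std::vector<int> values_;
};

void demo_sorted_container() {
    SortedInts list;
    list.insert(10);
    list.insert(20);
    list.insert(30);

    list.set(0, 25);  // 10 becomes 25, and the container re-sorts itself
    assert(list.is_sorted());
    assert(list.at(0) == 20);
    assert(list.at(1) == 25);
    assert(list.at(2) == 30);
}
```

With a bare data member and an overloaded operator on the element, the element would need some way to reach back into the list to achieve the same effect.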


> [...]
> > It depends
> > on defining a 'mutate box' primitive, which might be written ':=' or
> > '<-'; the semantics of 'foo <- val' would be fairly similar to the C
> > expression '*foo = val'.
>
> *nods, although I'd write that last 'foo->assign(val)'*

No doubt. As I said, beauty is in the eye of the beholder :)

> C++ gets away with this because it has two types of thing; scalars and
> objects. (And references, but they're not relevant.) The = operator is
> 'assign' for scalars but 'mutate' for objects. You can overload objects but
> not scalars. So this:
>
>   Object foo;
>   foo = 0;
>
> ...is a fundamentally *different* operation to this:
>
>   Object* foo;
>   foo = 0;
>
> ...despite syntactic similarities.

Sure. Furthermore, assignment and initialization are different, just to be more confusing.

Whether an object and a "scalar" are different beasts is, I guess, a philosophical question. Certainly you can override operations on "objects" which cannot be overridden on "scalars" but one might think of that as simply a restriction in the initial type environment. C++ does not, for example, guarantee that pointer assignment works like one might naively think it does; the compiler itself is allowed to insert code to, for example, maintain array boundaries in a pointer for validation purposes, in which case, it is at least conceivable that you might get a run-time exception from:

  Object *foo = 0;
  foo++;

> In Lua, everything is an object (or pretends to be one),

That depends on your definition of "object". I wouldn't have said that myself (rather that in Lua every value has a definite type), but terminology is a minefield. What is clear is that Lua, like Smalltalk (at least), does not provide implicit boxes around values which can receive messages. You can only talk to the value itself.

> but like Smalltalk and Java it has no 'mutate' operator. While one would be
> very useful, I do find myself dubious about whether it's possible to
> implement it and still be able to call the result Lua!

The simple answer is: "no". If you patch the Lua core, the best you can say is that you've created a new language based on the Lua code. (At least, I think the license lets you say that.)

The more interesting answer is that the patch to create an explicit "box" type is not very big.

In fact, you could do it without patching the Lua core at all, using only a preprocessor ( :) ), by representing a box as a table with one distinguished key. The transformation would turn

  a <- *a + 7

into

  a.value = a.value + 7

or some variation on that theme. Mind you, you couldn't call the preprocessor input Lua either, but you could certainly call it a "language which can be compiled into Lua".
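
The effect of representing a box as an object with one distinguished field can be sketched in C++ as well, using a shared pointer so that two names can refer to the same box. Box and the demo function are illustrative names only:

```cpp
#include <cassert>
#include <memory>

// A box: an object with one distinguished field holding the value.
struct Box {
    int value;
};

void demo_box() {
    std::shared_ptr<Box> a = std::make_shared<Box>(Box{35});
    std::shared_ptr<Box> alias = a;  // a second name for the same box

    a->value = a->value + 7;         // what "a <- *a + 7" would expand to
    assert(alias->value == 42);      // the change is visible via both names

    alias = std::make_shared<Box>(Box{0});  // plain assignment rebinds a name
    assert(a->value == 42);                 // and leaves the old box alone
    assert(alias->value == 0);
}
```

Mutating the box and rebinding the name are two different operations here, which is the distinction the '<-' operator is meant to express.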


Re: Custom extensions to Lua

Rici Lake-2

On 10-Aug-05, at 2:57 PM, Rici Lake wrote:

> In fact, you could do it without patching the Lua core at all, using only a preprocessor ( :) ), by representing a box as a table with one distinguished key. The transformation would turn
>
>   a <- *a + 7
>
> into
>
>   a.value = a.value + 7
>
> or some variation on that theme. Mind you, you couldn't call the preprocessor input Lua either, but you could certainly call it a "language which can be compiled into Lua".

Just in case it amuses someone other than me:

do
  local function donothing() end
  local function show(_, k, v)
    if type(v) == "string" then
      v = string.format("%q", v)
    else
      v = tostring(v)
    end
    print(string.format("Global `%s' is being changed to %s",
                        tostring(k), v))
  end

  function ref(tab, key, preset, postset)
    preset, postset = preset or donothing, postset or donothing
    return setmetatable({}, {
      __index = function() return tab[key] end,
      __newindex = function(_, _, new)
                     preset(tab, key, new)
                     tab[key] = new
                     postset(tab, key, new)
                   end
    })
  end

  -- reference global by name, assuming we have the
  -- same environment table as our caller
  function gref(name, preset, postset)
    return ref(getfenv(), name, preset, postset)
  end

  function traceref(name)
    return gref(name, show)
  end
end

> a = "Hello"
> b = gref "a"
> c = traceref "a"
> b.value = b.value .. ", world"
> print(a)
Hello, world
> c.value = string.gsub(c.value, ", world", "")
Global `a' is being changed to "Hello"
> print(a)
Hello


Re: Custom extensions to Lua

Glenn Maynard
In reply to this post by Rici Lake-2
On Wed, Aug 10, 2005 at 02:57:58PM -0500, Rici Lake wrote:
> But the point stands: a reference is a very weird beast.

It's weird to you because you're unfamiliar with them.  I suppose that's just
the nature of "weird", though.  :)  In practice, it's an implementation detail
that you don't have to worry about--a reference doesn't always create a
pointer (more specifically, it can be optimized away); it's the reference
semantics that are important.

> >  int value;
> >  locked<int> value_l = &value; /* initialisation */
> >
> >  value_l = 1; /* lock, assign, unlock */
> >  printf("%d\n", value_l); /* lock, copy, unlock, return copy */
> 
> Well, beauty is in the eye of the beholder :) Many people might prefer 
> something like:
> 
>   synchronized int value;
> 
> although that doesn't give you the option of forgetting to use the 
> locked version :)

This requires specific support from the language itself to implement, and
means the compiler itself has to understand locking, compared to the generic
template solution.

You can't forget anything if you use a template type that doesn't initialize
from a pointer:

  locked<int> value = 1;
  value = 2;

> > While one would be very useful, I do find myself dubious about whether
> > it's possible to implement it and still be able to call the result Lua!
> 
> The simple answer is: "no". If you patch the Lua core, the best you can 
> say is that you've created a new language based on the Lua code. (At 
> least, I think the license lets you say that.)

The license doesn't require renaming on modification.  I don't think that's
a very good restriction, especially given that modifying Lua to suit the
particular needs of a project is actively encouraged.  It wouldn't be
very nice to publicly fork Lua and not rename it (confusing), but there's
nothing wrong with a project using a heavily modified core and still saying
"we use Lua".

-- 
Glenn Maynard


Re: Custom extensions to Lua

Rici Lake-2

On 10-Aug-05, at 3:50 PM, Glenn Maynard wrote:
> > Well, beauty is in the eye of the beholder :) Many people might prefer
> > something like:
> >
> >   synchronized int value;
> >
> > although that doesn't give you the option of forgetting to use the
> > locked version :)
>
> This requires specific support from the language itself to implement, and
> means the compiler itself has to understand locking, compared to the generic
> template solution.


Yes, that's true. On the other hand, it means that a compiler aware of concurrency has to do quite a bit of difficult analysis in order to optimise locking, which would have been easier had the declaration been explicit.

You could say the same thing about exceptions, really. Why should the compiler provide specific support when it could be achieved in the standard library (with setjmp and friends)?

The question of which features should be included in a language and which ones should be relegated to library implementation is not easy, and every language draws its own line. Clearly, there are languages which prefer the synchronization declaration, and it appears that there are programmers who are quite happy to use those languages. :)

> You can't forget anything if you use a template type that doesn't initialize
> from a pointer:
>
>   locked<int> value = 1;
>   value = 2;

Yes, again. In which case, the use of the synchronized attribute would have been precisely equivalent, aside from the issue of whether it should be in the compiler, in a standard template library, or even implemented with a preprocessor.

> The license doesn't require renaming on modification.  I don't think that's
> a very good restriction, especially given that modifying Lua to suit the
> particular needs of a project is actively encouraged.  It wouldn't be
> very nice to publicly fork Lua and not rename it (confusing), but there's
> nothing wrong with a project using a heavily modified core and still saying
> "we use Lua".

Hmm. I guess I was still thinking of the Lua 4.0 licence.


Re: Custom extensions to Lua

Glenn Maynard
On Wed, Aug 10, 2005 at 04:27:05PM -0500, Rici Lake wrote:
> You could say the same thing about exceptions, really. Why should the 
> compiler provide specific support when it could be achieved in the 
> standard library (with setjmp and friends)?

Destructors need to be called, and setjmp() can't do that.
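
The exception half of this claim is easy to demonstrate. In the sketch below (Guard, live_guards and the function names are made up), an exception thrown two frames down still runs the destructors of every local object on the way out. The setjmp half is not shown, because jumping with longjmp over frames that own objects with non-trivial destructors is undefined behaviour in C++ rather than something that can be tested safely.

```cpp
#include <cassert>
#include <stdexcept>

// Number of Guard objects currently alive.
static int live_guards = 0;

struct Guard {
    Guard() { ++live_guards; }
    ~Guard() { --live_guards; }
};

void fail_deep_inside() {
    Guard inner;  // alive until the exception leaves this frame
    throw std::runtime_error("something went wrong");
}

void demo_unwinding() {
    bool caught = false;
    try {
        Guard outer;
        fail_deep_inside();
    } catch (const std::runtime_error &) {
        caught = true;
    }
    assert(caught);
    assert(live_guards == 0);  // both destructors ran during unwinding
}
```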

(setjmp is evil, but so are C++ exceptions, at least in practice: they
bloat binaries far more than their value, in my opinion.  I tend to
stick to old-fashioned error returns.)

> >You can't forget anything if you use a template type that doesn't 
> >initialize
> >from a pointer:
> >
> >  locked<int> value = 1;
> >  value = 2;
> 
> Yes, again. In which case, the use of the synchronized attribute would 
> have been precisely equivalent, aside from the issue of whether it 
> should be in the compiler, in a standard template library, or even 
> implemented with a preprocessor.

Yep; I was responding to the suggestion that this approach is less safe
because you can "forget" things.

-- 
Glenn Maynard

Re: Custom extensions to Lua

David Given
On Wednesday 10 August 2005 23:29, Glenn Maynard wrote:
[...]
> (setjmp is evil, but so are C++ exceptions, at least in practice: they
> bloat binaries far more than their value, in my opinion.  I tend to
> stick to old-fashioned error returns.)

I have a C++ program, Spey, which is an SMTP proxy implementing greylisting. I 
wrote it specifically to be reliable, scalable, and secure. (It's very neat 
and uses coroutines. <PLUG> http://spey.sf.net </PLUG>)

It uses exceptions extensively in order to communicate error states. Yes, 
they're not cheap in terms of code size, but used properly they can *vastly* 
simplify your logic flow. Another thing I was doing was never dynamically 
allocating anything if I could possibly help it; I think I use the new 
operator once in the entire program. (Although I do use STL containers quite 
a lot.) Since C++ exceptions call destructors, this means that all my memory 
allocation problems basically go away.

Hauling the subject bodily back on topic, one of the few issues I have with 
Lua is that there's no standard for exceptions. There is no consistent 'out 
of memory' exception; some parts throw a string, some a number, some return 
an error code, some return an error string, etc. The other standard failure 
cases are all quite similar.

I do feel strongly that the use of exceptions can very much improve a program, 
and in keeping with the Lua philosophy, it would be a good idea to provide a 
standardised framework that application and library writers can use. Part of 
this framework should, I think, involve standard exception *names*. 
Otherwise, there's no way of knowing if you've caught a standard exception or 
not.
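
As a rough illustration of what standard exception names could look like in plain Lua: error() accepts any value, so a library can throw a table carrying a name field and the catcher can dispatch on it. The names used here (throw, risky, classify, "BadArgument", "OutOfMemory") are hypothetical and not part of Lua or any existing library.

```lua
-- Hypothetical helper: raise a table with a well-known name
-- instead of a bare string.
function throw(name, message)
  -- Non-string error values are passed through without a position prefix.
  error({ name = name, message = message })
end

function risky(n)
  if type(n) ~= "number" then
    throw("BadArgument", "number expected")
  end
  if n > 1000 then
    throw("OutOfMemory", "cannot allocate " .. n .. " items")
  end
  return n * 2
end

-- Catching side: pcall returns false plus the thrown value.
function classify(...)
  local ok, err = pcall(risky, ...)
  if ok then return "ok" end
  if type(err) == "table" and err.name then
    return err.name      -- a named, recognisable exception
  end
  return "unknown"       -- a string or some other ad hoc error value
end

print(classify(10))      -- ok
print(classify("x"))     -- BadArgument
print(classify(5000))    -- OutOfMemory
```

With agreed names, a catcher can tell a standard failure from an arbitrary string without parsing message text.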

-- 
"Curses! Foiled by the chilled dairy treats of righteousness!" --- Earthworm 
Jim (evil)

Re: Custom extensions to Lua

Glenn Maynard
On Thu, Aug 11, 2005 at 12:02:36AM +0100, David Given wrote:
> It uses exceptions extensively in order to communicate error states. Yes, 
> they're not cheap in terms of code size, but used properly they can *vastly* 
> simplify your logic flow. Another thing I was doing was never dynamically 
> allocating anything if I could possibly help it; I think I use the new 
> operator once in the entire program. (Although I do use STL containers quite 
> a lot.) Since C++ exceptions call destructors, this means that all my memory 
> allocation problems basically go away.

Turning off exceptions cut a 7 meg program down to 5 megs with g++, and a
4 meg program down to ~3.5 with VC.  (Numbers aren't exact--tests were over
a year ago--but the order of magnitude is correct.)  That's just too much,
in my opinion.  I'm developing for memory-limited environments (not 128k
embedded, but not half-gig-desktop, either: two megs matters), which influences
my outlook, but I wouldn't enjoy all code on my workstation being 20% bigger,
either.

-- 
Glenn Maynard

Re: Custom extensions to Lua

David Given
On Thursday 11 August 2005 00:17, Glenn Maynard wrote:
[...]
> Turning off exceptions cut a 7 meg program down to 5 megs with g++, and a
> 4 meg program down to ~3.5 with VC.

Which gcc, may I ask? Spey is about 4000 loc, and produces a 120kB binary, 
which I consider quite reasonable.

> I'm developing for memory-limited environments (not 128k
> embedded, but not half-gig-desktop, either: two megs matters), which
> influences my outlook, but I wouldn't enjoy all code on my workstation
> being 20% bigger, either.

But if it's more reliable and faster to write...

I do know where you're coming from, though. I use C++ for my day job, which 
involves embedded systems, and we've had to give up using exceptions because 
the code is just too big. (And uses *ludicrous* stacks --- about a megabyte 
for one small program.) We put that down to poor compiler support.

-- 
"Curses! Foiled by the chilled dairy treats of righteousness!" --- Earthworm 
Jim (evil)

Re: Custom extensions to Lua

Rici Lake-2
In reply to this post by David Given

On 10-Aug-05, at 6:02 PM, David Given wrote:

> It uses exceptions extensively in order to communicate error states. Yes,
> they're not cheap in terms of code size, but used properly they can *vastly*
> simplify your logic flow. Another thing I was doing was never dynamically
> allocating anything if I could possibly help it; I think I use the new
> operator once in the entire program. (Although I do use STL containers quite
> a lot.) Since C++ exceptions call destructors, this means that all my memory
> allocation problems basically go away.

The Lua parser is an interesting example of precisely that philosophy, although of course it uses the features Lua provides (like tables) rather than STL.


> Hauling the subject bodily back on topic, one of the few issues I have with
> Lua is that there's no standard for exceptions. There is no consistent 'out
> of memory' exception; some parts throw a string, some a number, some return
> an error code, some return an error string, etc. The other standard failure
> cases are all quite similar.

There are a lot of libraries out there which have not tried to conform to the Lua style, imho. Many of these are just light wrappers around C or C++ libraries, of course, and have not been reengineered in any way. There is really no excuse for not using a standard Lua mechanism for returning errors.

Having said that, I do agree that:

> I do feel strongly that the use of exceptions can very much improve a program,
> and in keeping with the Lua philosophy, it would be a good idea to provide a
> standardised framework that application and library writers can use. Part of
> this framework should, I think, involve standard exception *names*.
> Otherwise, there's no way of knowing if you've caught a standard exception or
> not.

One of the interesting things about exceptions is that catching named exceptions implies a dynamic (i.e. runtime) scope for bindings (to exception handlers).


Re: Custom extensions to Lua

Glenn Maynard
In reply to this post by David Given
On Thu, Aug 11, 2005 at 12:31:45AM +0100, David Given wrote:
> On Thursday 11 August 2005 00:17, Glenn Maynard wrote:
> [...]
> > Turning off exceptions cut a 7 meg program down to 5 megs with g++, and a
> > 4 meg program down to ~3.5 with VC.
> 
> Which gcc, may I ask? Spey is about 4000 loc, and produces a 120kB binary, 
> which I consider quite reasonable.

I think it was 3.3, on about 200k lines.  That's comparable, line for line.

I wish g++ had a "-ffake-exceptions" option (abort() when an exception is
thrown, ignore exception handlers), so I could ask "how big is yours without
exceptions?", but as is I don't know of any easy way to find out.

-- 
Glenn Maynard

Re: Custom extensions to Lua

Lisa Parratt
In reply to this post by Rici Lake-2
Rici Lake wrote:
> Lisa,
>
> I think what you are trying to do is semantically incoherent, so it is not surprising that it is difficult. It's semantically incoherent because it confuses values with references to values ("boxes").

Not so much confuses, as intentionally seeks to make the two semantically identical to the end user. The goal of the project is to, in so far as possible, seamlessly integrate the native elements with Lua.

> It's quite straightforward to create a proxy object which represents a C structure (or C++ object); the proxy object can interpret get and set messages as it sees fit. One simple way of doing this is to create two tables of getter and setter methods for each class, and have the __index/__newindex metamethods look up the key in the associated getter/setter table for the class.
*snip*

> Since the environment table is just a table, a similar technique can be used to dynamically add "global bindings" to C objects

This is exactly how those aspects of the system operate. The main issues occur when one is trying to proxy strings, integers, and non-global variables. Locals aren't bound to a table, and therefore aren't subject to __index and __newindex.
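
A minimal sketch of the getter/setter-table approach described above, in plain Lua. The C structure is simulated here by an ordinary Lua table (storage); in a real binding the getters and setters would be C functions reading and writing the native struct. All names (make_proxy, point_getters, point_setters) are made up for illustration.

```lua
-- Per-"class" tables of accessors, keyed by field name.
point_getters = {
  x = function(storage) return storage.x end,
  y = function(storage) return storage.y end,
}
point_setters = {
  x = function(storage, v) storage.x = v end,
  -- no setter for y: it is treated as read-only
}

function make_proxy(storage, getters, setters)
  local proxy = {}   -- stays empty so every access reaches the metamethods
  return setmetatable(proxy, {
    __index = function(_, key)
      local get = getters[key]
      if get then return get(storage) end
      error("no such field: " .. tostring(key), 2)
    end,
    __newindex = function(_, key, value)
      local set = setters[key]
      if set then return set(storage, value) end
      error("cannot assign field: " .. tostring(key), 2)
    end,
  })
end

local raw = { x = 1, y = 2 }          -- stand-in for the C struct
local p = make_proxy(raw, point_getters, point_setters)
p.x = 10                              -- goes through __newindex
print(p.x, raw.x)                     -- 10   10
print(pcall(function() p.y = 5 end))  -- false plus an error message

-- A local holding the proxy is not itself a table slot:
local q = p
q = 42          -- rebinds q only; no metamethod runs, raw is untouched
print(raw.x)    -- 10
```

This only helps for fields reached through a table. A local such as q is not a table entry, so plain assignment to it can never be intercepted this way in stock Lua.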

As regards what to do if the entity already exists - the glue system is intended to be initialised before everything else, and proceeds to prevent the names of glued-in entities from being used.

> Hope that was at least interesting,

Thanks :)

--
Lisa
http://www.thecommune.org.uk/~lisa/

Exceptions (was Re: Custom extensions to Lua)

Mark Hamburg-4
In reply to this post by Rici Lake-2
on 8/10/05 4:57 PM, Rici Lake at [hidden email] wrote:

> There is really no excuse for not using a standard Lua mechanism for
> returning errors.

I think part of the issue is that Lua has a couple of methods for reporting
errors:

1. Throw an error with the error function (or its compatriot assert). Note
that you have to do a little extra work if you don't want it annotated with
information about where it was thrown from. That information is useful for
debugging, but not as useful when you want to process the error on the
catching end.

2. Return nil plus an error message. This can be translated into the first
form via assert though again we run into the annotation issue.
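
For concreteness, the two styles might look like the following. The functions parse_port, parse_port_strict and parse_port_plain are invented for the example; the position annotation is standard error() behaviour, which prefixes string messages with the caller's location unless the level argument is 0.

```lua
-- Style 2: return nil plus a message for expected failures.
function parse_port(s)
  local n = tonumber(s)
  if not n or n < 1 or n > 65535 then
    return nil, "invalid port: " .. tostring(s)
  end
  return n
end

-- Style 1: throw. With the default level, string messages gain a
-- "chunkname:line:" prefix.
function parse_port_strict(s)
  local n, msg = parse_port(s)
  if not n then error(msg) end
  return n
end

-- Same, but level 0 suppresses the position prefix, so the catcher
-- sees exactly the message that was thrown.
function parse_port_plain(s)
  local n, msg = parse_port(s)
  if not n then error(msg, 0) end
  return n
end

print(parse_port("80"))                  -- 80
print(parse_port("http"))                -- nil   invalid port: http
print(pcall(parse_port_strict, "http"))  -- false, message with a position prefix
print(pcall(parse_port_plain, "http"))   -- false   invalid port: http
```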

Diego has built an approach in LuaSocket that combines the two, but it
requires a certain amount of discipline to use since it depends on using
exceptions for error handling within modules and nil + message at module
boundaries.
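
The general shape of such a combined scheme can be sketched with pcall. This is not LuaSocket's actual code; try and protect below are simplified stand-ins written for this example. Inside a module, try turns nil plus message into a throw, and protect wraps the exported function so that callers see nil plus a message again. The file path is arbitrary and assumed not to exist.

```lua
-- Inside the module: throw on nil+message, pass successes through.
-- Level 0 keeps the message free of position annotation.
function try(...)
  local value, message = ...
  if not value then error(message, 0) end
  return ...
end

-- Helper: pass results through on success, else return nil + error value.
local function finish(ok, ...)
  if ok then return ... end
  return nil, (...)
end

-- At the module boundary: convert any throw back into nil+message.
function protect(f)
  return function(...)
    return finish(pcall(f, ...))
  end
end

-- Module-internal code is written in throwing style...
local function read_config(path)
  local file = try(io.open(path, "r"))
  local text = try(file:read("*a"))
  file:close()
  return text
end

-- ...while the exported function reports failures as nil+message.
load_config = protect(read_config)

print(load_config("/no/such/dir/config.txt"))  -- nil plus the OS error message
```

Note that this simplified protect also converts genuine programming errors into nil plus message, which is part of why the approach needs discipline.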

Furthermore, there are standard APIs that use both mechanisms. For example,
if you pass something other than a string to io.open, you get an exception
rather than nil plus a message.

So, this thread could be seen as a call for sorting out a set of rules that
are easy to follow and then trying to bring as much code as possible into
compliance with them.

Mark



Re: Custom extensions to Lua

Rici Lake-2
In reply to this post by Lisa Parratt

On 11-Aug-05, at 9:17 AM, Lisa Parratt wrote:

> Rici Lake wrote:
>> Lisa,
>> I think what you are trying to do is semantically incoherent, so it is not surprising that it is difficult. It's semantically incoherent because it confuses values with references to values ("boxes").
>
> Not so much confuses, as intentionally seeks to make the two semantically identical to the end user. The goal of the project is to, in so far as possible, seamlessly integrate the native elements with Lua.

I think you mean "syntactically identical" :) If so, I sympathize but I think you are going to end up fighting a semantic impedance mismatch.

If I understand you correctly, you'd like to be able to write in Lua something vaguely like Glenn Maynard's useful example of C++:

 locked<int> value = 1;  // constructor
 value = 2;              // mutator

which might translate into something like:

 local value = locked_int(1) -- constructor
 value = 2                   -- mutator

C++, as has been noted, embeds a little visual pun (several, actually, but I won't get into the other ones): the semantics of '=' are quite different in the two lines quoted (which I've annotated). In the first line the '=' is a syntactic element, whereas in the second case it's an operator.

Lua is somewhat more like Objective-C (or even C, if you think of all Lua locals as having the type LuaObject* rather than LuaObject) (note [1]). A Lua assignment is always an assignment; mutation needs to be expressed as a function call.

Consequently, in Lua the semantics of

  local value = locked_int(1)
and
  local value
  value = locked_int(1)

are the same. (The second one differs in that value is set to nil for a very short amount of time, but the difference is not user-visible.) A subsequent

  value = 2

changes 'value' from being a locked_int into being a regular number.
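
The distinction can be checked in stock Lua. The locked_int below is a hypothetical box type (no real locking is done); the point is only that '=' rebinds the variable while a method call mutates the shared object.

```lua
local LockedInt = {}
LockedInt.__index = LockedInt

function locked_int(n)                    -- hypothetical constructor
  return setmetatable({ n = n }, LockedInt)
end
function LockedInt:set(n) self.n = n end  -- mutation is an explicit call
function LockedInt:get() return self.n end

local value = locked_int(1)
local alias = value        -- both names refer to the same box

value:set(2)               -- mutates the box; visible through alias
print(alias:get())         -- 2

value = 3                  -- rebinds 'value' to a plain number
print(type(value))         -- number
print(alias:get())         -- 2 (the box itself was not changed)
```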

We all have our prejudices (particularly me) and perhaps "semantic incoherence" was an overstatement: I meant that the semantics is incoherent with the syntax (note [2]). Providing a syntactic 'mutate' operator, as proposed by Andrew Lauritzen in <http://lua-users.org/wiki/LuaPowerPatches>, strikes me as more semantically consistent (and I don't see why it is "un-Lua-like" either):

  local howmany = locked_int(1)
  howmany <- 3              -- mutate with lock
  howmany = locked_int(3)   -- new lock-protected value

The syntactic difference between lines 2 and 3 reflects a real semantic difference (note [3]) and it is not difficult to get used to, in practice.

> This is exactly how those aspects of the system operate. The main issues occur when one is trying to proxy strings, integers, and non-global variables. Locals aren't bound to a table,

OK, I've written enough about locals. As far as integers and strings go, I honestly think you'd be better off converting them between C and Lua at the call-boundary rather than trying to proxy them. For integers, that is straightforward as long as lua_Number is double or your integers are short. For strings, the conversion requires a copy (possibly two copies), but that is not awful unless the strings are long. (Furthermore, since Lua interns strings, it can avoid the copy in the event that it already has an instance of the string.) On the whole, this is probably less inefficient than the metamethod comparison approach: you can copy quite a few bytes in the time that it takes Lua to start up a metamethod.

Notes:

[1] Internally, Lua avoids malloc'ing simple scalars (numbers, booleans, nil) and this often leads people to say that Lua distinguishes between scalars and composites. Here's a representative quote from the mailing list:

> Lua forces copy semantics for simple value types (nil, boolean, number,
> string, light userdata) and reference semantics for complex types (function,
> table, userdata).

I don't think this aids understanding of Lua's semantics. Unlike Lua, Python allocates numbers and keeps pointers to the heap-allocated storage, but the semantics of Python numbers and Lua numbers are pretty well identical. Lua strings (which are immutable) are not actually copied even though they are heap-allocated.

The point is that if a datatype is truly immutable, it is not really possible to semantically distinguish between two (hypothetical) copies of it. If the datatype is mutable, then operations such as object-equality comparison and argument passing are naturally based on object identity. Storing a number or a boolean instead of storing a pointer to a heap-allocated instance of the same value is really just an optimization (or at least an attempt at optimization); not a semantic difference.

[2] Lua is not immune to visual puns. The semantics of '=' in the following two statements are radically different:

  local a = 3
  global_a = 3

Like all visual puns, this creates confusion.

[3] Also, Andrew's patch does not implement the unboxing operator '*', and I believe it uses a different symbol to represent the mutation statement.

The locked<T> example is interesting in other ways. For example, in order to get it to work correctly, it's necessary to overload all mutation operators:

  locked<int> howmany = 1;
  howmany++; // this is not the same as howmany = howmany + 1;

But what if the mutation is not anticipated?

  int complex_calculation(int a, int b);
  howmany = complex_calculation(howmany, 42); // Oops

There are several possible solutions to this problem:

1) lock howmany, do the calculation, change howmany, unlock howmany

2) save the value of howmany in a temporary, do the calculation, atomically change howmany if it still has the same value as was saved.

I don't think either of these is "right" or "wrong"; it would depend on how long the calculation was likely to take and how much contention there is on the lock. I'd probably go for the second one in most cases, though. I'll leave it to one of the C++ meisters on the list to come up with an ultracool C++ syntax for this, but I note that Andrew's patch allows the mutator to accept multiple values, so that the following is possible:

  do local temp = *howmany
    howmany <- temp, complex_calculation(temp, 42)
  end

or even:

  function protected(val, func, ...)
    return val, func(val, ...)
  end
  -- ...
  howmany <- protected(*howmany, complex_calculation, 42)
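
Without the patch, the second strategy (save the value, calculate, then store only if the value is unchanged) can be approximated in stock Lua with an explicit method. This is a single-threaded sketch: new_box, compare_and_set and update are invented names, and a real implementation would need the comparison and the store to be atomic with respect to other threads.

```lua
local Box = {}
Box.__index = Box

function new_box(v) return setmetatable({ value = v }, Box) end

function Box:get() return self.value end

-- Store 'new' only if the box still holds 'expected'; report success.
function Box:compare_and_set(expected, new)
  if self.value ~= expected then return false end
  self.value = new
  return true
end

-- Retry loop: recompute from the latest value until the store succeeds.
function Box:update(func, ...)
  while true do
    local old = self.value
    local new = func(old, ...)
    if self:compare_and_set(old, new) then return new end
  end
end

local function complex_calculation(a, b) return a + b end

local howmany = new_box(1)
print(howmany:update(complex_calculation, 42))  -- 43
print(howmany:compare_and_set(1, 99))           -- false (value is now 43)
print(howmany:get())                            -- 43
```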


Re: Exceptions (was Re: Custom extensions to Lua)

Luiz Henrique de Figueiredo
In reply to this post by Mark Hamburg-4
> I think part of the issue is that Lua has a couple of methods for reporting
> errors:

Sure, but they serve different purposes. Throwing an error is meant for
hard errors, things that should not occur and the programmer doesn't
want to handle. Returning nil+message is meant for soft errors, things
that may occur and that the programmer wants to handle. Or am I missing
something? --lhf

Re: Exceptions (was Re: Custom extensions to Lua)

Mark Hamburg-4
Perhaps a little more documentation about this might be sufficient, but this
then raises the question of why one ever catches hard errors if they are
basically to be treated as "something very bad happened". Furthermore,
Diego's scheme in LuaSocket uses exceptions for more than just hard errors.

Mark

on 8/11/05 9:44 AM, Luiz Henrique de Figueiredo at [hidden email]
wrote:

>> I think part of the issue is that Lua has a couple of methods for reporting
>> errors:
> 
> Sure, but they serve different purposes. Throwing an error is meant for
> hard errors, things that should not occur and the programmer doesn't
> want to handle. Returning nil+message is meant for soft errors, things
> that may occur and that the programmer wants to handle. Or am I missing
> something? --lhf


Re: Custom extensions to Lua

Lisa Parratt
In reply to this post by Rici Lake-2
Rici Lake wrote:
> I think you mean "syntactically identical" :) If so, I sympathize but I think you are going to end up fighting a semantic impedance mismatch.

Sorry, long day :)

> As far as integers and strings go, I honestly think you'd be better off
> converting them between C and Lua at the call-boundary rather than
> trying to proxy them.

The issue here is that this particular system does not magic away pointer types, etc., but instead has its own type of proxy object. As such, out parameters aren't returned as multiple return values, but inherently work because they're accessed through proxies. So, after the following piece of pseudoLua:

myInt = declareCVariable(int)
myInt = 4
setValueOfIntPointer(myInt:pointer(), 5)

myInt will equal 5. (Yes, I know the overloading of = as declarative assignment and mutation is utterly horrible.)

Sorry I can't reply any further, but it's home time now.

It's worth noting that this system actually works (I fixed the stack corruption this morning), it's just a bit slow due to the per cycle local detection issue.

--
Lisa
http://www.thecommune.org.uk/~lisa/
