__gc visible to lua code...

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

__gc visible to lua code...

Graham Wakefield
Hi,

Today I discovered that a __gc method installed on a userdata metatable is visible to Lua code; when called out of context it is likely to cause crashes, so any __gc metamethod in C must typecheck the argument.

I was surprised to find this out, I thought __gc would be 'hidden' from Lua. I'm not sure I can think of a reason why you'd want to call a userdata's __gc from lua code.

Thought others should know.

Graham

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In fact, even type-checking doesn't help. 

If I have a typical boxed pointer userdata, with a metatable, and I probably want the __gc method to delete the object boxed.

Now, I can call udata:__gc() from Lua code.  The userdata still exists, but the boxed object has been deleted.  Other calls on the udata can now segfault.  

I really think __gc should not be visible to Lua code.

On Mar 30, 2007, at 2:21 AM, Graham Wakefield wrote:

Hi,

Today I discovered that a __gc method installed on a userdata metatable is visible to Lua code; when called out of context it is likely to cause crashes, so any __gc metamethod in C must typecheck the argument.

I was surprised to find this out, I thought __gc would be 'hidden' from Lua.  I'm not sure I can think of a reason why you'd want to call a userdata's __gc from lua code.

Thought others should know.

Graham

grrr waaa
www.grahamwakefield.net




Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Luiz Henrique de Figueiredo
In reply to this post by Graham Wakefield
> so any __gc metamethod in C must typecheck the argument.

Yes. And in the case of boxed pointers, invalidate it somehow.
That's what the io lib does. (But I'm afraid not all my libraries do it.)
Bottom line: you have to write C binding code carefully.
--lhf

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Kristofer Karlsson
In reply to this post by Graham Wakefield
2007/3/30, Graham Wakefield <[hidden email]>:
Hi,

Today I discovered that a __gc method installed on a userdata
metatable is visible to Lua code; when called out of context it is
likely to cause crashes, so any __gc metamethod in C must typecheck
the argument.

I was surprised to find this out, I thought __gc would be 'hidden'
from Lua.  I'm not sure I can think of a reason why you'd want to
call a userdata's __gc from lua code.

Thought others should know.

Graham

Wouldn't a sufficient solution be to set the __metatable field of the userdata metatable to "access denied" or something similar. That way you can ensure that your untrusted lua code can't have access to the __gc metamethod. Assuming of course that you don't expose the metatable or the __gc metamethod some other way.

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Michael Broughton
I don't think that would work.

One way you can prevent the __gc metamethod from being called is to create an __index metamethod that filters out private indices. Of course, that would not prevent any C code from calling your finalizer. Overall, I think this solution is too complicated and it will slow down your whole module.

I think it is best to just keep track of the state of your userdata and any associated resources. For example: if your userdata is a struct, add a field to mark when it is finalized. If your userdata is a boxed pointer, make sure you nullify the pointer when you are done with it. You could also use each userdata's environment table to do something similar. With each of these solutions you need to make sure you check the state of the userdata before you use it (duh).

Mike


Kristofer Karlsson wrote:

Wouldn't a sufficient solution be to set the __metatable field of the userdata metatable to "access denied" or something similar. That way you can ensure that your untrusted lua code can't have access to the __gc metamethod. Assuming of course that you don't expose the metatable or the __gc metamethod some other way.


Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Luiz Henrique de Figueiredo
So, skipping through liolib.c, in summary, in addition to deleting object boxed, my __gc method should:

set the boxed pointer value to NULL

and my type-checking code should:

test for userdata, 
validate the userdata against the type (e.g. test metatable)
validate that the boxed pointer is not NULL

Setting the pointer to NULL, and testing for NULL, would not be necessary if __gc was not visible to Lua code, and if the lifetime of the object boxed is known to be equal to the lifetime of the userdata, a common case.  

I can't think of an example why you would want to call __gc from Lua code; that's why I'm surprised that it is visible at all.  

Even in liolib.c, where the object boxed (the file) is not necessarily concurrent in existence with the userdata, there is still no need to access __gc from Lua.

On Mar 30, 2007, at 4:51 AM, Luiz Henrique de Figueiredo wrote:

so any __gc metamethod in C must typecheck the argument.

Yes. And in the case of boxed pointers, invalidate it somehow.
That's what the io lib does. (But I'm afraid not all my libraries do it.)
Bottom line: you have to write C binding code carefully.
--lhf

grrr waaa
www.grahamwakefield.net




Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Michael Broughton

On Mar 30, 2007, at 9:35 AM, Michael Broughton wrote:

One way you can prevent the __gc metamethod from being called is to create an __index metamethod that filters out private indices. Of course, that would not prevent any C code from calling your finalizer. Overall, I think this solution is too complicated and it will slow down your whole module.

I agree.


I think it is best to just keep track of the state of your userdata and any associated resources. For example: if your userdata is a struct, add a field to mark when it is finalized. If your userdata is a boxed pointer, make sure you nullify the pointer when you are done with it.

Yes, I've done this now... but still I think __gc should be private by default.

You could also use each userdata's environment table to do something similar.

? You mean, in __gc set the userdata environment to nil, and then in the type-check test if the environment is nil, for example? Or likewise for a field in the environment? Probably null-checking the userdata value is easier.

Another question: is the userdata's environment garbage collected when __gc is called, or when the userdata itself is garbage collected?

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Tomás Guisasola-2
> > I think it is best to just keep track of the state of your userdata  
> > and any associated resources. For example: if your userdata is a  
> > struct, add a field to mark when it is finalized. If your userdata  
> > is a boxed pointer, make sure you nullify the pointer when you are  
> > done with it.
	We do exactly this in the implementation of LuaSQL.

> Yes, I've done this now... but still I think __gc should be private  
> by default.
	In LuaSQL, the `close' method and the `__gc' metamethod are
the same C function.  Do you have a better idea?

	Tomás


Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Michael Broughton
In reply to this post by Graham Wakefield
Graham Wakefield wrote:

? You mean, in __gc set the userdata environment to nil, and then in the type-check test if the environment is nil, for example? Or likewise for a field in the environment? Probably null-checking the userdata value is easier.

I meant for a field in the environment table... Assuming the table is unique for each userdata. I think you are right, though; I have not seen a case where this would be better than either of the other two methods.


Another question: is the userdata's environment garbage collected when __gc is called, or when the userdata itself is garbage collected?


The environment table can be collected like any other table: when there are no remaining references to it. Collecting a userdata merely removes the reference to its environment table.

--------

In most of my libraries, I have close metamethods that are exactly synonymous for __gc. That is, they call the same C function. This is so that I can clean up resources whenever I want, but if I ever forget, the garbage collector is there to cover me. I use 'close' because it is nicer to write that '__gc'.

Mike



Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Tomás Guisasola-2

On Mar 30, 2007, at 11:09 AM, Tomas Guisasola Gorham wrote:

I think it is best to just keep track of the state of your userdata
and any associated resources. For example: if your userdata is a
struct, add a field to mark when it is finalized. If your userdata
is a boxed pointer, make sure you nullify the pointer when you are
done with it.
	We do exactly this in the implementation of LuaSQL.

Yes, I've done this now... but still I think __gc should be private
by default.
	In LuaSQL, the `close' method and the `__gc' metamethod are
the same C function.  Do you have a better idea?

Well there's no harm having a close method; so long as it makes sense to retain a LuaSQL userdata reference that is no longer connected. Otherwise, I thought setting the userdata to nil would be the equivalent.

I thought that __gc was a callback to let you know that the userdata is no longer referenced in Lua and will be deleted. But having it as a public method, simply means that __gc is a callback to say that either the userdata is no longer referenced, or someone called __gc (); it's less well defined.



Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Michael Broughton
In most of my libraries, I have close metamethods that are exactly synonymous for __gc. That is, they call the same C function. This is so that I can clean up resources whenever I want, but if I ever forget, the garbage collector is there to cover me. I use 'close' because it is nicer to write that '__gc'.

Right; now I know that __gc is public anyway, I think I'll do the same.




Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Rici Lake-2
In reply to this post by Michael Broughton
Kristofer Karlsson wrote:

Wouldn't a sufficient solution be to set the __metatable field of the userdata metatable to "access denied" or something similar. That way you can ensure that your untrusted lua code can't have access to the __gc metamethod. Assuming of course that you don't expose the metatable or the __gc metamethod some other way.


Michael Broughton wrote:
I don't think that would work.

One way you can prevent the __gc metamethod from being called is to create an __index metamethod that filters out private indices. Of course, that would not prevent any C code from calling your finalizer. Overall, I think this solution is too complicated and it will slow down your whole module.

Setting a __metatable field on the userdata's metatable is sufficient
to block access to the metatable from Lua (provided the debug library
isn't accessible) and that should be sufficient. You can set the
__metatable to false, or to the actual method table, or whatever.

The only time you'd need to filter in __index is if you've made the
metatable into the method table by setting it to be its own
__index. That's probably not a good idea for userdata; it's
almost certainly better to have separate method tables and
metatables. If you do that, you can skip the relatively expensive
type-check in unary metamethods (that is, everything but the
binary operators); you still have to do the check with methods.

While it's tempting to expose the method table as the value of
the __metatable key (as mentioned above), it's probably not a
good idea in practice. To start with, it violates the semantic
model: the method table is not the metatable, and a caller
looking for the metatable might expect to get the metatable
(for example, to access pseudometamethods like __tostring,
or __pairs if you use my __pairs patch.) A simple solution
is to duplicate such metamethods in the method table, adding
typechecks as necessary.

Second, if you're exposing the datatype to untrusted sandboxed
code, the shared method table provides a path for communication
(or havoc) between sandboxes.

Hiding the method table has disadvantages, though. It makes it
difficult for client code to extend the datatype with additional
methods written in Lua, which could be useful. (Consider the
case of augmenting the string library, for example.) One
possible solution, assuming that each sandbox has its own
luathread, is to put sandboxed method tables in the
thread-local environment (that is, the environment table
of the luathread) keyed by the base method table. The
datatype ('class' if you prefer) metatable's __index
metamethod is created with the base method table as
its environment; it can then use base table to lookup
the sandboxed method table and then look the key up
in that table; if it fails to find it there, it simply
looks it up in the base table. This slows down method
calls somewhat, but it's still reasonably fast.

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Michael Broughton
You suggest that it is not a good idea to combine the metatable and method table for userdata. Is performance the only reason?

Sorry if I have been confusing terminology in my emails. I always combine the metatable and method table in my libraries. The third party Lua libraries I've looked at do this as well. So do the examples in PIL2.

Mike


Rici Lake wrote:
Setting a __metatable field on the userdata's metatable is sufficient
to block access to the metatable from Lua (provided the debug library
isn't accessible) and that should be sufficient. You can set the
__metatable to false, or to the actual method table, or whatever.

The only time you'd need to filter in __index is if you've made the
metatable into the method table by setting it to be its own
__index. That's probably not a good idea for userdata; it's
almost certainly better to have separate method tables and
metatables. If you do that, you can skip the relatively expensive
type-check in unary metamethods (that is, everything but the
binary operators); you still have to do the check with methods.

While it's tempting to expose the method table as the value of
the __metatable key (as mentioned above), it's probably not a
good idea in practice. To start with, it violates the semantic
model: the method table is not the metatable, and a caller
looking for the metatable might expect to get the metatable
(for example, to access pseudometamethods like __tostring,
or __pairs if you use my __pairs patch.) A simple solution
is to duplicate such metamethods in the method table, adding
typechecks as necessary.

Second, if you're exposing the datatype to untrusted sandboxed
code, the shared method table provides a path for communication
(or havoc) between sandboxes.

Hiding the method table has disadvantages, though. It makes it
difficult for client code to extend the datatype with additional
methods written in Lua, which could be useful. (Consider the
case of augmenting the string library, for example.) One
possible solution, assuming that each sandbox has its own
luathread, is to put sandboxed method tables in the
thread-local environment (that is, the environment table
of the luathread) keyed by the base method table. The
datatype ('class' if you prefer) metatable's __index
metamethod is created with the base method table as
its environment; it can then use base table to lookup
the sandboxed method table and then look the key up
in that table; if it fails to find it there, it simply
looks it up in the base table. This slows down method
calls somewhat, but it's still reasonably fast.


Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Michael Broughton
I guess one reason might be that obj.__index would give up your metatable, even if you have set the __metatable field?

Mike

Michael Broughton wrote:
You suggest that it is not a good idea to combine the metatable and method table for userdata. Is performance the only reason?

Sorry if I have been confusing terminology in my emails. I always combine the metatable and method table in my libraries. The third party Lua libraries I've looked at do this as well. So do the examples in PIL2.

Mike


Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Rici Lake-2
In reply to this post by Michael Broughton
Michael Broughton wrote:
You suggest that it is not a good idea to combine the metatable and method table for userdata. Is performance the only reason?

Sorry if I have been confusing terminology in my emails. I always combine the metatable and method table in my libraries. The third party Lua libraries I've looked at do this as well. So do the examples in PIL2.


Yes, I know it's common.

My distaste for it comes more from a desire for writing safe modules
than performance considerations. I think this thread demonstrates the
safety issues involved in conflating the two tables. However, once
you decide (if you do) to separate the metamethod and method tables,
you are certainly free to take advantage of resulting efficiencies :)

Fortunately, Lua is flexible enough to encompass both solutions, and
a variety of other ones.



Reply | Threaded
Open this post in threaded view
|

Job opportunity - Lua coding

Robert Aronsson
In reply to this post by Graham Wakefield
Hi all,
 
We are seeking Lua coders for a few projects for customers and partners.
 
Our product is a server monitoring system that have Lua integrated to make it possible for customers to extend our system for their specific needs.
 
If you are interesting to do some part-time Lua coding and getting paid for it, please let us know.
 
You can contact us on [hidden email] for a description of the assignment, condition etc.
 
Information links:
http://www.intellipool.se/doc/lua.htm
 
Best Regards,
Robert Aronsson
Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Javier Guerra Giraldez
In reply to this post by Graham Wakefield
On Friday 30 March 2007, Graham Wakefield wrote:
> Well there's no harm having a close method; so long as it makes sense
> to retain a LuaSQL userdata reference that is no longer connected.
> Otherwise, I thought setting the userdata to nil would be the
> equivalent.

sometimes you want an immediate close, not wait until the GC decides it needs 
to collect the userdata.

also, remember that the GC only counts the memory used by the userdata itself; 
the backend library might use a lot more resources unaccounted by the GC.

usually having a close() is a good idea

-- 
Javier

Attachment: pgpGWXy8zP6ak.pgp
Description: PGP signature

Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Rici Lake-2
The main safety concern would not be necessary if __gc was only visible to the C API, not Lua.  

In my app I must necessarily be concerned with performance; and I also follow the common example of having the metatable equal the method table for this reason.

On Mar 30, 2007, at 12:31 PM, Rici Lake wrote:

Michael Broughton wrote:
You suggest that it is not a good idea to combine the metatable and method table for userdata. Is performance the only reason?
Sorry if I have been confusing terminology in my emails. I always combine the metatable and method table in my libraries. The third party Lua libraries I've looked at do this as well. So do the examples in PIL2.

Yes, I know it's common.

My distaste for it comes more from a desire for writing safe modules
than performance considerations. I think this thread demonstrates the
safety issues involved in conflating the two tables. However, once
you decide (if you do) to separate the metamethod and method tables,
you are certainly free to take advantage of resulting efficiencies :)

Fortunately, Lua is flexible enough to encompass both solutions, and
a variety of other ones.



grrr waaa
www.grahamwakefield.net




Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Graham Wakefield
In reply to this post by Javier Guerra Giraldez
Having a close() method may be a good idea for many uses of userdata, where freeing resources referred by the userdata can be independent of the userdata lifetime.  In such cases, it makes sense to test against NULL for the userdata contents, etc.  Calling udata:close() won't set the udata to nil, won't free the userdata itself, since this is under the control of the garbage collector.

But this is irrelevant to __gc.  The __gc metamethod, as far as I understand, is supposed to be a kind of callback into C to let you know that the userdata has been collected, so you can free additional resources if necessary.  It makes no sense to expose this method to Lua code.   

Using __gc as a close() method confuses two distinct purposes.

On Mar 30, 2007, at 6:01 PM, Javier Guerra wrote:

On Friday 30 March 2007, Graham Wakefield wrote:
Well there's no harm having a close method; so long as it makes sense
to retain a LuaSQL userdata reference that is no longer connected.
Otherwise, I thought setting the userdata to nil would be the
equivalent.

sometimes you want an immediate close, not wait until the GC decides it needs 
to collect the userdata.

also, remember that the GC only counts the memory used by the userdata itself; 
the backend library might use a lot more resources unaccounted by the GC.

usually having a close() is a good idea

-- 
Javier

grrr waaa
www.grahamwakefield.net




Reply | Threaded
Open this post in threaded view
|

Re: __gc visible to lua code...

Gé Weijers
In reply to this post by Graham Wakefield

On Mar 31, 2007, at 5:42 AM, Graham Wakefield wrote:

The main safety concern would not be necessary if __gc was only visible to the C API, not Lua.  

In my app I must necessarily be concerned with performance; and I also follow the common example of having the metatable equal the method table for this reason.


The number of index operations on tables does not go down when the metatable is the same table as the method table, so I don't see the big performance gain (apart from slightly lower processor cache use maybe?). Lua is still doing 'getmetatable(udata).__index["method"]'. As there is only one instance of each the difference in memory use is insubstantial, and you'd be unable to gain access to __gc if you set __metatable.

I performed a little test, and sharing the method table and metatable was actually 2.5% slower, I don't know why.


--
Gé Weijers



12