block-scope finalization

Re: block-scope finalization

Roberto Ierusalimschy
> >First, if the library does not protect metatables, any code can get the
> >__gc metamethod and call it explicitly.
>
> IMHO, protecting the metatable is a good idea anyway so that no-one
> removes the `__gc` metamethod and you run out of resources somewhere
> else.

Someone can trivially hold a reference to the object so that you
run out of resources. There is no need to remove __gc.
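
To illustrate both points, here is a minimal sketch (not from the original message). It assumes Lua 5.2 or later and uses a plain table with a finalizer as a stand-in for a library's userdata; new_resource and release_count are made-up names. With an unprotected metatable any code can fetch __gc and run it by hand. Setting the __metatable field hides it, but does nothing about a caller that simply keeps the object alive.

```lua
-- Counts how many times the "resource" was released (illustration only).
local release_count = 0

-- Hypothetical constructor; a table with __gc stands in for a userdata.
local function new_resource(protect)
  local mt = {
    __gc = function(self) release_count = release_count + 1 end,
  }
  if protect then
    mt.__metatable = "locked"  -- getmetatable() now returns this string
  end
  return setmetatable({}, mt)
end

-- Unprotected: the finalizer can be fetched and called explicitly.
local open = new_resource(false)
getmetatable(open).__gc(open)
getmetatable(open).__gc(open)
print(release_count)                     -- 2: "released" twice by hand

-- Protected: the metatable is hidden and cannot be replaced.
local guarded = new_resource(true)
print(getmetatable(guarded))             -- locked
print(pcall(setmetatable, guarded, {}))  -- false, plus an error message

-- Protection does not prevent leaks: while `guarded` stays referenced,
-- its finalizer never runs.
collectgarbage()
print(release_count)                     -- still 2
```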


> >Second, it is a good practice (well, I think it is) to provide an
> >explicit way to "close" an object that needs finalization.
> >
> >The io library, for instance, does that.
>
> It's easy for the io library, because it stores pointers. Downside
> is that the garbage collector isn't aware of the extra memory, and
> memory fragmentation could be higher since there are multiple
> allocations for each userdata now. For every non-pointer userdata
> you'd have to add an extra field to indicate the state of
> finalization.

1 bit is not that expensive.


> But now every (meta-)method has to check that the userdata is still
> valid, and most Lua code that uses those (meta-)methods has to check
> again unless it is ok with a simple getter/setter throwing an error.

Most of these methods have to check that the userdata has the correct
type, anyway. It should be easy to bundle together these two tests
(the object has the correct type *and* it is not closed.)
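
A rough sketch of such a bundled test, written in plain Lua rather than C so that it is self-contained (not from the original message; Handle, check and new_handle are hypothetical names, and a table stands in for the userdata). Every method goes through one helper that rejects both a wrong type and a closed object.

```lua
-- Hypothetical "Handle" class; a plain table stands in for a userdata.
local Handle = {}
Handle.__index = Handle

-- One check for both conditions: correct type *and* not closed.
local function check(obj)
  if getmetatable(obj) ~= Handle then
    error("Handle expected", 3)
  end
  if obj.closed then
    error("attempt to use a closed Handle", 3)
  end
  return obj
end

local function new_handle(value)
  return setmetatable({ value = value, closed = false }, Handle)
end

function Handle:get()
  return check(self).value
end

function Handle:close()
  -- close() is deliberately idempotent: only the type is checked here.
  if getmetatable(self) ~= Handle then error("Handle expected", 2) end
  self.closed = true
end

local h = new_handle(42)
print(h:get())            -- 42
h:close()
h:close()                 -- harmless second close
print(pcall(h.get, h))    -- false, plus an error message
print(pcall(h.get, {}))   -- false, plus an error message
```

Making close() tolerate repeated calls is a design choice of this sketch; a library could equally report the second close as an error.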


> Also there are cases where explicitly "closing" an object is unsafe,
> e.g. if another object holds a pointer to that object. You can
> ensure the correct `__gc` order by storing references in the
> appropriate uservalue tables, but invalidating dependent objects is
> harder -- especially if the dependencies might change at runtime.
> Concrete examples from the last three libraries I created bindings
> for are renderers and textures in libSDL, memory pools and any other
> APR object in the Apache Portable Runtime library, and
> Fl_Input_Choice and its Fl_Input and Fl_Menu_Button subwidgets in
> FLTK.

It is worth remembering that, by using simple tricks with weak tables,
it is not difficult to allow an object to be naturally finalized (that
is, it has its __gc called by the collector) *and* to keep a reference
to it. So, if you are worried about that situation, you need some
kind of extra protection anyway; it is not enough to hide the __gc
from the user. (A better technique in some cases is to hide the
object itself from the user.)

-- Roberto



Re: block-scope finalization

Patrick Donnelly
In reply to this post by Viacheslav Usov
On Tue, Nov 24, 2015 at 4:50 AM, Viacheslav Usov <[hidden email]> wrote:

> On Tue, Nov 24, 2015 at 3:10 AM, Patrick Donnelly <[hidden email]>
> wrote:
>
>> For Lua, we can simply annotate the local variable to indicate its value
>> should be cleaned up when the local scope ends.
>
> This is exactly what I proposed originally. But after a week of thinking my
> conclusion is that this is too weird, because it breaks a major assumption.
> See the details in my earlier message and the subsequent exchange with
> Roberto.

Apologies! I've only been selectively reading parts of this thread. At
least we know it's a good idea now ;)

--
Patrick Donnelly


Re: block-scope finalization

Patrick Donnelly
In reply to this post by Viacheslav Usov
On Tue, Nov 24, 2015 at 4:50 AM, Viacheslav Usov <[hidden email]> wrote:

> On Tue, Nov 24, 2015 at 3:10 AM, Patrick Donnelly <[hidden email]>
> wrote:
>
>> For Lua, we can simply annotate the local variable to indicate its value
>> should be cleaned up when the local scope ends.
>
> This is exactly what I proposed originally. But after a week of thinking my
> conclusion is that this is too weird, because it breaks a major assumption.
> See the details in my earlier message and the subsequent exchange with
> Roberto.

Sorry, one more message to respond to the "breaks a major assumption":
are you talking about the issue of calling __gc more than once on an
object? Or perhaps a block "hijacking" an object to close?

function a(foo)
  block bar = foo
  -- haha! I'm tricking Lua into calling getmetatable(foo).__gc(foo)
end

function b()
  local bar = io.open("file")
  a(bar)
end

I don't see this as a real issue because a() could simply have called
foo:close() itself.

--
Patrick Donnelly


Re: block-scope finalization

Philipp Janda
In reply to this post by Roberto Ierusalimschy
On 24.11.2015 at 15:04, Roberto Ierusalimschy wrote:
>>> First, if the library does not protect metatables, any code can get the
>>> __gc metamethod and call it explicitly.
>>
>> IMHO, protecting the metatable is a good idea anyway so that no-one
>> removes the `__gc` metamethod and you run out of resources somewhere
>> else.
>
> Someone can trivially hold a reference to the object so that you
> run out of resources. There is no need to remove __gc.

Even if I provide a custom environment, or `lua_close()` the Lua state?

>
>
>>> Second, it is a good practice (well, I think it is) to provide an
>>> explicit way to "close" an object that needs finalization.
>>>
>>> The io library, for instance, does that.
>>
>> It's easy for the io library, because it stores pointers. Downside
>> is that the garbage collector isn't aware of the extra memory, and
>> memory fragmentation could be higher since there are multiple
>> allocations for each userdata now. For every non-pointer userdata
>> you'd have to add an extra field to indicate the state of
>> finalization.
>
> 1 bit is not that expensive.

The size is not the issue (although with padding the one bit can cost
you up to 8 bytes). The point is that you cannot use

     lua_newuserdata( L, sizeof( Fl_Input ) );

You have to write

     typedef struct {
       Fl_Input obj;
       char     valid;
     } Fl_Input_UD;
     //...
     lua_newuserdata( L, sizeof( Fl_Input_UD ) );

(Unless you add some flags to the Lua `Userdata` type and an API to
access those, see below).

>
>
>> But now every (meta-)method has to check that the userdata is still
>> valid, and most Lua code that uses those (meta-)methods has to check
>> again unless it is ok with a simple getter/setter throwing an error.
>
> Most of these methods have to check that the userdata has the correct
> type, anyway. It should be easy to bundle together these two tests
> (the object has the correct type *and* it is not closed.)

I have written a wrapper around `luaL_newmetatable()`[1],
`lua_newuserdata()`[2][3], and `luaL_checkudata()`[4] to do just
that[5]. It also handles the struct vs pointer difference transparently
(i.e. the check function always returns `Fl_Input*` regardless of
whether an `Fl_Input` or an `Fl_Input*` is stored in the userdata) and
can check the validity of the parent object if you want to expose
members of structs, unions or classes as userdata.

   [1]: https://github.com/siffiejoe/lua-moon#moon_defobject
   [2]: https://github.com/siffiejoe/lua-moon#moon_newobject
   [3]: https://github.com/siffiejoe/lua-moon#moon_newpointer
   [4]: https://github.com/siffiejoe/lua-moon#moon_checkobject
   [5]: https://github.com/siffiejoe/lua-moon#moon_killobject

>
>
>> Also there are cases where explicitly "closing" an object is unsafe,
>> e.g. if another object holds a pointer to that object. You can
>> ensure the correct `__gc` order by storing references in the
>> appropriate uservalue tables, but invalidating dependent objects is
>> harder -- especially if the dependencies might change at runtime.
>> Concrete examples from the last three libraries I created bindings
>> for are renderers and textures in libSDL, memory pools and any other
>> APR object in the Apache Portable Runtime library, and
>> Fl_Input_Choice and its Fl_Input and Fl_Menu_Button subwidgets in
>> FLTK.
>
> It is worth remembering that, by using simple tricks with weak tables,
> it is not difficult to allow an object to be naturally finalized (that
> is, it has its __gc called by the collector) *and* to keep a reference
> to it. So, if you are worried about that situation, you need some
> kind of extra protection anyway; it is not enough to hide the __gc
> from the user.

Ah, I didn't realize that. It seems I have to remove the metatable from
some of my finalized objects to avoid that issue for now.

> (A better technique in some cases is to hide the
> object itself from the user.)


I have (another) unrelated question:
After a `lua_pcall(L, 0, LUA_MULTRET, 0)`, does Lua have `EXTRA_STACK`
stack slots available for its API functions? I.e. if I don't want to
call `lua_checkstack()`, do I have to free only the stack slots I need
myself or the `EXTRA_STACK` stack slots for the Lua API as well?

>
> -- Roberto
>

Philipp





Re: block-scope finalization

Roberto Ierusalimschy
> >Someone can trivially hold a reference to the object so that you
> >run out of resources. There is no need to remove __gc.
>
> Even if I provide a custom environment, or `lua_close()` the Lua state?

No :-)


> I have (another) unrelated question:
> After a `lua_pcall(L, 0, LUA_MULTRET, 0)`, does Lua have
> `EXTRA_STACK` stack slots available for its API functions?

No. This is explained in the new release of the manual (5.3.2):

  The function results are pushed onto the stack when the function
  returns.  The number of results is adjusted to nresults, unless
  nresults is LUA_MULTRET.  In this case, all results from the function
  are pushed.  Lua takes care that the returned values fit into the
  stack space, but it does not ensure any extra space in the stack.


> I.e. if I don't want to call `lua_checkstack()`, do I have to free
> only the stack slots I need myself or the `EXTRA_STACK` stack slots
> for the Lua API as well?

What do you mean by "free the slots"? You never free any slot; only
when the function returns its stack space is reclaimed.

-- Roberto


Re: block-scope finalization

Viacheslav Usov
In reply to this post by Patrick Donnelly
On Tue, Nov 24, 2015 at 3:27 PM, Patrick Donnelly <[hidden email]> wrote:

> Sorry, one more message to respond to the "breaks a major assumption": are you talking about the issue of calling __gc more than once on an object?

The major assumption is that once the finalizer is called (and it does not resurrect the object), the object is never accessible by any user code, and so no library routine, including "close" and __gc, will ever be called with that object.

That assumption is not 100% correct because a dedicated user (given a lenient library) could still obtain __gc and call it. I doubt this is widely understood by everybody who writes code callable from Lua, but the implications are not severe because this will not happen in normal use. Same for handling multiple "close" calls: they won't happen in normal use.

Cheers,
V.

Re: block-scope finalization

Patrick Donnelly
On Tue, Nov 24, 2015 at 10:52 AM, Viacheslav Usov <[hidden email]> wrote:

> On Tue, Nov 24, 2015 at 3:27 PM, Patrick Donnelly <[hidden email]> wrote:
>
>> Sorry, one more message to respond to the "breaks a major assumption": are
>> you talking about the issue of calling __gc more than once on an object?
>
> The major assumption is that once the finalizer is called (and it does not
> resurrect the object), the object is never accessible by any user code, and
> so no library routine, including "close" and __gc, will ever be called with
> that object.
>
> That assumption is not 100% correct because a dedicated user (given a
> lenient library) could still obtain __gc and call it. I doubt this is widely
> understood by everybody who writes code callable from Lua, but the
> implications are not severe because this will not happen in normal use. Same
> for handling multiple "close" calls: they won't happen in normal use.

A dedicated user could also use proxy userdata and weak tables to
force a non-transient resurrection (to use the Lua manual's terms).
Have the proxy userdata's lifetime tied to the userdata you want to
resurrect. Use the proxy's __gc metamethod to force resurrection:

$ lua
Lua 5.3.1  Copyright (C) 1994-2015 Lua.org, PUC-Rio
> do
>> local f = io.open("/dev/null");
>> local t = setmetatable({}, {__mode = "k"})
>> t[f] = setmetatable({}, {__gc = function() print 'resurrecting f'; _G.file = f; end})
>> end
> collectgarbage "collect"; collectgarbage "collect";
resurrecting f
> print(file)
file (closed)

For Lua 5.3 we use a table instead of a proxy userdata.

The user may not be able to call a protected __gc metamethod more than
once but they can still call any method on a "closed" object. All done
with simple Lua primitives. It's easy to see how this could be done
accidentally. All libraries should handle closed objects robustly for
all methods.

--
Patrick Donnelly


Re: block-scope finalization

Philipp Janda
In reply to this post by Roberto Ierusalimschy
On 24.11.2015 at 16:42, Roberto Ierusalimschy wrote:

>> I have (another) unrelated question:
>> After a `lua_pcall(L, 0, LUA_MULTRET, 0)`, does Lua have
>> `EXTRA_STACK` stack slots available for its API functions?
>
> No. This is explained in the new release of the manual (5.3.2):
>
>    The function results are pushed onto the stack when the function
>    returns.  The number of results is adjusted to nresults, unless
>    nresults is LUA_MULTRET.  In this case, all results from the function
>    are pushed.  Lua takes care that the returned values fit into the
>    stack space, but it does not ensure any extra space in the stack.
>
>
>> I.e. if I don't want to call `lua_checkstack()`, do I have to free
>> only the stack slots I need myself or the `EXTRA_STACK` stack slots
>> for the Lua API as well?
>
> What do you mean by "free the slots"? You never free any slot; only
> when the function returns its stack space is reclaimed.

I thought about reserving stack slots (using `lua_settop`) in front of
the arguments/results of `lua_pcall` and then shifting the results so
that there are free stack slots available at the top. However, if I
can't even call Lua API functions (`lua_remove` or `lua_copy`) in a safe
way after the `lua_pcall`, that won't work.

Are there any functions that are guaranteed to work without any
available stack slots (e.g. `lua_settop` and `lua_pop` should work, as
well as `lua_checkstack`)?

>
> -- Roberto
>

Philipp






Re: block-scope finalization

Viacheslav Usov
In reply to this post by Patrick Donnelly
On Tue, Nov 24, 2015 at 5:16 PM, Patrick Donnelly <[hidden email]> wrote:
> All libraries should handle closed objects robustly for all methods.

I do not disagree with the "should" part of that.

I do not think the sentence is correct if it loses "should". The proposed alternative, ref counting, does block scope finalization without breaking any existing code.

Cheers,
V.

Re: block-scope finalization

Philipp Janda
On 24.11.2015 at 17:28, Viacheslav Usov wrote:
> On Tue, Nov 24, 2015 at 5:16 PM, Patrick Donnelly <[hidden email]>
> wrote:
>> All libraries should handle closed objects robustly for all methods.
>
> I do not disagree with the "should" part of that.
>
> I do not think the sentence is correct if it loses "should". The proposed
> alternative, ref counting, does block scope finalization without breaking
> any existing code.

The two things are unrelated. Currently you can access a finalized
object by iterating the keys of a weak table. This case should be
handled in all libraries. Adding ref counting wouldn't unbreak the
currently broken libraries.
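
A small sketch of that weak-table case (not from the original message; it assumes the reference implementation of Lua 5.2 or later, and `registry`, `make` and `seen` are made-up names). The reference manual states that resurrected objects are removed from weak keys only in the collection after their finalizer has run, so a single full collection leaves the finalized object reachable through the table.

```lua
-- Weak-keyed registry, e.g. for per-object bookkeeping.
local registry = setmetatable({}, { __mode = "k" })

-- Create the object inside a function so that no local keeps it alive.
local function make()
  local obj = setmetatable({ closed = false }, {
    __gc = function(self) self.closed = true end,  -- stands in for real cleanup
  })
  registry[obj] = true
end

make()
collectgarbage()  -- full cycle: the finalizer runs, the key is not yet removed

-- Walk the keys: any object found here has already been finalized.
local seen = {}
for obj in pairs(registry) do
  seen[#seen + 1] = obj
  print(obj.closed)  -- a method called on obj now would see a finalized object
end
```

A library whose methods assume "finalized means unreachable" would misbehave on the object found in the loop, which is why the closed state has to be checked explicitly.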

And adding ref counting might break existing code: a simple
`lua_replace()` can now run finalizers and thus raise errors and cause
leaks in the C code.

>
> Cheers,
> V.
>

Philipp





Re: block-scope finalization

Roberto Ierusalimschy
In reply to this post by Philipp Janda
> I thought about reserving stack slots (using `lua_settop`) in front
> of the arguments/results of `lua_pcall` and then shifting the
> results so that there are free stack slots available at the top.
> However, if I can't even call Lua API functions (`lua_remove` or
> `lua_copy`) in a safe way after the `lua_pcall`, that won't work.
>
> Are there any functions that are guaranteed to work without any
> available stack slots (e.g. `lua_settop` and `lua_pop` should work,
> as well as `lua_checkstack`)?

All API functions that do not push stuff (as flagged in their documentation)
should work without any available stack slots.  That includes both
lua_remove and lua_copy (and lua_rotate, which seems your best option
for moving the results).

-- Roberto


Re: block-scope finalization

Roberto Ierusalimschy
In reply to this post by Patrick Donnelly
What is wrong with the following proposal?

  local <some mark to be invented> name = exp

Unlike a regular 'local' declaration, this one must define only one
variable and it must be initialized. (Both restrictions could be easily
removed; they are more about programming style.)

When the local 'name' goes out of scope, then:

1 - if its value is a function, that function is called (no parameters)
2 - if its value is a table/userdata, its __close (or some other new
name) metamethod, if present, is called.

Otherwise, 'name' is like any other local variable.
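
The two rules can be approximated in today's Lua with a helper function. This is only a sketch under assumptions, not the proposed language feature: scoped is a hypothetical name, the scope is modelled as a function body, and passing the object itself to __close is a guess, since the proposal does not say which arguments the metamethod receives.

```lua
-- Hypothetical helper, not part of Lua: runs body(value) and then applies
-- the proposed scope-exit rules to `value`, even if body raises an error.
local function scoped(value, body)
  local results = table.pack(pcall(body, value))

  -- Rule 1: a function value is called with no parameters.
  if type(value) == "function" then
    value()
  -- Rule 2: a table/userdata has its __close metamethod called, if present.
  elseif type(value) == "table" or type(value) == "userdata" then
    local mt = getmetatable(value)
    local close = type(mt) == "table" and mt.__close
    if close then close(value) end
  end

  if not results[1] then
    error(results[2], 0)  -- re-raise the original error unchanged
  end
  return table.unpack(results, 2, results.n)
end

-- Usage: a function as the "closer".
local log = {}
local answer = scoped(
  function() log[#log + 1] = "cleanup" end,
  function() log[#log + 1] = "body"; return 42 end
)
print(answer, table.concat(log, ","))  -- 42   body,cleanup

-- Usage: a table with a (hypothetical) __close metamethod; the body fails.
local res = setmetatable({ open = true }, {
  __close = function(self) self.open = false end,
})
print(pcall(scoped, res, function() error("boom", 0) end))  -- false   boom
print(res.open)  -- false: closed despite the error
```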

-- Roberto



Re: block-scope finalization

Javier Guerra Giraldez
On Tue, Nov 24, 2015 at 11:52 AM, Roberto Ierusalimschy
<[hidden email]> wrote:
> 1 - if its value is a function, that function is called (no parameters)

i like that option!

of course, when cleaning up is releasing a "resource", then it's
reasonable to call a (library defined) metamethod, but also allowing a
function to be called makes it so much nicer as an end user tool.


--
Javier


Re: block-scope finalization

Javier Guerra Giraldez
In reply to this post by Roberto Ierusalimschy
On Tue, Nov 24, 2015 at 11:52 AM, Roberto Ierusalimschy
<[hidden email]> wrote:
>   local <some mark to be invented> name = exp

now the _really_ hard part: coming up with a name for that "mark to be
invented"...

--
Javier


Re: block-scope finalization

Viacheslav Usov
In reply to this post by Roberto Ierusalimschy
On Tue, Nov 24, 2015 at 5:52 PM, Roberto Ierusalimschy <[hidden email]> wrote:

> 2 - if its value is a table/userdata, its __close (or some other new name) metamethod, if present, is called.

This is good and bad. Good, because it cannot break anything. Bad, because it takes an extra effort for lib developers and may or may not be done uniformly.

The "some mark" is also somewhat bad, because the users will need to be taught a new trick.

These considerations have led me to ref counting as a viable option. I would certainly appreciate hearing your opinion about that.

Cheers,
V.

Re: block-scope finalization

Paul K-2
In reply to this post by Javier Guerra Giraldez
>>   local <some mark to be invented> name = exp
> now the _really_ hard part: coming up with a name for that "mark to be invented"...

"block"? And I think at least the north side of the shed should be
painted yellow...

Paul.


Re: block-scope finalization

Philipp Janda
In reply to this post by Roberto Ierusalimschy
On 24.11.2015 at 17:50, Roberto Ierusalimschy wrote:

>> I thought about reserving stack slots (using `lua_settop`) in front
>> of the arguments/results of `lua_pcall` and then shifting the
>> results so that there are free stack slots available at the top.
>> However, if I can't even call Lua API functions (`lua_remove` or
>> `lua_copy`) in a safe way after the `lua_pcall`, that won't work.
>>
>> Are there any functions that are guaranteed to work without any
>> available stack slots (e.g. `lua_settop` and `lua_pop` should work,
>> as well as `lua_checkstack`)?
>
> All API functions that do not push stuff (as flagged in their documentation)
> should work without any available stack slots.

That criterion does not appear to be sufficient, e.g. `lua_error` doesn't
push elements according to the manual, but `luaG_errormsg` uses
`EXTRA_STACK` space. And some `luaL_` functions might use functions that
push internally (e.g. `luaL_len` or `luaL_ref`).

> That includes both lua_remove and lua_copy (and lua_rotate, which
> seems your best option for moving the results).

I try to target older Lua versions as well (Lua 5.2 and up since I need
yieldable C functions), and a reimplementation of `lua_rotate` would
have to push values for swapping ...

>
> -- Roberto
>

Philipp




Re: block-scope finalization

Viacheslav Usov
In reply to this post by Philipp Janda
On Tue, Nov 24, 2015 at 5:48 PM, Philipp Janda <[hidden email]> wrote:

> This case should be handled in all libraries.

Sure. The question is, is that really the case?

> Adding ref counting wouldn't unbreak the currently broken libraries.

The goal is not to unbreak. It is to keep those broken libs that manage to work in the hands of their users, working.

> And adding ref counting might break existing code: a simple `lua_replace()` can now run finalizers and thus raise errors and cause leaks in the C code.

I do not think I understand this part. Why would ref counting + lua_replace() run finalizers?

Cheers,
V.


Re: block-scope finalization

Patrick Donnelly
In reply to this post by Roberto Ierusalimschy
On Tue, Nov 24, 2015 at 11:52 AM, Roberto Ierusalimschy
<[hidden email]> wrote:

> What is wrong with the following proposal?
>
>   local <some mark to be invented> name = exp
>
> Unlike a regular 'local' declaration, this one must define only one
> variable and it must be initialized. (Both restrictions could be easily
> removed; they are more about programming style.)
>
> When the local 'name' goes out of scope, then:
>
> 1 - if its value is a function, that function is called (no parameters)
> 2 - if its value is a table/userdata, its __close (or some other new
> name) metamethod, if present, is called.
>
> Otherwise, 'name' is like any other local variable.

+1! I think a new metamethod is required as an object may be opened
and closed repeatedly (like a mutex).
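
A sketch of why the two metamethods would differ for such an object (illustration only; Mutex, new_mutex and leave_scope are hypothetical names, and leave_scope merely stands in for what the interpreter would do at scope exit under the proposal). Leaving the scope unlocks the mutex many times over its life, while __gc destroys it once.

```lua
local Mutex = {}
Mutex.__index = Mutex
Mutex.__close = function(self) self.locked = false end   -- scope exit: unlock
Mutex.__gc = function(self) self.destroyed = true end    -- collection: destroy

local function new_mutex()
  return setmetatable({ locked = false, destroyed = false }, Mutex)
end

function Mutex:lock()
  assert(not self.locked, "mutex already locked")
  self.locked = true
  return self
end

-- Stand-in for the interpreter's action when a marked local goes out of
-- scope under the proposal.
local function leave_scope(value)
  local mt = getmetatable(value)
  if mt and mt.__close then mt.__close(value) end
end

local m = new_mutex()
for round = 1, 3 do
  local guard = m:lock()   -- would be the marked local
  -- ... critical section ...
  leave_scope(guard)       -- unlocks, but does not destroy
end
print(m.locked, m.destroyed)  -- false   false
```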

Open question: should __close be called after or before the "message
handler" for lua_pcall? I think after makes sense as the handler may
want to look at the open objects on the stack.

--
Patrick Donnelly


Re: block-scope finalization

Roberto Ierusalimschy
In reply to this post by Viacheslav Usov
> These considerations have led me to ref counting as a viable option. I
> would certainly appreciate hearing your opinion about that.

I don't think ref counting is a viable option. It has problems with
cycles and it is slow.
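
The cycle problem can be seen in a toy model (a sketch, not from the original message, and not how Lua manages memory; new, retain and release are made-up helpers that keep explicit counts on plain tables). It also hints at the cost: every stored reference has to touch a counter.

```lua
-- Toy reference counting over plain tables (illustration only).
local function new(name)
  return { name = name, refs = 0, freed = false, fields = {} }
end

local function retain(obj)
  obj.refs = obj.refs + 1
  return obj
end

local function release(obj)
  obj.refs = obj.refs - 1
  if obj.refs == 0 then
    obj.freed = true
    -- Dropping an object drops the references it holds.
    for _, child in pairs(obj.fields) do release(child) end
  end
end

-- A chain: parent -> child. Releasing the parent frees both.
local parent, child = retain(new("parent")), new("child")
parent.fields.child = retain(child)
release(parent)
print(parent.freed, child.freed)  -- true   true

-- A cycle: a <-> b. After the outside references are gone, each object
-- is still kept alive by the other, so neither count ever reaches zero.
local a, b = retain(new("a")), retain(new("b"))
a.fields.other = retain(b)
b.fields.other = retain(a)
release(a)
release(b)
print(a.freed, b.freed, a.refs, b.refs)  -- false   false   1   1
```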

-- Roberto
