block-scope finalization

Re: block-scope finalization

Roberto Ierusalimschy
> >2 - if its value is a table/userdata, its __close (or some other new
> >name) metamethod, if present, is called.
> What's the problem with the __gc metamethod?

As it was discussed here, you may not want to allow users to call __gc
directly. It does not seem too complex to use a different metamethod, and
it is simple to do __close=__gc if you want.
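
A minimal sketch of the aliasing mentioned above, assuming the proposed `__close` name were adopted (at this point in the thread it is only a proposal). `Resource` and `release` are illustrative names, and the last line merely stands in for what a scope exit would do; nothing here invokes `__close` automatically.

```lua
-- Hypothetical: __close is the metamethod name proposed in this thread.
local function release(self)
  self.released = true -- stands in for the real cleanup work
end

local Resource = {}
Resource.__index = Resource
Resource.__gc = release          -- run by the collector (for tables: Lua 5.2+)
Resource.__close = Resource.__gc -- same handler, reachable at scope exit

local r = setmetatable({}, Resource)

-- Stand-in for the proposed scope exit: call the metamethod explicitly.
getmetatable(r).__close(r)
```

Because both fields point at the same function, the handler may run twice for one object (once at scope exit, once at collection), so it should tolerate that.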

-- Roberto

Re: block-scope finalization

Tom Sutcliffe
In reply to this post by Roberto Ierusalimschy
On 24 Nov 2015, at 4:52 pm, Roberto Ierusalimschy <[hidden email]> wrote:

> What is wrong with the following proposal?
>
>  local <some mark to be invented> name = exp
>
> Unlike a regular 'local' declaration, this one must define only one
> variable and it must be initialized. (Both restrictions could be easily
> removed; they are more about programming style.)
>
> When the local 'name' goes out of scope, then:
>
> 1 - if its value is a function, that function is called (no parameters)
> 2 - if its value is a table/userdata, its __close (or some other new
> name) metamethod, if present, is called.
>

I do like the sound of that. It nicely sidesteps the whole GC-vs-refcounting scoping debate, because it won't be finalising or deallocing the value concerned, only calling its __close metamethod, which really is all I've ever wanted from such a construct. I don't really ever care about deterministically deallocing memory, only about making sure cleanup code can be run when I need it (and if I do need deterministic deallocing, I can write a native-code __close method to do that). I am generally happy with the GC behaviour for all scenarios other than it not offering deterministic cleanup of block-scoped values, and by not having to alter the language's memory management in the slightest this sounds like the best of both worlds.
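
The two rules quoted above can be approximated in plain Lua today, which may help make the semantics concrete. This is an emulation under stated assumptions, not the proposed syntax: `scoped`, `finish` and `close_value` are hypothetical helper names, `__close` is the metamethod name suggested in the proposal, and the callback stands in for the block.

```lua
-- Emulation only: applies the proposal's two rules to a single value.
local function close_value(value)
  local t = type(value)
  if t == "function" then
    value()                                -- rule 1: call with no arguments
  elseif t == "table" or t == "userdata" then
    local mt = getmetatable(value)
    local close = type(mt) == "table" and mt.__close
    if close then close(value) end         -- rule 2: __close, if present
  end                                      -- any other value: nothing happens
end

-- Runs the cleanup whether the body returned normally or raised an error.
local function finish(value, ok, ...)
  close_value(value)
  if not ok then error((...), 0) end       -- re-raise the body's error
  return ...
end

-- `scoped(value, body)` plays the role of a block that declares `value`.
local function scoped(value, body)
  return finish(value, pcall(body, value))
end

local log = {}
local file = setmetatable({}, {
  __close = function() log[#log + 1] = "closed" end,
})
scoped(file, function(f) log[#log + 1] = "working" end)

-- A function value is simply called at exit, even if the body fails.
local ok = pcall(scoped,
  function() log[#log + 1] = "fn called" end,
  function() error("boom") end)
```

Note that nothing is deallocated here; only the cleanup hook runs, which is the point made above.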

I've never looked at how CPython implements its hybrid refcount-plus-gc model, so I don't know if it's a horrible mess that we'd never want in a language like Lua, but I'm not automatically against the concept of a hybrid memory management mechanism. But I think it's a separate issue from what we actually want here, which is deterministic cleanup based on variable scope, and we shouldn't confuse the two things. In other words, refcounting is one way of achieving that, but clearly not the only way, and Roberto's proposal of extending the upvalue mechanism seems just as good and much cleaner. Indeed I've seen Python code which implicitly *relies* on objects getting refcounted cleanup, which is very fragile, and we don't want to move toward a syntax which relies on a particular memory model implementation.

I'd add one suggestion however, which would be instead of "local <mark> name" there is a new one-word keyword to indicate this special meaning, perhaps "blocklocal name".

Doing this would avoid the sorts of ambiguity/potential for mistakes that C constructs like "char* foo, bar" introduce when more than one variable is declared (written like that you might think that bar was a char* not just a char). Also I've just never liked multi-word types/modifiers like C's "unsigned int". I'd also point to Lua's use of "elseif" rather than "else if" as a previous example of where one token is preferred over two.

I'd suggest also that it might be simpler to not have the requirements that this only takes one variable which must be initialised - since presumably you must still be allowed to reassign to the variable (including assigning it nil) and thus there shouldn't be a requirement that the variable never changes its value during the block scope, it seems a bit confusing to require that it have a value at the start. I assume that only the value which the variable had at scope exit would be the thing that got __close()d? And having a single "blocklocal" keyword makes it clearer that the block-ness applies to every variable declared and avoids even having to document that "local foo, block bar" isn't valid.

Regards,

Tom

Re: block-scope finalization

Viacheslav Usov
On Thu, Jan 14, 2016 at 1:38 PM, Tom Sutcliffe <[hidden email]> wrote:
> On 24 Nov 2015, at 4:52 pm, Roberto Ierusalimschy <[hidden email]> wrote:
>
> > What is wrong with the following proposal?
> >
> >  local <some mark to be invented> name = exp
> >
> > Unlike a regular 'local' declaration, this one must define only one
> > variable and it must be initialized. (Both restrictions could be easily
> > removed; they are more about programming style.)
> >
> > When the local 'name' goes out of scope, then:
> >
> > 1 - if its value is a function, that function is called (no parameters)
> > 2 - if its value is a table/userdata, its __close (or some other new
> > name) metamethod, if present, is called.
> >
>
> I do like the sound of that. It nicely sidesteps the whole GC-vs-refcounting scoping debate, because it won't be finalising or deallocing the value concerned, only calling its __close metamethod, which really is all I've ever wanted from such a construct.


I agree that doing something about this issue is better than doing nothing at all. The issue is quite real. Yet, the mechanism proposed by Roberto is a low-level mechanism and I would like to repeat what I wrote earlier on the subject:

(begin)

But I do not like the idea that Lua programs need to be using lowest levels of abstraction and micromanagement that comes with that.

If we forget, for a second, "external" resources, such as files, then Lua's internal resources (memory) are managed automatically. The user does not have to deal with that at all, at least in principle. Why should that be different for external resources? The only reason the user has to be involved now is because the language does not provide any means of deterministic finalization to library writers. If we decorate Lua with additional low level means of resource management, they should primarily be means available to library writers; we should not make the user even MORE involved.

On the other hand, if those new means require library writers to follow some new paradigm, then we cannot expect they will be universally adopted. We will have a mess not unlike what C++ is, with its multiple resource management paradigms. The proposed reference counting mechanism ensures that neither users nor library writers need to do anything new. We just improve the behaviour of all the existing code, and let users write simpler yet more efficient code in future.

(end)

I do not really care about ref-counting per se. I only mentioned it because that is the mechanism I am familiar with and also a mechanism widely used for this purpose. What I really want is something that is simple for all developers, including library writers, yet effective. I would list our options thus, more preferable (to me) first:

1. Deterministic garbage collection for all objects without new metamethods and without new language constructs. Optional metamethods controlling this are OK.

2. Deterministic garbage collection with new (not optional) metamethods but without new language constructs

3. Deterministic garbage collection with new language constructs.

4. Block-exit metamethod invocation without new language constructs.

5. Block-exit metamethod invocation with new language constructs.

Roberto's proposal, just like my initial proposal, is #5, least preferable to me, because both library users and library writers need to know and use the new mechanism, and the mechanism is somewhat awkward because the library writer does not really know that the object for which __close is called will not be used again; basically, the user has to ensure that.

Cheers,
V.

Re: block-scope finalization

Sean Conner
In reply to this post by Roberto Ierusalimschy
It was thus said that the Great Roberto Ierusalimschy once stated:

> What is wrong with the following proposal?
>
>   local <some mark to be invented> name = exp
>
> Unlike a regular 'local' declaration, this one must define only one
> variable and it must be initialized. (Both restrictions could be easily
> removed; they are more about programming style.)
>
> When the local 'name' goes out of scope, then:
>
> 1 - if its value is a function, that function is called (no parameters)

  I don't think this is a good idea.  For example:

        do
          local magic print = print -- because I heard this was faster!

          print("hello")
          print("goodbye")
        end

  print() gets called an extra time and an extraneous line of output is
generated.  I purposely picked a function I know has a side effect when
called with no parameters to show the issue here.  We don't know *which*
functions will be assigned to a local value.  So now what?

        do
          local magic print = print
          print("hello")
          print("goodbye")
          print = nil -- WTF?  
        end

  Another issue---how do __call and __close interact?  

        do
          local magic foo = {}
          setmetatable(foo,{ __call = function(self) ... end })

          foo(3)
        end

foo has call semantics, so is it

        foo(nil)

or

        getmetatable(foo).__close(foo)

  Yes, it might seem clear that __close() happens because foo is a table,
but a reasonable case might be made for the confusion.
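
  One observation that may narrow the ambiguity: `type()` reports "table" for a table even when its metatable defines `__call`, so a rule keyed on `type()` would select `__close` here. Whether the eventual mechanism would dispatch on `type()` is an assumption of this sketch, and `__close` is still the hypothetical name from the proposal.

```lua
local closed = false

local foo = setmetatable({}, {
  __call = function(self, n) return n end,
  -- Hypothetical metamethod from the proposal; nothing calls it implicitly.
  __close = function(self) closed = true end,
})

local result = foo(3)        -- callable through __call
local kind = type(foo)       -- still "table", not "function"

-- Under a type()-based rule, scope exit would amount to this call:
if kind == "function" then
  foo()
else
  getmetatable(foo).__close(foo)
end
```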

> 2 - if its value is a table/userdata, its __close (or some other new
> name) metamethod, if present, is called.

  This I don't have an issue with.

  -spc


Re: block-scope finalization

Patrick Donnelly
In reply to this post by Tom Sutcliffe
On Thu, Jan 14, 2016 at 7:38 AM, Tom Sutcliffe <[hidden email]> wrote:
> I'd add one suggestion however, which would be instead of "local <mark> name" there is a new one-word keyword to indicate this special meaning, perhaps "blocklocal name".
>
> Doing this would avoid the sorts of ambiguity/potential for mistakes that C constructs like "char* foo, bar" introduce when more than one variable is declared (written like that you might think that bar was a char* not just a char). Also I've just never liked multi-word types/modifiers like C's "unsigned int". I'd also point to Lua's use of "elseif" rather than "else if" as a previous example of where one token is preferred over two.

This will likely complicate assignment from functions that return
multiple values. Take io.open:

local <mark> f, err = io.open"foo"

I don't really want this to become the new way of doing things:

<mark> f
local err
f, err = io.open "foo"

It would be better to annotate each name somehow when it is declared,
in my opinion.

> I'd suggest also that it might be simpler to not have the requirements that this only takes one variable which must be initialised - since presumably you must still be allowed to reassign to the variable (including assigning it nil) and thus there shouldn't be a requirement that the variable never changes its value during the block scope, it seems a bit confusing to require that it have a value at the start. I assume that only the value which the variable had at scope exit would be the thing that got __close()d? And having a single "blocklocal" keyword makes it clearer that the block-ness applies to every variable declared and avoids even having to document that "local foo, block bar" isn't valid.

I think what Roberto was trying to say is you shouldn't assign a value
that is not ready to be closed. For example, don't assign an unlocked
mutex as an error may happen before you try to lock it. This would
cause Lua to try to unlock (__close) it possibly causing an error.

Roberto didn't say this either but I suspect it was implied: if the
value is not a function and has no __close metamethod (e.g. nil), then
nothing happens when the "local <mark>" scope ends. This is consistent
with the __gc metamethod. And actually, Lua doesn't even check the
__gc metamethod for some types. Probably it wouldn't check __close for
the same types.

--
Patrick Donnelly

Re: block-scope finalization

Tom Sutcliffe

> On 14 Jan 2016, at 5:42 pm, Patrick Donnelly <[hidden email]> wrote:
>
> This will likely complicate assignment from functions that return
> multiple values. Take io.open:
>
> local <mark> f, err = io.open"foo"

I don't think there's any obvious problem with 'err' getting the <magic> treatment as well? It (generally) won't be a table with a __close so the extra attribute won't have any effect? I occasionally find myself writing things like

local err
f, err = func(...)

because I want err to be local (if there is an error) but f should go straight into _ENV, but I don't think this will make that construct any more common than it is already.

> I think what Roberto was trying to say is you shouldn't assign a value
> that is not ready to be closed. For example, don't assign an unlocked
> mutex as an error may happen before you try to lock it. This would
> cause Lua to try to unlock (__close) it possibly causing an error.

I was assuming that if one is going to implement a __close metamethod, one has to go to the effort of making sure it behaves itself, i.e. that it can be safely called multiple times, has no effect if the object in question isn't actually open (for whatever 'open' means in this situation), etc. That's established good practice in pretty much all "close" APIs, I think. For Lua file objects, for example, __gc already does this anyway, so no additional work would be necessary other than to make f.__close = f.__gc.
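
A minimal sketch of a well-behaved close handler of the kind described above: safe to call repeatedly, and a no-op when the object was never opened. `Handle`, `acquire` and the `closes` counter are illustrative names, and `__close` is the hypothetical metamethod under discussion.

```lua
local Handle = {}
Handle.__index = Handle

function Handle.new()
  return setmetatable({ open = false, closes = 0 }, Handle)
end

function Handle:acquire()
  self.open = true
end

function Handle:close()
  if not self.open then return false end -- never opened, or already closed
  self.open = false
  self.closes = self.closes + 1          -- stands in for the real release work
  return true
end

Handle.__close = Handle.close -- hypothetical metamethod from the proposal

local h = Handle.new()
local before = h:close()  -- not open yet: no effect
h:acquire()
local first = h:close()   -- performs the release
local second = h:close()  -- already closed: no effect
```

With a handler like this, the unlocked-mutex case quoted above would not raise an error at scope exit; it would simply do nothing.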

> Roberto didn't say this either but I suspect it was implied: if the
> value is not a function and has no __close metamethod (e.g. nil), then
> nothing happens when the "local <mark>" scope ends. This is consistent
> with the __gc metamethod. And actually, Lua doesn't even check the
> __gc metamethod for some types. Probably it wouldn't check __close for
> the same types.

Agreed, that's what I assumed.

Cheers,

Tom

