Language lawyer corner: object liveness

Language lawyer corner: object liveness

Sergey Zakharchenko
Hello list,

Lua 5.4 has somehow turned many people's attention to finalization, and
to how they were relying on behavior that was there in the code but
never actually documented. Lua (just like many other languages) can (but
doesn't have to) clean up an object as soon as it detects that the
object cannot possibly be used by regular code (I assume that access via
debug.getlocal & friends does not count as regular use). E.g. consider
the following:

local function f()
   local t={g()}
   -- ... code that never uses t
   return t.x
end

A sufficiently advanced optimizer knows that t's constructor cannot
create any non-numeric fields. It also knows t doesn't escape (as it
isn't used anywhere), so t.x == nil. Therefore, it doesn't need t at
all, and rewrites f to:

local function f()
   g() -- called for side effects only
   -- ... code that never uses t
   return nil
end

One way to avoid this, if needed, seems to be to add code that does
something side-effectful (and possibly error-throwing) with t, guarded
by a condition that you know never holds but that the optimizer can't
prove false. E.g. "if _G[nil] then print(t) end".
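
A minimal sketch of the guard trick described above, assuming a stand-in g() (hypothetical, not from the original post). The guard is written as an if statement because a bare "a and b" expression is not a valid statement in Lua. Whether a given optimizer is actually fooled by this is an assumption, not something the manual promises.

```lua
-- Hypothetical stand-in for g(); any function returning a value would do.
local function g()
  return { name = "resource" }
end

local function f()
  local t = { g() }
  -- ... code that never uses t ...
  -- Reading _G[nil] is legal and yields nil, so print(t) never runs,
  -- but t is still referenced at this point in the source.
  if _G[nil] then print(t) end
  return t.x
end
```

The result of f() is unchanged by the guard; only the last textual use of t moves to the end of the function.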

The problem is that people often need "trust me, I need the variable
to stay put, but I also don't want to have to remember to do something
meaningful to it at the end of the block" semantics, and that seems a
bit tricky to achieve. Maybe this new annotation syntax could
help?..

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Thijs Schreijer


> On 6 Jun 2019, at 08:21, Sergey Zakharchenko <[hidden email]> wrote:

I get the explanation. But what is the problem? In the example you gave everything works as expected, optimised or not.

Thijs


Re: Language lawyer corner: object liveness

Sergey Zakharchenko

Thijs,

> I get the explanation. But what is the problem? In the example you gave everything works as expected, optimised or not.

Some would expect the objects returned by g() to survive until the end of the scope (e.g. they're OS locks being held). Neither the documentation nor the developers promise this, but it happens to be implemented that way, and some people rely on it. The problem is that "create-local-and-forget"-style 'RAII' seems to be, as of now, fundamentally incompatible with the "might die as soon as it's surely not accessible" semantics. You must not forget to unlock the lock manually; otherwise you risk having it unlocked for you *earlier*, which is a bit unexpected for some people but makes perfect sense to others.

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Sergey Zakharchenko
List,

Now that I re-read the 5.4 reference on to-be-closed variables, *if*
the requirement to call the __close metamethod *exactly* when the
variable goes out of scope counts as the object being needed for
normal execution (though it's a bit of a stretch to call it accessible
from Lua outright, as __close might be a C function never calling back
into Lua), it looks like to-be-closed variables *do* implement the
"create-local-and-forget"-style 'RAII' we're after. Maybe this needs
an extra mention in the manual. Note that variable scope isn't mentioned
much in the manual (it's defined but not used much, and I'm not
quite sure what a "non-void" statement is in Lua).
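
A minimal sketch of that "create-local-and-forget" pattern, assuming the syntax of the released Lua 5.4 ("local name <close>"); the work version discussed in this thread spelled the attribute differently. new_lock and events are hypothetical names used only for illustration.

```lua
-- Requires Lua 5.4. new_lock and events are hypothetical.
local events = {}

local function new_lock(name)
  return setmetatable({ name = name }, {
    -- Called with the value and the error object (nil if no error).
    __close = function(self, err)
      events[#events + 1] = "unlock " .. self.name
    end,
  })
end

local function critical_section()
  local lock <close> = new_lock("A")
  events[#events + 1] = "work"
  -- lock is never mentioned again; __close still runs on scope exit
end

critical_section()
```

The point of the sketch is that the unlock happens at scope exit even though no later statement refers to lock.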

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Gé Weijers
The garbage collector in 5.4 implements a generational mode. If an object survives the minor collections, it may take a very, very long time before its __gc metamethod gets called after it becomes inaccessible, especially if your program mostly creates short-lived objects. This makes __gc less useful as a poor man's RAII replacement.

The new "toclose" feature is much more useful for releasing resources and unlocking locks in a timely manner.

(If you use finalizers in such a way that the point at which they run matters much, your code is very likely buggy for other reasons as well.)





Re: Language lawyer corner: object liveness

Roberto Ierusalimschy
> (If you use finalizers in such a way that the point at which it is run
> matters much your code is very likely buggy for other reasons as well)

+1. (I just got tired of trying to say that.)  That is why I don't
think the manual should be clearer about when objects are or are not
accessible. It shouldn't matter.

-- Roberto

Re: Language lawyer corner: object liveness

Sergey Zakharchenko
Roberto,

> +1. (I just got tired of trying to say that.)  That is why I don't
> think the manual should be clearer about when objects are or are not
> accessible. It shouldn't matter.

What matters is whether the requirement to call the __close metamethod
when the variable goes out of scope counts as the object being needed
for normal execution. Please just say yes:)

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Philippe Verdy
On Thu, 6 Jun 2019 at 16:30, Sergey Zakharchenko <[hidden email]> wrote:
> What matters is whether the requirement to call the __close metamethod
> on out of scope counts as the object being needed for normal
> execution. Please just say yes:)

We can't answer yes or no to your statement, which is not clear enough:
"counts as the object being needed for normal execution"
is very fuzzy, if it has any meaning at all (how does "a requirement count"?).

And if the object is needed for normal execution, then there should be a reference to it that is actually used and really available at runtime. If there's no such reference and no possible access, then the object can be finalized at any time, but not at any predictable time, not even with an explicit call to the garbage collector. The collector may do only what is necessary for the program to continue working without being affected by a serious resource shortage, and it may still use multiple passes to perform a "complete" garbage collection. A complete collection is undesirable in most cases: it is very costly in CPU time and performance, and it frees more objects than needed, notably all the weak references that are still useful for efficient caches.

If you ever need ordered finalization of an object at a predictable time, don't make your program depend on finalizers, and don't make explicit calls to the garbage collector, which will seriously impact the performance of your application. Instead, use explicit close()/free() methods and fix your program to use them (finalizers will just run as a last chance to limit the impact of resource leaks over some prolonged period of use), or remove all references and let the garbage collector do its work only when it really needs to. You can still separately control the thresholds that cause the garbage collector to run, and set tuning parameters for the free resources it must preserve or the maximum number of unfinalized objects to maintain.

At best, calling the garbage collector explicitly should be done only in a stress test, to see whether your program still behaves correctly under severe resource limits: that it does not hang or crash, and does not cause other running threads to stop working completely. The latter would allow a denial-of-service attack if your program implements a network service accessible to third parties, forcing you to stop the program abruptly and restart it, possibly with the help of a watchdog timer or a time limit in the hosting server. Note that an abrupt interruption does not allow any finalizer to run, so only basic resources are freed, such as the total memory used by the Lua engine instance, file handles and other OS resources, or open sockets. As a consequence, any remote objects or storage are left in an inconsistent state, and your service has to restart by first running a state check and a recovery cleanup or rollback process to repair the detected inconsistencies.

Every well-behaved program should explicitly declare (on a best-effort basis) the resources it no longer needs, independently of finalizers, which are just helper guards against bugs in your program (resource leaks) where it forgets to free these resources explicitly. Lua is already smart enough to automatically detect objects that are no longer in scope and have no secondary reference stored elsewhere in another accessible object.
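
The "explicit close, finalizer only as a guard" advice above could be sketched roughly as follows. Resource, open, close and is_open are hypothetical names chosen for illustration; __gc on tables assumes Lua 5.2 or later.

```lua
-- Hypothetical Resource class: explicit close() first, __gc only as a guard.
local Resource = {}
Resource.__index = Resource

-- Idempotent release; returns true only the first time it does the work.
function Resource:close()
  if not self.is_open then return false end
  self.is_open = false
  return true
end

-- Set __gc before any object is created: Lua marks an object for
-- finalization when setmetatable sees a __gc field in the metatable.
Resource.__gc = Resource.close

function Resource.open(name)
  return setmetatable({ name = name, is_open = true }, Resource)
end

local r = Resource.open("example")
local first = r:close()   -- explicit, predictable release
local second = r:close()  -- harmless repeat, as a later finalizer call would be
```

Making close() idempotent is a design choice here: it lets the finalizer run safely after an explicit close without releasing anything twice.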

Re: Language lawyer corner: object liveness

Philippe Verdy
In reply to this post by Sergey Zakharchenko
On Thu, 6 Jun 2019 at 08:21, Sergey Zakharchenko <[hidden email]> wrote:
> The problem is that people often need to have "trust me, I need the
> variable to stay put but I also don't want to have to remember to do
> something meaningful at the end of the block to it" semantics and it
> seems a bit tricky to achieve. Maybe this new annotation syntax could
> help?..

An annotation is not necessary; the best way to achieve it is to insert a valid reference to the object in the relevant function.

That reference is not necessarily a read access, it can be a simple "delete x" or "x:close()" or "x:free()" statement (which uses the reference, but does not necessarily perform any read access to its inner fields or value).
 
Re: Language lawyer corner: object liveness

Sergey Zakharchenko
In reply to this post by Philippe Verdy

Philippe, Gé,

I 'like' how you reduce my __close-related question to __gc ones. Unfortunately you're missing the point. __close is not a finalizer. Why you keep insisting I shouldn't rely on the precise timing of calling __close is beyond my understanding. The thing is, I want it (obviously) to run on a non-garbage-collected object. Just like any regular code.

There's a very simple use case documented right in the 5.4 manual for an object that might have no references to it visible from Lua (and possibly from any C function Lua may call), yet has to live to see its __close metamethod called at the right time. Look at the generic for loop.
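
A minimal sketch of that generic for case, assuming the released Lua 5.4 semantics in which the fourth value of the loop's explist is a closing value. items_of, closer and seen are hypothetical names.

```lua
-- Requires Lua 5.4. items_of is a hypothetical iterator factory.
local closed = false

local function items_of(list)
  local i = 0
  local closer = setmetatable({}, {
    __close = function() closed = true end,
  })
  local function iter()
    i = i + 1
    return list[i]
  end
  -- Fourth value: the closing value, held only by the loop itself.
  return iter, nil, nil, closer
end

local seen = {}
for item in items_of({ "a", "b", "c" }) do
  seen[#seen + 1] = item
  if item == "b" then break end
end
```

After items_of returns, no variable visible in the loop body names closer, yet it must stay alive until the loop exits so that its __close metamethod can be called.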

The answer to my question cannot be 'no', that's the point:).

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Dibyendu Majumdar
On Thu, 6 Jun 2019 at 17:10, Sergey Zakharchenko
<[hidden email]> wrote:
> There's a very simple use case documented right in the 5.4 manual for an object that might have no references to it visible from Lua (and possibly from any C function Lua may call), yet it has to live to see its its metamethod __close called at the right time. Look at the generic for loop.
>
> The answer to my question cannot be 'no', that's the point:).
>

Hi, I don't know if I understood the point here so forgive me if I didn't.

I think you are saying that if a variable is marked 'toclose' and has
a __toclose() method then Lua must _guarantee_ the execution of the
__toclose() method when the variable goes out of scope, regardless of
GC.

If I understood you correctly then it seems a valid requirement. In
languages such as Java this problem doesn't arise because one writes
like this:

Resource r = null;
try {
   r = acquire();
}
finally {
   if (r != null)
      r.close();
}

So by the very structure the variable is alive at the time of cleanup.
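
For comparison, a rough Lua counterpart of the same structure that does not rely on to-be-closed variables, using pcall. with_resource and res are hypothetical names, and this sketch passes through only a single result from the body.

```lua
-- Hypothetical helper; works on Lua versions without to-be-closed variables.
local function with_resource(acquire, body)
  local r = acquire()
  local ok, result = pcall(body, r)
  r:close()                            -- runs on success and on error alike
  if not ok then error(result, 0) end  -- re-raise after cleanup
  return result
end

-- Usage with a hypothetical resource that records whether it was closed.
local res = { closed = false }
function res:close() self.closed = true end

local ok, err = pcall(with_resource,
  function() return res end,
  function(r) error("boom", 0) end)
```

As in the Java version, r is named explicitly at the cleanup point, so it is plainly alive when close() is called.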

Regards

Re: Language lawyer corner: object liveness

Sergey Zakharchenko

Dibyendu,

> I think you are saying that if a variable is marked 'toclose' and has
> a __toclose() method then Lua must _guarantee_ the execution of the
> __toclose() method when the variable goes out of scope, regardless of
> GC.

Quite close. I would also expect the guarantee that the object isn't finalized (e.g. its __gc metamethod hasn't run) by the time __close is called. Just want to make this explicit. And yes, I think there's nothing out of the ordinary in what I'm asking as well, just a confirmation of this in the manual would be sufficient. Thanks for your reply!

Best regards,

--
DoubleF

Re: Language lawyer corner: object liveness

Gé Weijers
In reply to this post by Sergey Zakharchenko
Hi Sergey,

On Thu, Jun 6, 2019 at 9:10 AM Sergey Zakharchenko <[hidden email]> wrote:

> I 'like' how you reduce my __close-related question to __gc ones. Unfortunately you're missing the point. __close is not a finalizer. Why you keep insisting I shouldn't rely on the precise timing of calling __close is beyond my understanding. The thing is, I want it (obviously) to run on a non-garbage-collected object. Just like any regular code.

__close works pretty much like the "using (var x = new YourType()) { ... }" construct in C#. The object must implement an interface, and that interface gets called at the end of the block or when handling an error/exception that exits the block. The behavior is completely deterministic, and the order of events does not depend on the garbage collector. It's the best new feature in 5.4 IMHO.

If the object implements both __close and __gc then __close is called first, unless you do things like using "local <toclose>" or changing the metatable inside of __gc.
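
A small sketch of that ordering, assuming the released Lua 5.4 syntax. The metatable mt and the log table are hypothetical; when exactly the finalizer runs is left to the collector, so the sketch only records what has happened by the time the scope exits.

```lua
-- Requires Lua 5.4. The metatable and log are hypothetical.
local log = {}
local mt = {
  __close = function() log[#log + 1] = "close" end,
  __gc    = function() log[#log + 1] = "gc" end,
}

local function scope()
  local obj <close> = setmetatable({}, mt)
  -- obj is not used again, but stays alive until __close has run
end

scope()
local after_scope = log[1]
collectgarbage()  -- the finalizer may run here or later; timing is up to the GC
```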



Re: Language lawyer corner: object liveness

Sergey Zakharchenko
In reply to this post by Sergey Zakharchenko
List,

I'm afraid I'm able to convey what I want to say to only a small
fraction of people, so I'll try to refrain from posting further on this
topic after this email. I'm not after any changes to the Lua
implementation. Rather, I propose some clarifications in the Lua
manual (which is the closest thing we have to a standard) for
consistency with the "no longer accessible => OK to GC" rule (otherwise
the to-be-closed variables don't make sense at all).

Minor edit:

> A to-be-closed variable behaves like a constant local variable, except that its value is closed whenever the variable goes out of scope, including normal block termination, exiting its block by break/goto/return, or exiting by an error.

to read:

> A to-be-closed variable behaves like a constant local variable, except that its value is closed whenever the variable goes out of scope, including normal block termination, exiting its block by break/goto/return, or exiting by an error caught in the same coroutine (see 2.3 Error Handling).

Rationale: otherwise, this obviously contradicts 'Similarly, if a
coroutine ends with an error, it does not unwind its stack, so it does
not close any variable' below.

Then, and this is the main thing: after:

> Here, to close a value means to call its __close metamethod. If the value is nil, it is ignored; otherwise, if it does not have a __close metamethod, an error is raised. When calling the metamethod, the value itself is passed as the first argument and the error object (if any) is passed as a second argument; if there was no error, the second argument is nil.

add the following paragraph:

> To ensure the above guarantee of calling the value's __close metamethod, a reference to it is stored in the coroutine being run until the call is performed.

Rationale: all that I've been trying to tell you so far. No reference
means OK to GC, most likely not what we want. Dibyendu's Java sample
and Gé's C# sample are obvious because the variable is clearly visible
in scope. Not so for a Lua generic for loop closing value.

Here, we make it clear the coroutine references the value of the
to-be-closed variable, so as long as the coroutine lives, the value of
the to-be-closed variable cannot be GCd.

Minor nitpick: edit:

> If a coroutine yields inside a block and is never resumed again, the variables visible at that block will never go out of scope, and therefore they will never be closed.

to read:

> If a coroutine yields inside a block and is never resumed again, its stored to-be-closed variables will never go out of scope, and therefore they will never be closed.

Here we reuse the notion of to-be-closed variables belonging to the
coroutine, and no longer restrict the statement to variables visible
at that block only, but extend it to previous call stack levels (which
I believe was the original intent).
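
A minimal sketch of a to-be-closed value pending in a suspended coroutine, assuming the released Lua 5.4, where coroutine.close is available to close such variables explicitly. guard and the flag names are hypothetical.

```lua
-- Requires Lua 5.4 (coroutine.close is part of the released 5.4 library).
local closed = false

local co = coroutine.create(function()
  local guard <close> = setmetatable({}, {
    __close = function() closed = true end,
  })
  coroutine.yield()
end)

coroutine.resume(co)
local closed_while_suspended = closed  -- guard is still pending in co
local ok = coroutine.close(co)         -- closes pending to-be-closed variables
```

While co is suspended, the value of guard is reachable only through the coroutine, which is the situation the proposed wording tries to spell out.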

I honestly don't see how the above changes could restrict any
implementation that wouldn't be absurd without them.

Best regards,

--
DoubleF