Reentrant GC

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Reentrant GC

Soni L.
Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?

I could really use this, but it doesn't seem to be a thing.
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

云风 Cloud Wu
Soni L. <[hidden email]> 于2018年11月27日周二 上午10:53写道:
>
> Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?
>
> I could really use this, but it doesn't seem to be a thing.

I think you can minimize the __gc metamethod , just putting the dead
object to a table in __gc . And call the destructor of objects by
iterating this table in main loop.


--
http://blog.codingnow.com

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Coda Highland
In reply to this post by Soni L.
On Mon, Nov 26, 2018 at 8:53 PM Soni L. <[hidden email]> wrote:
>
> Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?
>
> I could really use this, but it doesn't seem to be a thing.

That's not the first thing I think of when I hear "reentrant." Lua's
GC *is* reentrant by the definition I usually use, which is to say
that multiple threads can GC distinct Lua states concurrently.

Reentrancy by the notion you suggest sounds pretty tricky. I can
envision how to do it if I were to build a GC from scratch, but given
how much of a niche use case it is I'm not sure if it would be worth
the effort.

What's your use case? Why can't you do something like queue up an
action inside your __gc metamethod that you can process after the
sweep has ended?

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
That's a good approach, given that GC performs naturally a loop of cycles that allows a object to not be swept immediately but collected again in the next cycle (after a new collect phase).

I think that what Sony L. wants is to be able to GC recursively within the same thread, so that a __gc metamethod can call any other function that could allocate new objects (so that it would potentially call a GC; however this call is blocked by the fact the thread has already has an active sweeping phase, so any attempt to perform another GC does nothing and the GC will have to wait for the next cycle).

I can imagine that there are cases wher such blocking may cause memory to never be deallocated beause each GC actually will increase the number of objects allocated at each cycle without being able to free any one.

This is not really lack of "reentrance" but lack of "recursivity" (which is already blocked silently as soon as the GC is invoked) which may cause problem (only if your __gc metamethods are doing very complex things where it would require allocating new additional objects that also need such __gc metamethod for their own finalization).

Reentrance on the opposite should not be a problem (unless there's some communication/synchronization mechanism between threads, that forces one thread to wait for the completion of the GC of another thread, in which case you would get a deadlock where one of the thread would have their GC to be blocked and doing then nothing in their own GC cycle, all being delayed for the next cycle).

It's not easy to control that recursivity only during the finalization of objects (in the __gc metamethods), without the help of a background thread dedicated to control GC in all threads (I note for example that Java, Javascript, and other languages perform GC only via a dedicated thread to serialize things correctly; and I see similar things at OS level for their OS-level threads; and as well there's a dedicated background process in the OS dedicated to managing the global heaps needed and used by all other processes, or system drivers; may be this is caused by the complex problem of recursion: instead of recursing a thread (or process) just is blocked by serializing into the system helper thread or process; this can caus a process or thread to freeze temporarily all other processes or threads).

On a single-tasking system where there's no multithreading or multiprocessing, but only cooperative coroutines, the GC becomes blocking if ever there's a need for recursion (but recursion is not possible from within the dedicated coroutine) and the application is naturally blocked as another coroutine which is not being resumed before the mark/sweep phase is complete and finalization has been properly applied without being delayed indefinitely to a never ending loop of GC cycles (which could cause significant CPU processing time without ever entering the thread in a idle cycle, except by forced pauses to do nothing).

There's not a lot of programs that use __gc metamethods for finalization. Basically most uses is for terminating I/O and freeing resources when I/O completes (i.e. it is for emulating async I/O, e.g. for network sockets, which don't necessarily run in a separate thread but in a coroutine of the same thread): for such use, there's normally no need to allocate new memory resources, iut's enough to "free" them by removing a reference, that will then cause the objects to be collected and swept in the next cycle without needing additional I/O, so the finalization is much simpler and can be delayed safely without creating infinite loops.

If your __gc finlization metamethods makes more complex things requiring allocation of new resources, in my opinion the program has a design bug: these resources needed to free objects should have been allocated wit hthe object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).

Such program can be modified to use a message loop based on timers to schedule worker coroutines for its actual work, and another helper coroutine for performing explicit GC cycles on objects whose finalization has been delayed and will be managed externally (not in the worker coroutines themselves): this emulates pseudo-threads (like those in old versions of Windows with non-reemptive kernels and cooperative programs via message loops and some priority lists).

True multithreading in Lua is not easy: we can create threads but there's little way to synchronize them cleanly (except by using external I/O): we don't have things like mutexes, and no critical sections for atomic operations for creating these mutexes or controlling accesses to shared variables and communciation buffers. And there's no scheduler we can control (all threads created in Lua are equal). Only coroutines in the same thread are easily controlable. Threads were added mostly to support applications on servers communicating with many independant clients (where no client can control what the other client does, each client having its own resources that can be freed at once in a single operation where all objects of the thred are instantly marked for finalization and finalization will then run in a tight loop).



Le mar. 27 nov. 2018 à 04:52, Coda Highland <[hidden email]> a écrit :
On Mon, Nov 26, 2018 at 8:53 PM Soni L. <[hidden email]> wrote:
>
> Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?
>
> I could really use this, but it doesn't seem to be a thing.

That's not the first thing I think of when I hear "reentrant." Lua's
GC *is* reentrant by the definition I usually use, which is to say
that multiple threads can GC distinct Lua states concurrently.

Reentrancy by the notion you suggest sounds pretty tricky. I can
envision how to do it if I were to build a GC from scratch, but given
how much of a niche use case it is I'm not sure if it would be worth
the effort.

What's your use case? Why can't you do something like queue up an
action inside your __gc metamethod that you can process after the
sweep has ended?

/s/ Adam

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Tim Hill


On Nov 26, 2018, at 8:54 PM, Philippe Verdy <[hidden email]> wrote:

If your __gc finlization metamethods makes more complex things requiring allocation of new resources, in my opinion the program has a design bug: these resources needed to free objects should have been allocated wit hthe object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).


Exactly. It seems to me the OP is concerned about temporary allocations during the __gc metamethod not being cleaned up until the next GC cycle. Well, welcome to the world of GC!

—Tim

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Soni L.
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

On Tue, Nov 27, 2018, 04:31 Tim Hill <[hidden email] wrote:


On Nov 26, 2018, at 8:54 PM, Philippe Verdy <[hidden email]> wrote:

If your __gc finlization metamethods makes more complex things requiring allocation of new resources, in my opinion the program has a design bug: these resources needed to free objects should have been allocated wit hthe object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).


Exactly. It seems to me the OP is concerned about temporary allocations during the __gc metamethod not being cleaned up until the next GC cycle. Well, welcome to the world of GC!

—Tim

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy-2
There's a way to restrict that infinite loop so that the __gc metamorphic will not be systematically be called at each cycle but only after a growing delay (exponential growth) has passed. This will rapidly stop the infinite loop even if these objects get never finalized, and at least this will avoid consuming lot of CPU resources.

Le mar. 27 nov. 2018 à 10:30, Soni L. <[hidden email]> a écrit :
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

On Tue, Nov 27, 2018, 04:31 Tim Hill <[hidden email] wrote:


On Nov 26, 2018, at 8:54 PM, Philippe Verdy <[hidden email]> wrote:

If your __gc finlization metamethods makes more complex things requiring allocation of new resources, in my opinion the program has a design bug: these resources needed to free objects should have been allocated wit hthe object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).


Exactly. It seems to me the OP is concerned about temporary allocations during the __gc metamethod not being cleaned up until the next GC cycle. Well, welcome to the world of GC!

—Tim

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy-2
Metafunction, not metamorphic. Damned Android substitution...

Le mar. 27 nov. 2018 à 11:29, Philippe Verdy <[hidden email]> a écrit :
There's a way to restrict that infinite loop so that the __gc metamorphic will not be systematically be called at each cycle but only after a growing delay (exponential growth) has passed. This will rapidly stop the infinite loop even if these objects get never finalized, and at least this will avoid consuming lot of CPU resources.

Le mar. 27 nov. 2018 à 10:30, Soni L. <[hidden email]> a écrit :
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

On Tue, Nov 27, 2018, 04:31 Tim Hill <[hidden email] wrote:


On Nov 26, 2018, at 8:54 PM, Philippe Verdy <[hidden email]> wrote:

If your __gc finlization metamethods makes more complex things requiring allocation of new resources, in my opinion the program has a design bug: these resources needed to free objects should have been allocated wit hthe object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).


Exactly. It seems to me the OP is concerned about temporary allocations during the __gc metamethod not being cleaned up until the next GC cycle. Well, welcome to the world of GC!

—Tim

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Pierre Chapuis
In reply to this post by Soni L.
On Tue, Nov 27, 2018, at 10:29, Soni L. wrote:
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

So this is more about debug hooks not running during `__gc` then?

This is a very real problem that has existed for a very long time [1].
I don't know another solution than not allowing untrusted users to set `__gc`.

All sandboxes I know about (including those implemented in C) that do and
don't do something very violent like spawning a thread and killing it after some
time when unresponsive are somehow vulnerable to this.


-- 
Pierre Chapuis
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Soni L.
The another solution is a reentrant/recursive GC, as far as I know.

On Tue, Nov 27, 2018, 08:44 Pierre Chapuis <[hidden email] wrote:
On Tue, Nov 27, 2018, at 10:29, Soni L. wrote:
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

So this is more about debug hooks not running during `__gc` then?

This is a very real problem that has existed for a very long time [1].
I don't know another solution than not allowing untrusted users to set `__gc`.

All sandboxes I know about (including those implemented in C) that do and
don't do something very violent like spawning a thread and killing it after some
time when unresponsive are somehow vulnerable to this.


-- 
Pierre Chapuis
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Pierre Chapuis
Yes maybe, I meant with stock Lua.

On Tue, Nov 27, 2018, at 11:49, Soni L. wrote:
The another solution is a reentrant/recursive GC, as far as I know.

On Tue, Nov 27, 2018, 08:44 Pierre Chapuis <[hidden email] wrote:

On Tue, Nov 27, 2018, at 10:29, Soni L. wrote:
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

So this is more about debug hooks not running during `__gc` then?

This is a very real problem that has existed for a very long time [1].
I don't know another solution than not allowing untrusted users to set `__gc`.

All sandboxes I know about (including those implemented in C) that do and
don't do something very violent like spawning a thread and killing it after some
time when unresponsive are somehow vulnerable to this.


-- 
Pierre Chapuis

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

pocomane
In reply to this post by Soni L.
On Tue, Nov 27, 2018 at 10:30 AM Soni L. <[hidden email]> wrote:
>
> I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

How a recursive/renetrant GC can solve this issue? The first example I
can think of a "__gc metamethod that loops forever and can't be
broken" is:

```
local x = {}
setmetatable({},{__gc=fucntion()
  while true do
    x[#x+1] = {} -- Not necessary, but funny
  end
end}
```

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Roberto Ierusalimschy
In reply to this post by Soni L.
> The another solution is a reentrant/recursive GC, as far as I know.

I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)

And I do not see how this feature would solve the problem of
non-terminating finalizers set by bad actors.

Moreover, finalizers set inside a sandbox can get to run outside the
sandbox, so finalizers set by bad actors seem to have many other
problems besides non termination. The best solution here is, as
Pierre pointed out, not to allow setting __gc inside sandboxes.
(In general, both 'getmetatable' and 'setmetatable' should not be
available in sandboxes.)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Soni L.


On Tue, Nov 27, 2018, 09:33 Roberto Ierusalimschy <[hidden email] wrote:
> The another solution is a reentrant/recursive GC, as far as I know.

I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)

It disables debug hooks.


And I do not see how this feature would solve the problem of
non-terminating finalizers set by bad actors.

It would no longer disable debug hooks.


Moreover, finalizers set inside a sandbox can get to run outside the
sandbox, so finalizers set by bad actors seem to have many other
problems besides non termination. The best solution here is, as
Pierre pointed out, not to allow setting __gc inside sandboxes.
(In general, both 'getmetatable' and 'setmetatable' should not be
available in sandboxes.)

How would they run outside the sandbox?


-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Roberto Ierusalimschy
> > I still fail to see your point. The GC in Lua already is
> > "reentrant/recursive": you can freely call a garbage collection inside
> > a __gc metamethod. (Although, as others pointed out, this is smelly.)
>
> It disables debug hooks.

The disabling of debug hooks has nothing to do with being reentrant (at
least for I understand as reentrant). This is mainly to avoid confusion
when debugging, but maybe it would be better to allow hooks during
finalizers. The implementation can work both ways, without other
modifications than the setting of "allowhook'.

(Similarly, a finalizer stops the GC only to avoid confusion, such
as another finalizer being called during the finalizer. There is
nothing in the code that demands those restrictions.)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Soni L.
Hmm...

I think it would be quite good to allow debugging of finalizers by default. A call/return hook can quite easily detect them and partially disable itself if desired.

I've had to debug finalizers before, and it sucked, altho it's not as annoying as the effects no-debug-gc has on sandboxing.

On Tue, Nov 27, 2018, 12:01 Roberto Ierusalimschy <[hidden email] wrote:
> > I still fail to see your point. The GC in Lua already is
> > "reentrant/recursive": you can freely call a garbage collection inside
> > a __gc metamethod. (Although, as others pointed out, this is smelly.)
>
> It disables debug hooks.

The disabling of debug hooks has nothing to do with being reentrant (at
least for I understand as reentrant). This is mainly to avoid confusion
when debugging, but maybe it would be better to allow hooks during
finalizers. The implementation can work both ways, without other
modifications than the setting of "allowhook'.

(Similarly, a finalizer stops the GC only to avoid confusion, such
as another finalizer being called during the finalizer. There is
nothing in the code that demands those restrictions.)

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
In reply to this post by pocomane
This example is "blocking" all threads, but does not use recursion or reentrance.

The real problematic example is this one:

  local mt;
  mt = {__gc=function(o)  setmetatable(o,mt) end}
  do
    local x={}
    setmetatable(x, mt)
  end

After the "end", "x" should be finalized but will never be finalized and we'll enter an infinite loop of gc cycles where "x" will be always present in the finalization list, so it will remain resident as if it was static. Now imagine the effect of:

  while true do
    local x={}
    setmetatable(x, mt)
  end

It will allocate an infinite number of new objects that will never be finalized. This thread will exhaust rapidly all resources allowed by the OS for the process, possibly allocating gigabytes or more that will never be used and will constantly grow the finalization list. The CPU will be running at 100%, with lot of paging to disk, causing severe problems to the host OS if the Lua host process is not given hard limits using the OS API (if not, the OS will finally hang and probably crash: this gives then a successful DOS attack). But if we can force a __gc to not finalize the object immediately by putting it in a "delay" list that will be marked as noit finalizable and will remain reachable, we can limit the frequency for finalization of these objects.

We still need a way to restrict the resources allocated by each thread/coroutine (and that's why a suggest a new "__alloc" metamethod for all Lua objects (and especially for threads created from a coordinating parent thread), so they cannot exhaust what is allowed and cannot freeze/crash the other threads or their host process or the whole host OS.


Le mar. 27 nov. 2018 à 12:33, pocomane <[hidden email]> a écrit :
On Tue, Nov 27, 2018 at 10:30 AM Soni L. <[hidden email]> wrote:
>
> I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.

How a recursive/renetrant GC can solve this issue? The first example I
can think of a "__gc metamethod that loops forever and can't be
broken" is:

```
local x = {}
setmetatable({},{__gc=fucntion()
  while true do
    x[#x+1] = {} -- Not necessary, but funny
  end
end}
```

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Tim Hill

> On Nov 27, 2018, at 4:12 PM, Philippe Verdy <[hidden email]> wrote:
>
> This example is "blocking" all threads, but does not use recursion or reentrance.
>
>
>   while true do
>     local x={}
>     setmetatable(x, mt)
>   end
>
> It will allocate an infinite number of new objects that will never be finalized. This thread will exhaust rapidly all resources allowed by the OS for the process, possibly allocating gigabytes or more that will never be used and will constantly grow the finalization list. The CPU will be running at 100%, with lot of paging to disk, causing severe problems to the host OS if the Lua host process is not given hard limits using the OS API (if not, the OS will finally hang and probably crash: this gives then a successful DOS attack). But if we can force a __gc to not finalize the object immediately by putting it in a "delay" list that will be marked as noit finalizable and will remain reachable, we can limit the frequency for finalization of these objects.

  t = {}
   while true do
       table.insert(t, “foo”)
   end

So does this code. So do any number of similar run-away blocks of code. What is so special about your example that it should be prevented, when simple examples like mine can do the same thing?

—Tim


Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

pocomane
In reply to this post by Philippe Verdy
On Wed, Nov 28, 2018 at 1:13 AM Philippe Verdy <[hidden email]> wrote:
>
> This example is "blocking" all threads, but does not use recursion or reentrance.

This is EXACTLY my point. In other words, I wanted to highlight that
the issue pointed out by the OP (at least as far a I can undestand)
can arise regardless of recursion or reentrance.

By the way, Roberto already replayed with a clearer post :)

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
In reply to this post by Tim Hill
There's a fundamental difference even if the effect looks the same: the loop is not supposed to leak memory, but nothing is garbage collected. and most CPU time will be used by the garbage collector trying to finalize more and more objects without success, not in the Lua program itself but within the Lua virtual machine itself becoming completely out of control and probably crashing even before it can use the panic function.

This should not occur if there was a way to set safe limits to the allocations a "thread" (in fact just a coroutine) can do: only the single thread would run out of limit, would be affected by error() or nil objects being allocated to them, the other threads would run unaffected. No panic event would occur. And a parent thread could detect these situations, and terminate only that offending thread and could take other preventive measures (by being able to detect the origin of this thread).

This is something needed for Lua programs that run threads to implement a web service, or simply to be able (like in Javascript) to use Lua as an hypervisor to virtualize a complete OS: there's a need to be able to track and restrict resource usages in each thread (this is also needed to protect against Meltdown-like time-based attacks targetting the Lua's allocator or garbage collector whose execution time becomes extremely high and thus easily exploitable to create very effective "spying" side-channels).


Le mer. 28 nov. 2018 à 02:37, Tim Hill <[hidden email]> a écrit :

> On Nov 27, 2018, at 4:12 PM, Philippe Verdy <[hidden email]> wrote:
>
> This example is "blocking" all threads, but does not use recursion or reentrance.
>
>
>   while true do
>     local x={}
>     setmetatable(x, mt)
>   end
>
> It will allocate an infinite number of new objects that will never be finalized. This thread will exhaust rapidly all resources allowed by the OS for the process, possibly allocating gigabytes or more that will never be used and will constantly grow the finalization list. The CPU will be running at 100%, with lot of paging to disk, causing severe problems to the host OS if the Lua host process is not given hard limits using the OS API (if not, the OS will finally hang and probably crash: this gives then a successful DOS attack). But if we can force a __gc to not finalize the object immediately by putting it in a "delay" list that will be marked as noit finalizable and will remain reachable, we can limit the frequency for finalization of these objects.

  t = {}
   while true do
       table.insert(t, “foo”)
   end

So does this code. So do any number of similar run-away blocks of code. What is so special about your example that it should be prevented, when simple examples like mine can do the same thing?

—Tim

12