Question about the major gc

重归混沌
Hello.

I have been studying Lua's generational GC recently.

It seems that:

After enough `youngcollection` cycles, Lua will execute `fullgen`.

If there is a huge number of OLD objects, `fullgen` will stop the world (STW)
for a long time.


I'm curious why Lua does not switch to incremental mode instead of executing
`fullgen`.


Please forgive me if I have made any mistakes.

Re: Question about the major gc

Roberto Ierusalimschy
> It seems that:
>
> After enough `youngcollection` cycles, Lua will execute `fullgen`.
>
> If there is a huge number of OLD objects, `fullgen` will stop the world (STW)
> for a long time.
>
>
> I'm curious why Lua does not switch to incremental mode instead of executing
> `fullgen`.

If it switches to incremental mode, it will never switch back.
The idea of this behavior is to try to adapt the collector
temporarily. (I am not sure we will keep it that way.)

-- Roberto
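
For readers who want to observe the pause discussed above, here is a minimal sketch, assuming Lua 5.4 (where `collectgarbage` accepts the "generational" and "incremental" options and returns the previous mode). The table `keep` and the object count are illustrative assumptions, not anything taken from this thread.

```lua
-- Assumes Lua 5.4. Switch to generational mode; the call returns the
-- name of the mode that was active before.
local previous = collectgarbage("generational")
print("previous mode:", previous)

-- Build long-lived data so that a full collection has many old objects
-- to traverse. The count is arbitrary.
local keep = {}
for i = 1, 100000 do
  keep[i] = { i }
end

-- Time one forced full collection (stop-the-world from the caller's view).
local t0 = os.clock()
collectgarbage("collect")
local elapsed = os.clock() - t0
print(("full collection: %.3f s, heap: %.0f KB"):format(
  elapsed, collectgarbage("count")))

-- Switch back to incremental mode explicitly; the return value reports
-- the mode that was in effect until now.
local mode = collectgarbage("incremental")
print("mode before switching back:", mode)
```

Growing the count in `keep` should make the reported time grow as well, which is the effect the original question is about.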

Re: Question about the major gc

Matthew Wild
On Wed, 15 Jul 2020, 19:06 Roberto Ierusalimschy, <[hidden email]> wrote:
> The idea of this behavior is to try to adapt the collector
> temporarily. (I am not sure we will keep it that way.)

I'm curious about this last part, do you have specific changes in mind?

I'm asking because the GC is currently our main pain point with Lua in a long-running application. Tweaking the default parameters helped somewhat in previous versions, but we've not yet tried 5.4 in production (maybe after .1 ;) ).

I'm excited to try the new generational mode but would be keen to know of any open issues (such as reported here) or any planned improvements.

Regards,
Matthew
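
As an illustration of the parameter tweaking mentioned above, here is a minimal sketch, assuming Lua 5.4's `collectgarbage` options. The numbers are arbitrary examples, not recommendations; per the 5.4 manual, a zero leaves a parameter unchanged.

```lua
-- Assumes Lua 5.4. All values below are illustrative only.

-- Generational mode: minor multiplier 10, major multiplier 50
-- (the manual lists 20 and 100 as the defaults).
collectgarbage("generational", 10, 50)

-- Incremental mode: pause 150, step multiplier 200, and 0 to keep the
-- current step size. The call returns the mode that was active before.
local prev = collectgarbage("incremental", 150, 200, 0)
print("mode before this call:", prev)
```

Any real setting would have to be chosen by measuring the application in question.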

Re: Question about the major gc

Roberto Ierusalimschy
> > The idea of this behavior is to try to adapt the collector
> > temporarily. (I am not sure we will keep it that way.)
>
> I'm curious about this last part, do you have specific changes in mind?

Maybe remove this particular behavior, or else change its parameters.

The idea behind this behavior is that it is somewhat expensive to change
from generational to incremental. So, if it seems likely that a full
collection will be followed by another full collection, it may be wise
not to change the collector back to generational mode between these two
collections. If the collector knew the future, everything would be much
easier :-) As it does not, its guesses sometimes may make things worse.

-- Roberto
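
The guess described above can be pictured with a toy model. This is only a sketch: the function name `choose_mode`, its arguments, and the `shrink_ratio` threshold are all hypothetical and are not taken from Lua's source code.

```lua
-- Toy model only. Names and threshold are hypothetical, NOT Lua's lgc.c.
-- previous_live / current_live: live objects seen by the last two major
-- collections. shrink_ratio: how much smaller the heap must get before
-- we guess that the next collection can be a cheap young one again.
local function choose_mode(previous_live, current_live, shrink_ratio)
  if current_live <= previous_live * shrink_ratio then
    -- The old population shrank a lot: guess that major collections are
    -- over for now and go back to generational mode.
    return "generational"
  end
  -- Otherwise another full collection looks likely, so stay where we are
  -- and avoid paying for the mode switch twice.
  return "incremental"
end

print(choose_mode(1000, 400, 0.5))  -- generational
print(choose_mode(1000, 900, 0.5))  -- incremental
```

Like the real collector, the model cannot know the future: whichever threshold is picked, some workloads will make it guess wrong.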

Re: Question about the major gc

Philippe Verdy-2
Generational GC is good, but it is not restricted to managing a single "young" and a single "old" generation. You can have several layers of "old" generations, and then perform incremental GC and compaction inside the younger "old" generations. With just two layers of old generations, this generally solves the problem, because the youngest old generation will still change less often. Once a younger old generation has been processed, its data is moved to the lower layer for older generations (from which it can still be resurrected to the young generation instantly, possibly by splitting the older block into several subblocks in the immediately younger old generation and one subblock in the youngest active generation).
A full GC will be needed more rarely: it will only occur in the oldest generation layer, once all incremental GCs on the younger generations have failed to free the space needed. And since several layers of old generations can be collected incrementally in the background, the full GC for the oldest generation can process them faster.
The full GC is only costly if it has to work not just on the old generations but also on the young one, when this youngest generation is heavily populated and has never been incrementally promoted to some older generation. This is where the application can "freeze" for a long time. The full GC should first make every attempt to move the youngest generation into a young old generation, to unblock the application quickly, and then delegate the rest of the task to the incremental GC of the young and old generations.

Note that incremental GC does not work very well if the application does not create coroutines for its work and does not suspend itself to allow the incremental GC to run in its own coroutines. But the memory allocator used by the main thread could still yield to a coroutine to allow the GC to run, and the GC could itself yield where needed to other incremental GCs for older generations. It is the GC that will resume the yielder, returning the memory it requested. The same could be used for "to-close" finalizers freeing their objects: they should yield to the GC, which will decide whether to start another incremental collection or yield to a lower layer of old-generation GC.

And in the main loop of the application, there should be some guarantee that it will yield to the GC it uses. This is simple to ensure if the main loop allocates or frees objects of ANY datatype, such as building small tables for new events or states. Even a single-threaded application not using coroutines will allocate and free many objects: this is where the coroutines for the GC can run concurrently. The exception is an application that continuously runs on the same set of variables and tables without creating new ones, in which case it will never be "interrupted" by the GC during these long runs. That case is very rare, and in fact it needs no GC at all, since all memory requirements are checked up front before such a long run starts.
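
In stock Lua the collector is not driven through coroutines, but a main loop can hand it time explicitly. Here is a minimal sketch, assuming Lua 5.4's `collectgarbage("step")`; `handle_event` and the loop bound are hypothetical placeholders for real application work.

```lua
-- Assumes Lua 5.4. handle_event is a hypothetical stand-in for real work
-- that allocates small, short-lived tables.
local function handle_event(id)
  return { id = id, payload = ("x"):rep(64) }
end

local finished_cycles = 0
for id = 1, 1000 do
  handle_event(id)
  -- Give the collector one basic step at a point the application chooses.
  -- The call returns true when the step finished a collection cycle.
  if collectgarbage("step", 0) then
    finished_cycles = finished_cycles + 1
  end
end
print("collection cycles finished:", finished_cycles)
```

The point of the design is that the pause lands where the application can afford it, for example between two events, rather than inside an arbitrary allocation.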



On Tue, 21 Jul 2020 at 20:21, Roberto Ierusalimschy <[hidden email]> wrote:
> > > The idea of this behavior is to try to adapt the collector
> > > temporarily. (I am not sure we will keep it that way.)
> >
> > I'm curious about this last part, do you have specific changes in mind?
>
> Maybe remove this particular behavior, or else change its parameters.
>
> The idea behind this behavior is that it is somewhat expensive to change
> from generational to incremental. So, if it seems likely that a full
> collection will be followed by another full collection, it may be wise
> not to change the collector back to generational mode between these two
> collections. If the collector knew the future, everything would be much
> easier :-) As it does not, its guesses sometimes may make things worse.
>
> -- Roberto