Re: Change in GC behaviour -- Ever-increasing memory usage

Pedro Tammela-2
> > I came across a supposed leak[1,2] with some C code. I think I distilled
> > the behaviour that causes problems into the attached Lua program and
> > would now like to get some input on "which code is to blame".
> >
> > I noticed that the problem occurs with Lua 5.3, but does not occur with
> > Lua 5.1 or 5.2. Thus, I could do a Git bisect based on the GitHub mirror
> > of the Lua source code[3]. The result is that commit [4] introduced the
> > problem.
> >
> > My main question now is: What to do about this?
> >
> >
> > Some more information about the program: The library in question
> > provides bindings to GObject-based C libraries. To support callbacks, it
> > has to luaL_ref() the callback function and it will then luaL_unref()
> > later from the __gc metamethod of one of the objects involved.
> >
> > This is just what the attached program does: In a tight loop, it creates
> > a table, luaL_ref()s it and arranges for a call to luaL_unref() later.
> > Apparently the code can now create references faster than they are released.
>
> Lua controls the pace of its GC like this: during each cycle, it tries to
> estimate how much memory it is actually using. Then, after it completes
> the cycle, it waits for its memory use to grow by some percentage
> (typically 100%) before starting a new cycle.
>
> Because the program uses finalizers to release most of its memory, it
> needs two GC cycles to collect that memory. For objects with finalizers,
> Lua knows about that, and therefore these objects are not counted in
> that computation (how much memory is actually in use).
> However, Lua knows nothing about references. For Lua, all objects
> in the registry table are non-garbage. So, by the end of a GC cycle,
> it looks like it is using all that memory, so it waits a little longer
> for the next cycle. With this waiting, more objects will be in the
> registry by the end of the next cycle, and there is a vicious cycle.
>
> In my tests, Lua 5.1 presented the same problem. I am not sure why
> Lua 5.2 does not.
>
> -- Roberto
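
The pattern described in the quoted report (take a reference, then release it later from a __gc metamethod) can be approximated in pure Lua. This is only a sketch under assumptions: a plain table called refs stands in for the registry, the names ref_churn and guard_mt are made up for illustration, and it needs Lua 5.2 or later because it sets __gc on tables. It is not the program attached to the original report.

```lua
-- Sketch: pure-Lua stand-in for "luaL_ref now, luaL_unref from __gc later".
-- Assumes Lua 5.2+ (tables may have finalizers). All names are hypothetical.
local function ref_churn(iterations)
  local refs = {}          -- strong table playing the role of the registry
  local next_ref = 0
  local live, peak = 0, 0  -- entries currently held in 'refs', and the maximum seen

  local guard_mt = {
    __gc = function(guard)   -- plays the role of luaL_unref()
      refs[guard.ref] = nil
      live = live - 1
    end
  }

  for _ = 1, iterations do
    next_ref = next_ref + 1
    refs[next_ref] = {}      -- plays the role of luaL_ref(): pins a fresh table
    live = live + 1
    if live > peak then peak = live end
    -- The guard is unreachable immediately, but its entry in 'refs' is only
    -- released once the collector gets around to running the finalizer.
    setmetatable({ ref = next_ref }, guard_mt)
  end
  return refs, peak
end

local _, peak = ref_churn(200000)
print("most entries pinned at once:", peak)
```

The printed number depends on the Lua version and on the GC parameters in effect, so it is best read as a way to compare builds rather than as a fixed result.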

I believe this issue happens in 'normal' tables as well.

I've managed to reproduce this issue in both Lua 5.3 and Lua 5.4-work2
and could isolate it in a simple C program. The memory allocation
patterns in the two versions are quite different, probably thanks to the
new GC in 5.4.

This code [1] will grow the memory steadily and will exhaust the
computer memory given enough iterations.

On my 8 GB machine, this happened in 11 seconds when compiled with 5.3.

[2][3] both show the memory allocation pattern using massif from valgrind.

[1] https://pastebin.com/KCsHduKB
[2] https://pastebin.com/p2b6NE0Q
[3] https://pastebin.com/3hEkk2j9


Re: Change in GC behaviour -- Ever-increasing memory usage

Sergey Zakharchenko
Hello Pedro,

Pedro Tammela <[hidden email]>:
> I believe this issue happens in 'normal' tables as well.
> This code [1] will grow the memory steadily and will exhaust the
> [1] https://pastebin.com/KCsHduKB

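for (i = 0; i < (1UL << 24); i++) {
   void *ptr;
   ptr = lua_newuserdata(L, 4096);  /* push a new 4096-byte userdata */
   luaL_setmetatable(L, "mt");      /* give it the metatable registered as "mt" */
   ref = luaL_ref(L, 1);            /* reference it in the table at index 1 */
   luaL_unref(L, 1, ref);           /* release that reference again */
}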

I may be misunderstanding but ... Don't you just keep creating
userdata and filling up the Lua stack? There's no lua_pop or similar.
luaL_unref [-0, +0, -] only unrefs the object, but doesn't pop it.

Best regards,

--
DoubleF


Re: Change in GC behaviour -- Ever-increasing memory usage

Gabriel de Quadros Ligneul
> I may be misunderstanding but ... Don't you just keep creating
> userdata and filling up the Lua stack? There's no lua_pop or similar.
> luaL_unref [-0, +0, -] only unrefs the object, but doesn't pop it.

luaL_ref pops the userdata from the stack.

Re: Change in GC behaviour -- Ever-increasing memory usage

Pedro Tammela-2
In reply to this post by Pedro Tammela-2
> I may be misunderstanding but ... Don't you just keep creating userdata and filling up the Lua stack?

Hi, from the 5.3 manual for luaL_ref:

"Creates and returns a reference, in the table at index t, for the object at the top of the stack (and pops the object)."

Re: Change in GC behaviour -- Ever-increasing memory usage

Sergey Zakharchenko
In reply to this post by Gabriel de Quadros Ligneul

> > I may be misunderstanding but ... Don't you just keep creating
> > userdata and filling up the Lua stack? There's no lua_pop or similar.
> > luaL_unref [-0, +0, -] only unrefs the object, but doesn't pop it.
>
> luaL_ref pops the userdata from the stack.

Ah, right, sorry for the noise.

Best regards,

--
DoubleF


Re: Change in GC behaviour -- Ever-increasing memory usage

Pedro Tammela-2
In reply to this post by Pedro Tammela-2
> I believe this issue happens in 'normal' tables as well.
>
> I've managed to reproduce this issue in both Lua 5.3 and Lua 5.4-work2
> and could isolate it in a simple C program. The memory allocation
> patterns in the two versions are quite different, probably thanks to the
> new GC in 5.4.
>
> This code [1] will grow the memory steadily and will exhaust the
> computer memory given enough iterations.
>
> On my 8 GB machine, this happened in 11 seconds when compiled with 5.3.
>
> [2][3] both show the memory allocation pattern using massif from valgrind.
>
> [1] https://pastebin.com/KCsHduKB
> [2] https://pastebin.com/p2b6NE0Q
> [3] https://pastebin.com/3hEkk2j9

Playing a little more with the issue above, I stumbled upon an
interesting result. It seems that references to strings do not
reproduce the issue. The code is shown in [1]. Valgrind's massif
reports the expected result, constant memory usage [2].

[1] https://pastebin.com/C9p3EVfv
[2] https://pastebin.com/4Tdg4jUg


Re: Change in GC behaviour -- Ever-increasing memory usage

Gabriel de Quadros Ligneul
I reproduced the ever-increasing memory behavior with a simple script.

local mt = {
        __gc = function() end
}
for i = 1, math.pow(10, 9) do
        setmetatable({}, mt)
        if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
end

In Lua 5.3.5, the memory grows to 1GB within a few seconds.
In Lua 5.2 and 5.4, it still grows to a few MB, but very slowly.
Setting the GC pause to 100 or less seems to solve this issue. Is this the correct solution?

--
Gabriel
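
For anyone who wants to compare pause settings on their own build, the script above can be bounded and wrapped so that it samples memory for a given pause value. This is a sketch, not a recommendation: the helper name churn, the iteration count, and the sampling interval are arbitrary choices, and the "setpause" option is assumed to be available (it exists in Lua 5.1 through 5.4, where 5.4 marks it as deprecated).

```lua
-- Run a bounded version of the loop and return the highest value of
-- collectgarbage("count") (in KB) that was sampled along the way.
-- 'churn' is a hypothetical helper name; the constants are arbitrary.
local function churn(iterations, pause)
  local previous = collectgarbage("setpause", pause)  -- returns the old pause
  local mt = { __gc = function() end }
  local peak = 0
  for i = 1, iterations do
    setmetatable({}, mt)              -- garbage table that carries a finalizer
    if i % 10000 == 0 then
      local kb = collectgarbage("count")
      if kb > peak then peak = kb end
    end
  end
  collectgarbage("setpause", previous)  -- restore the caller's setting
  return peak
end

for _, pause in ipairs({ 200, 100 }) do
  collectgarbage()  -- start each run from a freshly collected heap
  print(string.format("pause=%d  peak=%.0f KB", pause, churn(1000000, pause)))
end
```

The absolute numbers vary between Lua versions, so the two lines are only meaningful relative to each other on the same interpreter.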

Re: Change in GC behaviour -- Ever-increasing memory usage

Philippe Verdy
I don't understand why this line:
  setmetatable({}, mt)
needs to create a new table instance each time for "{}", given that this table constructor is static code which does not depend on any context, variable, or upvalue and will always evaluate the same way. Why can't the code generated for the empty table constructor return the same table on each iteration, i.e. treat the table instantiation as something that can be hoisted out of the loop, as if the code were:

local mt = {
        __gc = function() end
}
local _t = {}
for i = 1, math.pow(10, 9) do
        setmetatable(_t, mt)
        if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
end

OK, this table instantiation modifies one of the table's internal fields by calling setmetatable() on it, but the modified table object is not used anywhere else in the loop, so the setmetatable() call can also be hoisted out of the loop:

local mt = {
        __gc = function() end
}
local _t = {}
setmetatable(_t, mt)
for i = 1, math.pow(10, 9) do
        if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
end

And then _t has no other use and can be eliminated as dead code, giving:

local mt = {
        __gc = function() end
}
for i = 1, math.pow(10, 9) do
        if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
end

And finally the metatable itself is also now unused and eliminated, so the code should behave exactly like:

for i = 1, math.pow(10, 9) do
        if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
end

Evidently the Lua compiler never tracks whether an object is actually referenced after it has been created or modified; it leaves the allocation in place, hoping that it will be garbage-collected later when it goes out of scope.

However, in the initial code, "mt" never goes out of scope before the billion iterations have finished; only the empty table created at each iteration goes out of scope just before the next iteration, and it gets "forcibly" garbage-collected only every million iterations, by the embedded "if" test on the value of i. Actually it is just marked, in the hope that it will be collected in the next recycling generation, which almost never occurs because we force a mark phase and never get a chance to sweep: there is no change of scope or stack during that loop, except for the occasional calls to print() and collectgarbage().

So the allocation rapidly gets out of control: we allocate millions of objects that are not used and are given no chance of being swept (they are only marked, but too rarely for the actual sweep phase to occur, because we never cross a threshold with enough marks).

The GC in standard Lua still has major defects; it does not work as it should. Unfortunately, to do better there would have to be an actual thread, not a yielded task, running in the background. Ideally the GC should work concurrently without needing any collaboration from other competing threads.

But Lua is still not designed to use actual threads (at least POSIX-compliant threads) and still works with explicit changes of context by yield/resume between concurrent but cooperating threads, and Lua still lacks a mechanism to allow actual threads to synchronize on critical objects like the GC. This is a major difference from JavaScript, Java or .NET, whose VM implementations include support for preemptive threads and all the needed synchronization.

But except for small devices running specific low-performance tasks in Lua, or sandboxed Lua instances running in separate single-threaded workers per client in web servers, I don't see why we cannot use an existing POSIX thread or native OS thread, at least for some critical system-level threads (notably the GC, which should preempt every other thread with high priority to preserve resource quotas). The GC itself is not even written in Lua; it is part of the VM and should use the native LuaC library, providing only proxy methods for use in applications written in 100% pure Lua. All systems (now including small IoT devices) have native threads, even if they don't necessarily have multiprocess kernels for isolation. But the GC and the application necessarily work in the same memory space, so such isolation is not necessary: true multithreading is always possible, even on the smallest single-core CPUs. It works on the smallest microcontrollers, and it worked on the first personal computers of the 1980s, with 8-bit processors and very small amounts of RAM (even just a few kilobytes). It needed nothing from the underlying OS, only access to a single timer interrupt, whose handler could even be written in a high-level language like C or Pascal plus about a dozen assembly instructions.

Why isn't there in Lua a basic multithreading engine capable of running one user thread and one or several threads for the VM itself: a GC thread, a system call thread for integration and for managing external resources, and a watchdog thread that would detect a hung or blocked user thread and kill the whole thing, terminating the VM instance itself and freeing the OS-level handles to other system resources? Everything else would then remain written in Lua, compiled and running in a single thread, including the cooperating yield/resume threads, which would actually run in the same preemptible low-priority thread, separately from the critical threads, which would be the only ones capable of preempting the user-level Lua thread.

The minimum set of threads would use 4 priority levels:
- level 0: the main thread for initialization of the VM and running the "idle" watchdog loop.
- level 1: the system call integration thread (used indirectly by the GC thread to free memory, or by the user-level Lua thread to allocate memory)
- level 2: the GC thread
- level 3: the user Lua thread where all Lua collaborative threads are running, using yield/resume or blocking I/O and system calls.
The timer interrupt would run in the context of level 0 and would set a new time limit for the idle/watchdog loop, before handing control to the syscall thread if it is polling for syscall completion, otherwise to the GC thread.
The syscall thread would perform the pending tasks and check their completion, but if it must block, it should hand control to the watchdog idle loop (which will only be preempted by the timer interrupt).
The GC thread would perform mark/sweep pages, making some blocking calls to the level 1 syscall thread to manage reclaimed memory.
All the rest would happen in level 3, within pure Lua code or using the standard Lua library (excluding the LuaC library, which would remain inaccessible and not exposed directly without a "proxying" blocking gate call to the lower levels; this proxy would be implemented only in the standard Lua library).


Re: Change in GC behaviour -- Ever-increasing memory usage

Gabriel de Quadros Ligneul
> Why can't that static code generated for the new empty table constructor return the same table on each iteration

That is not the point; this is the smallest snippet I came up with that reproduces the odd behavior described in the thread.
The simplification you described doesn't apply to my full program (which isn't this snippet).

--
Gabriel

Re: Change in GC behaviour -- Ever-increasing memory usage

Roberto Ierusalimschy
In reply to this post by Gabriel de Quadros Ligneul
> I reproduced the ever-increasing memory behavior with a simple script.
>
> local mt = {
>         __gc = function() end
> }
> for i = 1, math.pow(10, 9) do
>         setmetatable({}, mt)
>         if (i % math.pow(10, 6)) == 0 then print(collectgarbage'count') end
> end
>
> In Lua 5.3.5, the memory grows to 1GB within a few seconds.
> In Lua 5.2 and 5.4, it still grows to a few MB, but very slowly.
> Setting the GC pause to 100 or less seems to solve this issue. Is this the
> correct solution?

I tested with the current 5.4 version on GitHub. It oscillates around
40~70 kilobytes. The maximum grows slowly to a peak of ~132K (after
243426 new tables), and then it stops growing.

But it seems to be a "fact of life" that programs that create too
many objects with finalizers are subject to garbage-collection
issues, not only in Lua.

-- Roberto
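
One reason finalizers are costly, mentioned earlier in the thread, is that an object with a finalizer needs two collection cycles before its memory is returned. The sketch below tries to make that visible. It assumes Lua 5.2 or later (tables with __gc) and relies on the reference manual's rule that a finalized object stays as a key in a weak-keyed table until the collection after its finalizer has run. The names two_cycle_demo and watch are made up for illustration.

```lua
-- Sketch: observe that an object with a finalizer survives one extra cycle.
-- Assumes Lua 5.2+; 'two_cycle_demo' and 'watch' are hypothetical names.
local function two_cycle_demo()
  local log = {}
  local finalized = false
  local watch = setmetatable({}, { __mode = "k" })  -- weak keys: observer only

  local function make()
    local obj = setmetatable({}, { __gc = function() finalized = true end })
    watch[obj] = true   -- does not keep obj alive, only lets us see it
  end
  make()                -- obj is unreachable once make() returns

  for cycle = 1, 2 do
    collectgarbage()    -- one full collection cycle
    log[cycle] = {
      finalized = finalized,
      still_allocated = next(watch) ~= nil,
    }
  end
  return log
end

for cycle, entry in ipairs(two_cycle_demo()) do
  print(cycle, "finalized=" .. tostring(entry.finalized),
        "still_allocated=" .. tostring(entry.still_allocated))
end
```

Going by the manual's description of weak tables, the first line should report the finalizer as having run while the object is still allocated, and the second should report the object as gone.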