Collection of arguments on a function called with the C API

André Henriques
Hi everyone!

I am working on code that calls a Lua function with an object argument through the C API.

When the Lua function body is empty, the program consumes a lot of memory:

function callback(object) end

However, if I add a variable declaration, the memory usage is low throughout the execution:

function callback(object)
    local _ = {}
end

I would like to know if this is expected and why.

I made a test program to measure memory consumption depending on some parameters:

usage: ./test dummy_table_bool test_iterations userdata_size

Results:

./test 0 5000000 8
lua memory usage = 186.95 MB
alive objects = 1378824

./test 1 5000000 8
lua memory usage = 0.05 MB
alive objects = 36

./test 0 5000000 1024
lua memory usage = 178.31 MB
alive objects = 173103

./test 1 5000000 1024
lua memory usage = 0.68 MB
alive objects = 240

Code:

#include <stdio.h>
#include <stdlib.h>
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

long long aliveObjectsCount = 0;

int destructObject(lua_State *luaState) {
    --aliveObjectsCount;
    return 0;
}

void registerObjectMetatable(lua_State *luaState) {
    luaL_newmetatable(luaState, "Object");
    luaL_Reg functions[] = {{"__gc", destructObject}, {NULL, NULL}};
    luaL_setfuncs(luaState, functions, 0);
    lua_pop(luaState, 1); /* luaL_newmetatable left the metatable on the stack */
}

void pushObject(lua_State *luaState, size_t userDataSize) {
    lua_newuserdata(luaState, userDataSize);
    luaL_setmetatable(luaState, "Object");
    ++aliveObjectsCount;
}

void test(lua_State *luaState, unsigned testIterationsCount, size_t userDataSize) {
    for (unsigned i = 0; i < testIterationsCount; ++i) {
        lua_getglobal(luaState, "callback");
        pushObject(luaState, userDataSize);
        lua_call(luaState, 1, 0);
    }
}

int main(int argc, char *argv[]) {
    if (argc != 4) {
        puts("parameters: dummy_table_bool test_iterations userdata_size ");
        return EXIT_FAILURE;
    }
    const int withDummyTable = atol(argv[1]);
    const unsigned testIterationsCount = atol(argv[2]);
    const size_t userDataSize = atol(argv[3]);
    lua_State *luaState = luaL_newstate();
    luaL_openlibs(luaState);
    registerObjectMetatable(luaState);
    if (withDummyTable) {
        luaL_dostring(luaState, "function callback(object) local _ = {} end");
    } else {
        luaL_dostring(luaState, "function callback(object) end");
    }
    test(luaState, testIterationsCount, userDataSize);
    printf("lua memory usage = %.2f MB\n", lua_gc(luaState, LUA_GCCOUNT)/1024.0);
    printf("alive objects = %lld\n", aliveObjectsCount);
    lua_close(luaState);
    return EXIT_SUCCESS;
}

André Henriques

Re: Collection of arguments on a function called with the C API

彭 书呆
On 2020/11/28 2:44, André Henriques wrote:

> Hi everyone!
>
> I am working on code that calls a Lua function with an object argument through the C API.
>
> When the Lua function body is empty, the program consumes a lot of memory:
>
> function callback(object) end
>
> However, if I add a variable declaration, the memory usage is low throughout the execution:
>
> function callback(object)
>     local _ = {}
> end
>
> I would like to know if this is expected and why.
>
> I made a test program to measure memory consumption depending on some parameters:
>
> usage: ./test dummy_table_bool test_iterations userdata_size
>
> Results:
>
> ...
>     if (withDummyTable) {
>         luaL_dostring(luaState, "function callback(object) local _ = {} end");
>     } else {
>         luaL_dostring(luaState, "function callback(object) end");
>     }
> ...
>
> André Henriques
>

Just a wild guess: creating the dummy table triggers a gc cycle, while the empty
callback doesn't allocate anything, so the gc never gets a chance to run.

Try the following to check whether this is the case; this code forces a full gc
after the loop:


```
void test(lua_State *luaState, unsigned testIterationsCount, size_t userDataSize) {
    for (unsigned i = 0; i < testIterationsCount; ++i) {
        lua_getglobal(luaState, "callback");
        pushObject(luaState, userDataSize);
        lua_call(luaState, 1, 0);
    }
    lua_gc(luaState, LUA_GCCOLLECT);
}
```

or alternatively, you may run a single gc step in every iteration to amortize the gc overhead.

Re: Collection of arguments on a function called with the C API

André Henriques
On 2020/12/13 10:39, nerditation wrote:

> Just a wild guess: creating the dummy table triggers a gc cycle, while the empty
> callback doesn't allocate anything, so the gc never gets a chance to run.
>
> Try the following to check whether this is the case; this code forces a full gc
> after the loop:
>
>
> ```
> void test(lua_State *luaState, unsigned testIterationsCount, size_t userDataSize) {
>     for (unsigned i = 0; i < testIterationsCount; ++i) {
>         lua_getglobal(luaState, "callback");
>         pushObject(luaState, userDataSize);
>         lua_call(luaState, 1, 0);
>     }
>     lua_gc(luaState, LUA_GCCOLLECT);
> }
> ```
>
> or alternatively, you may run a single gc step in every iteration to amortize the gc overhead.

Thanks for the attention nerditation!

Indeed, I believe the dummy table triggers a gc cycle.

I tried your ideas. The full gc will collect part of the dead objects as
expected. The gc step (stepsize = userDataSize) on each iteration will keep
memory usage low, although with a performance penalty far greater (about 2x the
time) than using the dummy table without explicitly calling the garbage collector.

I have little knowledge about garbage collectors. Does anyone know why the
allocated function argument objects do not (or barely) affect the gc? Is this
such an edge case that it is not worth dealing with? That is
not a flaw in my opinion, I am asking just out of curiosity =].

André Henriques

Re: Collection of arguments on a function called with the C API

彭 书呆
On 2020/12/16 2:38, André Henriques wrote:
> On 2020/12/13 10:39, nerditation wrote:
> [...]
> I have little knowledge about garbage collectors. Does anyone know why the
> allocated function argument objects do not (or barely) affect the gc? Is this
> such an edge case that it is not worth dealing with? That is
> not a flaw in my opinion, I am asking just out of curiosity =].
>
> André Henriques
>

Well, it turns out the gc is being triggered after all. What you are seeing is an
interesting interaction between your testing methodology and the default gc parameters.

since all the new objects are created with finalizers, the gc earns many "credits"
by running finalizers, but it takes more and more "steps" between gc pauses.
the to-be-finalized object list gets longer and longer for subsequent gc cycles.

Usually, once an application reaches a somewhat steady state, the gc will catch up.
If your application must create many short-lived objects, consider switching
to the generational gc. But you should only tweak the gc based on
measurements done in real scenarios.

btw, per my testing, for your particular test code, simply reducing the gc pause
a little bit (e.g. to 190) keeps the memory usage reasonably low. Don't
ask me how or why; I don't really understand the gc algorithm. I just tinkered
with the code a bit inside a debugger.