Reentrant GC

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Francisco Olarte
Philipe:
On Wed, Nov 28, 2018 at 9:03 AM Philippe Verdy <[hidden email]> wrote:
> There's a fundamental difference even if the effect looks the same: the loop is not supposed to leak memory, but nothing is garbage collected. and most CPU time will be used by the garbage collector trying to finalize more and more objects without success, not in the Lua program itself but within the Lua virtual machine itself becoming completely out of control and probably crashing even before it can use the panic function.

Au contraire. Although your bottom-quoting style hides the code in
question, it can be read as "lets mt be a metatable which forbids
garbage collection, enter an infinite loop setting the metatable of a
local table to said gc-forbiding table". It's clearly supposed to leak
memory, tha same as the table.insert example by tim ( although the
second one does not leak, it consumes memory but it is clearly
reachable and not leaked.

...
> This is something needed for Lua programs that run threads to implement a web service,

NO, it is not. Lots of people run thread oriented web services without this.

Francisco Olarte.

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy


Le mer. 28 nov. 2018 à 12:55, Francisco Olarte <[hidden email]> a écrit :
 This is something needed for Lua programs that run threads to implement a web service,

NO, it is not. Lots of people run thread oriented web services without this.

It is not raisonnable at all to run any web service with threads for servicing each concurrent request that can steal unlimited amounts of resources needed by concurrent threads. You need a way to police them or at least ensure that each thread will get an equitable share of resource, i.e. equal share of time and memory, without also exceeding a maximum allowed quantum so that each thread can be sure to also have a minimum share and perform without failing always.

If the web service dies not do that, it is extremely easy to block with a DOS attack. All reasonable requests should then succeed and other requests exceeding their quota will be the only one to fail as expected.

We need limits always because absolutely no server has infinite amounts of resources. So we always need a way to strictly enforce these quota.

With basic Lua threads, we have absolutely no control in any thread that can deliberately use the allocator as much as they want, when they should fail early without causing damages to other competing threads, including hidden internal threads needed for the maintenance and monitoring of the web service itself, or for enforcing the policy, and perform cleanup of failing threads that have exhausted their allowed quota.

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
So Lua alone is not sufficient: the pseudo threads created by coroutine.create have unlimited time resource if they never yield (because the routines are not preempted at all by any scheduler enforcing the time quota) and they also have unlimited access to the same allocator (there's no equivalent to the Luac function lua_setallocator, and in fact no way to create such allocator in Lua, but we could still have the way to have a __alloc metamethod in thread object types that would be used to measure the memory resources requested by the thread in order to limit them by just returning a Boolean authorizing or not the use of the global allocator.

For time resources, there's no solution with Lua coroutines if there's no preemptive mechanism that would force them to yield after a maximum timeslice quantum... For that purpose, we need TRUE threads plus a scheduler, not just cooperating routines.

Si we can't do that in pure Lua with its existing standard library. We need an extension to threads, or we need to use OS threads to instantiate as many separate Lua engines, using a hos application written in C and using the reentrant luac library...



Le mer. 28 nov. 2018 à 13:19, Philippe Verdy <[hidden email]> a écrit :


Le mer. 28 nov. 2018 à 12:55, Francisco Olarte <[hidden email]> a écrit :
 This is something needed for Lua programs that run threads to implement a web service,

NO, it is not. Lots of people run thread oriented web services without this.

It is not raisonnable at all to run any web service with threads for servicing each concurrent request that can steal unlimited amounts of resources needed by concurrent threads. You need a way to police them or at least ensure that each thread will get an equitable share of resource, i.e. equal share of time and memory, without also exceeding a maximum allowed quantum so that each thread can be sure to also have a minimum share and perform without failing always.

If the web service dies not do that, it is extremely easy to block with a DOS attack. All reasonable requests should then succeed and other requests exceeding their quota will be the only one to fail as expected.

We need limits always because absolutely no server has infinite amounts of resources. So we always need a way to strictly enforce these quota.

With basic Lua threads, we have absolutely no control in any thread that can deliberately use the allocator as much as they want, when they should fail early without causing damages to other competing threads, including hidden internal threads needed for the maintenance and monitoring of the web service itself, or for enforcing the policy, and perform cleanup of failing threads that have exhausted their allowed quota.

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Gé Weijers


On Wed, Nov 28, 2018 at 4:40 AM Philippe Verdy <[hidden email]> wrote:
So Lua alone is not sufficient: the pseudo threads created by coroutine.create have unlimited time resource if they never yield (because the routines are not preempted at all by any scheduler enforcing the time quota) and they also have unlimited access to the same allocator (there's no equivalent to the Luac function lua_setallocator, and in fact no way to create such allocator in Lua, but we could still have the way to have a __alloc metamethod in thread object types that would be used to measure the memory resources requested by the thread in order to limit them by just returning a Boolean authorizing or not the use of the global allocator

The Lua team designed an extension language that's powerful and small, and you want it to be something else, a multithreaded language with advanced scheduling and fine control over memory allocation, and because it's not you complain it's lacking in that respect. That's like buying a small roadster car and then complaining you can't haul building materials with it.

The coroutines in Lua solve a very useful problem: you can code event-driven programs without having to resort to the callback hell you see in Node.js. It's not meant to be used for single programs that can keep a 48 core server busy, although you can run multiple Lua states and figure out a way to make them communicate (Lanes etc.)



Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Tim Hill
In reply to this post by Philippe Verdy


> On Nov 28, 2018, at 4:39 AM, Philippe Verdy <[hidden email]> wrote
> Si we can't do that in pure Lua with its existing standard library. We need an extension to threads, or we need to use OS threads to instantiate as many separate Lua engines, using a hos application written in C and using the reentrant luac library...

Which is exactly what we did on our project .. each Lua VM was event driven from an event queue and each event call was monitored for time via debug hooks and memory usage via a custom allocator. We then built a dispatcher using an OS thread pool to drive the VMs. The fact that we could do all that without a single patch to the Lua source was a testament to its clean design.

And imho that is the point of Lua .. it’s a clean efficient building block you can mold to your needs, not an imposing edifice that must be used in a particular manner.

—Tim
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Pierre Chapuis
On Wed, Nov 28, 2018, at 22:22, Tim Hill wrote:

> Which is exactly what we did on our project .. each Lua VM was event
> driven from an event queue and each event call was monitored for time
> via debug hooks and memory usage via a custom allocator. We then built a
> dispatcher using an OS thread pool to drive the VMs. The fact that we
> could do all that without a single patch to the Lua source was a
> testament to its clean design.

Almost the same here. Sadly we cannot open source it.

>From the point of view the user, it works a bit like Unix:
they can spawn Lua "processes" that run in separate Lua
threads, can respond to events and communicate through
some kind of "IPC".

>From the point of view of the implementer, the execution
interface is abstract, so you can run the processes on top of
an OS thread pool, in an event loop, etc.

>From a debug console we can do things like this:

    > ps
      PID          STATE REF         MEM     MAX_MEM          TIME NAME
        1        RUNNING   2    515.8KiB    636.3KiB       166.7ms init
    > spawn diagnose
    PID=9
    > ps
      PID          STATE REF         MEM     MAX_MEM          TIME NAME
        1        RUNNING   2    530.6KiB    636.3KiB       209.8ms init
        9           IDLE   1    723.5KiB    748.7KiB        10.0us diagnose
    > kill 9
    > ps
      PID          STATE REF         MEM     MAX_MEM          TIME NAME
        1        RUNNING   2    544.3KiB    636.3KiB       214.8ms init

The entire library is a mere 3000 lines of C and 500 lines of Lua,
and it runs on Mac, Linux, Windows, Android and iOS.

I wonder how many companies have re-built basically this,
with all the same tricks to interact with C code, event loops...

--
Pierre Chapuis

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
In reply to this post by Gé Weijers


Le mer. 28 nov. 2018 à 20:30, Gé Weijers <[hidden email]> a écrit :


On Wed, Nov 28, 2018 at 4:40 AM Philippe Verdy <[hidden email]> wrote:
So Lua alone is not sufficient: the pseudo threads created by coroutine.create have unlimited time resource if they never yield (because the routines are not preempted at all by any scheduler enforcing the time quota) and they also have unlimited access to the same allocator (there's no equivalent to the Luac function lua_setallocator, and in fact no way to create such allocator in Lua, but we could still have the way to have a __alloc metamethod in thread object types that would be used to measure the memory resources requested by the thread in order to limit them by just returning a Boolean authorizing or not the use of the global allocator

The Lua team designed an extension language that's powerful and small, and you want it to be something else, a multithreaded language with advanced scheduling and fine control over memory allocation, and because it's not you complain it's lacking in that respect. That's like buying a small roadster car and then complaining you can't haul building materials with it.

That's not what I requested. a full scheduler is not needed in the language, but the language should allow building one entirely written in Lua.

What is needed is a way to control in Lua programs (essentially in a parent thread controling all the other threads it creates and monitors) the resource usage and inform the VM which metered operations are allowed or not, and the VM should provide such basic metrics: the timeslot which is requested or the amount of memory requested or no longer in use. Nothing more. Now we can design a parent thread to do whatever it wants to apply its own policy for the children threads it creates and controls so none of these children threads can abusely take all the resources needed by other competing children threads (including the parent thread itself which can be protected from being blocked by its children).

Reply | Threaded
Open this post in threaded view
|

RE: Reentrant GC

Tim McCracken
In reply to this post by Pierre Chapuis
Pierre Chapuis wrote:
> I wonder how many companies have re-built basically this, with all the same tricks to interact with C code, event loops...

I suspect a lot! I am currently working on a similar system. I am following a "soft RTOS" model:   A single OS process will handle multiple "jobs", include a "global common" for shared data, and an embedded real-time(ish) database. Each "job" will include a Lua interpreter, prioritized event queue, recurring event scheduler,  and a few other features and additional/alternative standard libraries. Each "task" will be a Lua function.  And lots of value-added functionality - both generalized as well as application specific, which may vary based on the specific product.

I use the Lua interpreter  "out of the box" stock, although I have added some additional libraries. I have been very happy with Lua. I took the time to understand it's limitations, which in reality are very few since it can be easily extended through the API.  IMO, It is much easier to extend and embed than either TCL or Python. It is a great platform to add lots of value-added C/C++ code  and give users the ability to programmatically control functionality rather than just having access to configuration variables and tables - which is the norm for all of the systems I commonly work with. But I don't expect (or want) Lua to offer features that are part of the OS, or a 3GL library. What I really want from Lua is a structured, simple, scripting environment for end users to customize as required - any Lua is outstanding at that.  

There are (of course) language features I would like, many of which get discussed on this list regularly. But, in my case, they can all be accomplished with a few more keystrokes. *** For me the number one concern is the stability of the Lua language. ***   I am more than willing to adapt to, and take advantage of, changes in the C API with each new release. But in my ideal world, the Lua scripts that customers may develop need to work from one release to the next. So going forward, I really hope we ___rarely___ see changes that are not backward compatible.


> Sadly we cannot open source it.
The products I am working on will not be open source, but I am planning to have a fully functional (but limited in size) system for hobbyist/evaluation use at zero cost.
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

aryajur
In reply to this post by Philippe Verdy


What is needed is a way to control in Lua programs (essentially in a parent thread controling all the other threads it creates and monitors) the resource usage and inform the VM which metered operations are allowed or not, and the VM should provide such basic metrics: the timeslot which is requested or the amount of memory requested or no longer in use. Nothing more. Now we can design a parent thread to do whatever it wants to apply its own policy for the children threads it creates and controls so none of these children threads can abusely take all the resources needed by other competing children threads (including the parent thread itself which can be protected from being blocked by its children).

There is a Lua Library to do that: LuaStepper (https://luarocks.org/modules/aryajur/luastepper)
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
May be, but it depends on an external C module implementing the preemptive meachanism (also its "tasks" are limited to scripts written as strings, so they cannot work with Lua functions and closures without first marshalling all parameters and its environment into the script (which will be unmarshalled by the Task engine), and then marshall the results of the task to string using some "print" statement that will be unmarshalled by parsing the output with getTaskData().

So I think it will be quite slow (and not easy to integrate to communicate complex Lua structures made with tables with possible cyclic references)

I think it uses a separate instance of the Lua engine, so these are not "threads" sharing data with other threads and coroutines (they may envtually share the same allocator and heap, but how to pass references to objects in a Lua thread with objects in a Task or in another task ? You also need extra communication channels I think.

What is needed is (I think) a native integration with threads in the same engine instance to benefit of closures and avoid the extra (un)marshalling using possibly very long strings.

The typical usage shown is extremely complex to start with, and I don't understand the concept of "number of steps" in the "runLoop". We probably don't need that or this dramatically reduces the possible usages to some limited scriptable tasks that can run independantly of any context and jsut take some input, run, then return some ouput, without communiating anything with other tasks or normal Lua threads.


Le jeu. 29 nov. 2018 à 23:11, Milind Gupta <[hidden email]> a écrit :


What is needed is a way to control in Lua programs (essentially in a parent thread controling all the other threads it creates and monitors) the resource usage and inform the VM which metered operations are allowed or not, and the VM should provide such basic metrics: the timeslot which is requested or the amount of memory requested or no longer in use. Nothing more. Now we can design a parent thread to do whatever it wants to apply its own policy for the children threads it creates and controls so none of these children threads can abusely take all the resources needed by other competing children threads (including the parent thread itself which can be protected from being blocked by its children).

There is a Lua Library to do that: LuaStepper (https://luarocks.org/modules/aryajur/luastepper)
Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

aryajur
The C module just uses the Lua API and does not do anything OS specific.
    Sorry I do not quite understand the other remarks about having tasks only as strings. Every Lua program is a string to start with. 
   Run loop number of steps is just the number of opcodes you want to execute before returning control to the parent program.
     Communication between tasks is limited to passing tables and strings and not fancy shared objects. Recursive tables however should not be a problem. Communication right now will be managed by the parent task.



On Nov 29, 2018 2:54 PM, "Philippe Verdy" <[hidden email]> wrote:
May be, but it depends on an external C module implementing the preemptive meachanism (also its "tasks" are limited to scripts written as strings, so they cannot work with Lua functions and closures without first marshalling all parameters and its environment into the script (which will be unmarshalled by the Task engine), and then marshall the results of the task to string using some "print" statement that will be unmarshalled by parsing the output with getTaskData().

So I think it will be quite slow (and not easy to integrate to communicate complex Lua structures made with tables with possible cyclic references)

I think it uses a separate instance of the Lua engine, so these are not "threads" sharing data with other threads and coroutines (they may envtually share the same allocator and heap, but how to pass references to objects in a Lua thread with objects in a Task or in another task ? You also need extra communication channels I think.

What is needed is (I think) a native integration with threads in the same engine instance to benefit of closures and avoid the extra (un)marshalling using possibly very long strings.

The typical usage shown is extremely complex to start with, and I don't understand the concept of "number of steps" in the "runLoop". We probably don't need that or this dramatically reduces the possible usages to some limited scriptable tasks that can run independantly of any context and jsut take some input, run, then return some ouput, without communiating anything with other tasks or normal Lua threads.


Le jeu. 29 nov. 2018 à 23:11, Milind Gupta <[hidden email]> a écrit :


What is needed is a way to control in Lua programs (essentially in a parent thread controling all the other threads it creates and monitors) the resource usage and inform the VM which metered operations are allowed or not, and the VM should provide such basic metrics: the timeslot which is requested or the amount of memory requested or no longer in use. Nothing more. Now we can design a parent thread to do whatever it wants to apply its own policy for the children threads it creates and controls so none of these children threads can abusely take all the resources needed by other competing children threads (including the parent thread itself which can be protected from being blocked by its children).

There is a Lua Library to do that: LuaStepper (https://luarocks.org/modules/aryajur/luastepper)

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Philippe Verdy
You don't understand: Lua programs are parsed only once then executed or natively compield from the bytecode only, strings are no longer used except for loading the program the first time, but never when instanciating objects written in that language. There's no marshalling/unmarshaling between Lua functions calls, references are working natively (via indexes in tables stored in the heap or via closures in the stack, it works directly at binary level, and there's never any need to marshall/unmarshall complex data structures or serialize them by building and parsing strings all the time).

I'm not convinced this is the right approach, even when using the LuaC libary, there's a better way to do that without this huge overhead (which also requires lot of heap allocations for these temporary scripts containing everything: the complete source code of all functions the task needs to do, or the code that will use the loader, and all the data they'll work with and that they'll finally return).
We can do that with LuaC by using existing Lua "thread" objects, to which we can attach metadata for adding info about how the threads can operate and can be scheduled.
The LuaC library is needs only to allow creating new threads with theior own Allocator and setup the metadata attached to threads and possibly some "userdata" object to tune the Allocator behavior.

You need however small changes in the Lua bytecode interpret so that it can test a flag indicating that there's a pending interrupt for which the bytecode should stop interpreting continuously, but should insert a call to yield() before continuing with the next instruction. Finally your C library will be used to create the OS-level timer that will just asynchrnously set the interrupt flag, without modifying directyly the state of any running Lua thread. You also need a way for threads to handle some critical sections using some basic test-and-set instruction (this can be implemetned as an unary boolean operator, similar in syntax to "not" except that it must occur before arbitrary expressions but only before a variable name, like with the "local" instruction; this operator can use a new reserved keyword like "testset" or a new lexical token like "?!"; the variable will bet set to a boolean or nil, it does not need a new type, but it will compile a new bytecode instruction which will be atomic and just needs one memory operand or a register index in the Lua closure's window on the stack)

If we don't define this additional bytecode, there's no possibility to synchronize concurrent threads that can preempt the exuecution while working on shared objects, leaving them in inconsistant state (this limits a lot the possility for real multithreading when working with shared data structures, or when performing I/O, so I'm convinced we need some mutex mechanism). Adding thys additional bytecode is a trivial modification to the VM.

However we don't need a pure scheduler with relative priorities between threads. All that is needed is to allow threads to run really concurrently without having them to be modified so they include many "yield()" calls every where in their code or within all their loops or before/after each blocking I/O. 

If Lua is really reentrant as stated in its documentation, then even its blocking I/O libary would be interruptible and reentrant, so that we can force them to yield() in Lua instead of blocking inside the OS thread with its own internal yield to give hand to other processes and OS-native threads (but may be this requires modifying a bit the standard I/O library to force them to yield() in Lua instead of blocking in the native C standard I/O library, or in an OS API used by the native C standard I/O libary.


Le ven. 30 nov. 2018 à 00:22, Milind Gupta <[hidden email]> a écrit :
The C module just uses the Lua API and does not do anything OS specific.
    Sorry I do not quite understand the other remarks about having tasks only as strings. Every Lua program is a string to start with. 
   Run loop number of steps is just the number of opcodes you want to execute before returning control to the parent program.
     Communication between tasks is limited to passing tables and strings and not fancy shared objects. Recursive tables however should not be a problem. Communication right now will be managed by the parent task.



On Nov 29, 2018 2:54 PM, "Philippe Verdy" <[hidden email]> wrote:
May be, but it depends on an external C module implementing the preemptive meachanism (also its "tasks" are limited to scripts written as strings, so they cannot work with Lua functions and closures without first marshalling all parameters and its environment into the script (which will be unmarshalled by the Task engine), and then marshall the results of the task to string using some "print" statement that will be unmarshalled by parsing the output with getTaskData().

So I think it will be quite slow (and not easy to integrate to communicate complex Lua structures made with tables with possible cyclic references)

I think it uses a separate instance of the Lua engine, so these are not "threads" sharing data with other threads and coroutines (they may envtually share the same allocator and heap, but how to pass references to objects in a Lua thread with objects in a Task or in another task ? You also need extra communication channels I think.

What is needed is (I think) a native integration with threads in the same engine instance to benefit of closures and avoid the extra (un)marshalling using possibly very long strings.

The typical usage shown is extremely complex to start with, and I don't understand the concept of "number of steps" in the "runLoop". We probably don't need that or this dramatically reduces the possible usages to some limited scriptable tasks that can run independantly of any context and jsut take some input, run, then return some ouput, without communiating anything with other tasks or normal Lua threads.


Le jeu. 29 nov. 2018 à 23:11, Milind Gupta <[hidden email]> a écrit :


What is needed is a way to control in Lua programs (essentially in a parent thread controling all the other threads it creates and monitors) the resource usage and inform the VM which metered operations are allowed or not, and the VM should provide such basic metrics: the timeslot which is requested or the amount of memory requested or no longer in use. Nothing more. Now we can design a parent thread to do whatever it wants to apply its own policy for the children threads it creates and controls so none of these children threads can abusely take all the resources needed by other competing children threads (including the parent thread itself which can be protected from being blocked by its children).

There is a Lua Library to do that: LuaStepper (https://luarocks.org/modules/aryajur/luastepper)

Reply | Threaded
Open this post in threaded view
|

Re: Reentrant GC

Tim Hill
In reply to this post by Philippe Verdy


> On Nov 29, 2018, at 2:53 PM, Philippe Verdy <[hidden email]> wrote:
>
>
> So I think it will be quite slow (and not easy to integrate to communicate complex Lua structures made with tables with possible cyclic references)
>

We handled that with out own high-performance table pack/unpack logic, including arbitrary graphs of tables, string interning and support for closure passing as well. Developed a pretty efficient algorithm (in C) which ended up giving us all the performance we needed. In fact, we ended up using the packed format (which was a compact binary form) as a network wire format AND a persistence format, so every part of the system saved/exchanged data in one well-defined format.

—Tim



12