Multi-process C calling Lua: sharing data between Lua scripts?


Multi-process C calling Lua: sharing data between Lua scripts?

caco
From C, I have a master process running in parallel with a number of agent processes, each binding a lua_State. I want the agents' Lua scripts to communicate through Lua-supervised (i.e. garbage-collected) data. How can I do this?




Re: Multi-process C calling Lua: sharing data between Lua scripts?

v
On Sat, 2021-02-27 at 04:28 +0000, caco wrote:
> From C I have master process in parallel with number of agent
> processes each binding a luaL_State. I want the agents Lua scripts to
> communicate through Lua supervised (i.e. gc'd) data. How can I do
> this? 
>
You can't directly move GC'd data from one Lua state to another (do not
use lua_xmove; it will break horribly when moving between different Lua
states). You have to read it from one state's stack into some
intermediate form and then recreate it on the other. That is trivial
for strings, numbers and light userdata, much harder for tables and
full userdata.
I'd suggest taking a look at some existing multithreading libraries
for Lua, since they typically include a mechanism for passing data
between states. There are also standalone serialization libraries that
you can use in your application.
--
v <[hidden email]>

Re: Multi-process C calling Lua: sharing data between Lua scripts?

Laurent FAILLIE
Hello,

On Sat, 2021-02-27 at 04:28 +0000, caco wrote:
> From C I have master process in parallel with number of agent
> processes each binding a luaL_State. I want the agents Lua scripts to
> communicate through Lua supervised (i.e. gc'd) data. How can I do
> this? 

You may have a look at my Séléné framework: destroyedlolo/Selene


My smart home automation is fully MQTT-driven and is a mix of C/C++ and Lua, so I needed a fully multi-threaded solution.
In Selene, a new thread can be created when an MQTT message arrives, or you can detach your own processes.
Processes can communicate using shared variables, shared collections of data (potentially multi-valued and timed), messages, and "to do" queues (sharing Lua code between threads is resource-consuming, so a sub-task can queue just a reference to the functions it wants launched on another one: very light).
In addition, there are optional plugins to create dashboards directly on the Linux framebuffer (so X is not needed) or on a tiny OLED screen.

Unfortunately, there is almost no documentation beyond comments in the source code and a large bunch of comprehensive examples.
Obviously, anyone who wants to participate is welcome :)

Bye

Laurent

Re: Multi-process C calling Lua: sharing data between Lua scripts?

Viacheslav Usov
In reply to this post by caco
On Sat, Feb 27, 2021 at 5:29 AM caco <[hidden email]> wrote:
>
> From C I have master process in parallel with number of agent processes each binding a luaL_State. I want the agents Lua scripts to communicate through Lua supervised (i.e. gc'd) data. How can I do this?

Since you said "process", not "thread", it should be noted that in a
modern OS such as (a recent version of) Linux, macOS or Windows,
distinct processes have isolated address spaces and cannot touch memory
in another process, with one exception.

Without the exception, your only option is to serialize and
deserialize data as byte chunks and use some IPC to transport the
chunks between processes.

The exception is shared memory. With it, you get all the complications
of multiple threads and then some more. Depending on how badly you
want your thing, this might be something to consider.
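The serialize-and-IPC route can be sketched in plain C. The chunk format below (a raw double, a 32-bit length, then the string bytes) and the function names are invented for illustration; a real program would pack values read off one lua_State's stack and push the unpacked results onto another.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* Pack a double and a string into buf; returns the number of bytes used. */
static size_t pack_chunk(char *buf, double num, const char *str) {
    uint32_t len = (uint32_t)strlen(str);
    memcpy(buf, &num, sizeof num);
    memcpy(buf + sizeof num, &len, sizeof len);
    memcpy(buf + sizeof num + sizeof len, str, len);
    return sizeof num + sizeof len + len;
}

/* Rebuild the values written by pack_chunk. */
static void unpack_chunk(const char *buf, double *num, char *str) {
    uint32_t len;
    memcpy(num, buf, sizeof *num);
    memcpy(&len, buf + sizeof *num, sizeof len);
    memcpy(str, buf + sizeof *num + sizeof len, len);
    str[len] = '\0';
}

/* Round-trip through a pipe: the child plays the "sending" Lua state,
   the parent the "receiving" one. */
static int roundtrip(double num, const char *str,
                     double *out_num, char *out_str) {
    int fd[2];
    char buf[256];
    if (pipe(fd) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                      /* child: serialize and send */
        close(fd[0]);
        size_t n = pack_chunk(buf, num, str);
        write(fd[1], buf, n);
        _exit(0);
    }
    close(fd[1]);                        /* parent: receive and rebuild */
    read(fd[0], buf, sizeof buf);
    unpack_chunk(buf, out_num, out_str);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

Any pipe, socket or message queue would do in place of pipe(); the point is only that bytes, not GC'd objects, cross the process boundary.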

Cheers,
V.

Re: Multi-process C calling Lua: sharing data between Lua scripts?

Philippe Verdy-2
There are also memory-mapped files. Access control and synchronization across processes are provided by the hosting filesystem, so each process or thread can get a consistent view, but you have to use the filesystem's locking mechanisms for atomic operations.

Mmap'ed memory is very fast (much faster than conventional file I/O, as it is implicitly buffered for at least the size of your mapped file segment).
Caveats:
* If you work on very large data files, moving the memory-mapped window to another location in the file destroys your buffer, consumes a lot of I/O (or the filesystem's shared cache) just to refill it, and requires reserving large amounts of VM space in the process. If many threads do this in the same process, the process memory may explode.
* Exclusive file locking (via the filesystem calls/API) does not work across threads unless the OS provides thread-level isolation in its API (and Lua states are not necessarily mapped to native threads); inside the same Lua app, you will need other locking mechanisms from the Lua machine itself (across its "light" threads). Data serialization is still the way to go to avoid deadlocks caused by taking locks in random order.
* The last alternative is an external database (or a memcached store, for its speed). You just need a connector library to reach the "remote" database or store.
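A minimal sketch of the memory-mapped-file idea, assuming POSIX mmap: two independent MAP_SHARED mappings of the same file observe the same bytes. The path and layout are illustrative, and real cross-process use would add the file locking mentioned above.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `path` twice, write through one view, read through the other. */
static int mmap_roundtrip(const char *path, int value, int *seen) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, sizeof(int)) != 0) { close(fd); return -1; }

    int *a = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int *b = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) { close(fd); return -1; }

    *a = value;                   /* write through the first mapping... */
    *seen = *b;                   /* ...and it is visible through the second */

    munmap(a, sizeof(int));
    munmap(b, sizeof(int));
    close(fd);
    unlink(path);
    return 0;
}
```

Across real processes the two mappings would live in different address spaces, but the coherence guarantee of MAP_SHARED is the same.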

Also be aware of possible privacy or security breaches via caches: implement a cache eviction policy using segregated pools instead of simple LRU-based eviction. This is true for all sorts of caches, including DNS client caches and web caches in browsers or routers. Note also that filesystem caches are NOT secure by default, as they rarely provide an eviction policy with the segregated pools you would want; not doing this exposes your online services to data leaks, without attackers needing to know any secrets in advance.

Unfortunately, almost all modern computing devices, OSes, drivers, application software, and most websites use many levels of caches that are not secured at all (and are not under the control of the clients using them). Most use basic LRU eviction policies; there may be some segregation into multiple pools, but no way to segregate them into application-controlled domains, as the subdivision is usually arbitrary, optimistic, and tuned only for best average global performance, not for security. Those breaches are massively harnessed by advertisers (to abuse our privacy) and by malicious hackers to steal secrets and money, or to gain access to sites even when they are protected by the best firewalls, encryption/authentication/quota mechanisms, or other isolation mechanisms (threads, processes, process groups, containers, virtual machines...) of the OS, possibly implemented in hardware (CPU/GPU/bus controllers, SSD/HDD), all of which have caching mechanisms with overly basic eviction policies, optimized optimistically for speed and average global performance only.

For now the best mitigation is multi-factor authentication, but it is not enough, as attacks also occur between authorized users of a system who are insufficiently sandboxed.

Caches are the worst nightmare in all modern architectures: we depend heavily on them for performance, so it is very hard to isolate them all and to define and implement correct eviction policies without sacrificing a lot of performance or adding a lot of "idle" redundancy to the system being secured. Even then, you pay a huge price in energy, and power-saving strategies will ruin your efforts, because conditional on-demand wake-ups reintroduce variable latency and are themselves a form of cache (except with little or no segregation at all: the system is sleeping or awake, with no application-controlled separation of domains). And still today we train people on the basic LRU mechanism and rarely teach them to be constantly aware of the risks of ALL caches.



Re: Multi-process C calling Lua: sharing data between Lua scripts?

bel
I can share some experience with communicating Lua states:

"shared"
In a program where I have several Lua states within one process, I have a "table" called "shared". In fact it is userdata with __index and __newindex metamethods, but for the Lua programmer it behaves much like a table. You can store strings, numbers and booleans there. You cannot store threads/coroutines, functions or userdata, since they cannot be transferred to another state anyway. Currently it cannot store tables either, but it would be possible to extend it to recursively store non-cyclic tables containing only strings, numbers and booleans (maybe I will do this some time); for now, users can serialize such tables themselves (e.g. to a JSON string). All read/write operations are protected by a mutex, and the data is stored in another Lua state that basically just acts as a hashmap. The code is in production use in a webserver using multiple Lua states in the same process, so it might not fit your application without adaptation. However, it is MIT licensed, so you can take it from there and make whatever modifications you like: https://github.com/civetweb/civetweb/blob/master/src/mod_lua_shared.inl
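The "shared" mechanism described above (every read and write funneled through one mutex into a central store) can be sketched in plain C. In the real mod_lua_shared.inl the backing store is another Lua state; here a small fixed-size key/value array stands in for it, and all names are invented for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 64
#define MAX_LEN     64

/* A stand-in for the central store (a second Lua state in the original):
   fixed-size string key/value pairs, zero-initialized. */
static struct { char key[MAX_LEN]; char val[MAX_LEN]; } entries[MAX_ENTRIES];
static int n_entries = 0;
static pthread_mutex_t store_lock = PTHREAD_MUTEX_INITIALIZER;

/* __newindex analogue: every write takes the mutex. */
static void shared_set(const char *key, const char *val) {
    pthread_mutex_lock(&store_lock);
    for (int i = 0; i < n_entries; i++)
        if (strcmp(entries[i].key, key) == 0) {
            strncpy(entries[i].val, val, MAX_LEN - 1);
            pthread_mutex_unlock(&store_lock);
            return;
        }
    if (n_entries < MAX_ENTRIES) {
        strncpy(entries[n_entries].key, key, MAX_LEN - 1);
        strncpy(entries[n_entries].val, val, MAX_LEN - 1);
        n_entries++;
    }
    pthread_mutex_unlock(&store_lock);
}

/* __index analogue: every read takes the same mutex; NULL means "nil". */
static const char *shared_get(const char *key) {
    const char *result = NULL;
    pthread_mutex_lock(&store_lock);
    for (int i = 0; i < n_entries; i++)
        if (strcmp(entries[i].key, key) == 0) { result = entries[i].val; break; }
    pthread_mutex_unlock(&store_lock);
    return result;
}
```

In the Lua binding, `shared_set` would be called from __newindex and `shared_get` from __index, so script code just reads and writes `shared.key`.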

"lsqlite"
Another option is to share data through a database used by several processes. This can be done from Lua using "lsqlite3", or any other database binding.
Again you can share strings, numbers and booleans, and you can use transaction-based read/write operations on complete records (tables with suitable elements).
It works across different processes and also offers persistence: when all your Lua processes are shut down, the data is still in the database.
Depending on the database binding, you may even exchange data between processes running on different hosts.
From my experience, whether users like this depends on their background. A database is a "strange" element if you are programming pure Lua: you can wrap these operations somehow, but they do not "feel" as natural to a Lua programmer as a table (as in "shared" above).

"lsh"
I also created a module to access shared memory from Lua (Linux only, but it would be easy to add Windows support). A shared memory segment can be used to exchange data between Lua states in different processes on the same machine. When sharing between processes owned by different users on the same machine, you need to take care to set the access rights correctly; this can be bothersome but is doable. I used shared memory to exchange data between Lua processes and C/C++ processes, i.e. between a language with dynamic typing (Lua) and one with static typing (C). For shared memory you need to define a static layout and work with address offsets within it. You also need to take into account that C does not have a "string" in the same sense as Lua does; instead it uses a character array of fixed size, and the string can never grow beyond that.
Even if you only share between Lua and Lua, you still need to know where (at what offset) to put each element: you still have to use a fixed, static memory layout.
You cannot store "whatever you like", only what the shared memory layout provides; it is not like a table where you can add new elements at will, and no duck typing is possible. Using low-level shared memory addressing functions directly requires some additional training for a Lua programmer.
I did not use it for "Lua to Lua", only for "Lua to C", with a static layout predefined as a C structure.
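The fixed, static layout this approach requires can be illustrated with anonymous shared memory (MAP_SHARED | MAP_ANONYMOUS plus fork, standing in here for whatever named shared memory the module actually uses; that substitution is an assumption of the sketch). The struct is the layout both sides must agree on, including the fixed-size C string.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* The agreed-upon static layout: fixed offsets, fixed-size C string. */
struct layout {
    double reading;
    char   label[32];             /* unlike a Lua string, this can never grow */
};

/* The child plays the writer process; the parent reads the result
   after waitpid() guarantees the writes are complete. */
static int fill_from_child(struct layout *out) {
    struct layout *shm = mmap(NULL, sizeof *shm, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shm == MAP_FAILED) return -1;
    pid_t pid = fork();
    if (pid < 0) { munmap(shm, sizeof *shm); return -1; }
    if (pid == 0) {
        shm->reading = 21.5;
        strncpy(shm->label, "sensor-1", sizeof shm->label - 1);
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    *out = *shm;
    munmap(shm, sizeof *shm);
    return 0;
}
```

With unrelated processes you would use shm_open() and a name instead of MAP_ANONYMOUS plus fork, but the fixed-struct discipline is identical.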

files
So unspectacular that I almost forgot it: of course you can use files to share data between Lua states.
Not really a "high performance" solution, but it works out of the box without any additional C library.

Combining stuff:
From my experience, an important criterion is the type system you need to support.
"shared" behaves like a Lua table.
"lsqlite" behaves like a database - you define a table structure and add rows.
"lsh" (and probably any other shared memory solution) behaves like a fixed C data structure.
If you are fine with more or less fixed data structures, you can go with a database or shared memory.
Variable data structures work better with an approach like "shared". It is currently limited to Lua states in the same process, but it could be adapted to multiple processes by combining it with some interprocess communication mechanism: all the "read" and "write" operations in "shared" could be sent through domain sockets (Linux) or a named pipe (Windows), or any other IPC mechanism, to a process holding all the data (the "shared" state). This keeps the "look and feel" of a Lua table without any need to predefine a table structure.
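The closing idea above (forwarding every read/write of the "shared" table over IPC to one process that owns the data) can be sketched with a Unix socketpair. The one-request protocol and the toy lookup below are purely illustrative.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child plays the process that owns the shared state and answers
   one GET request; the parent is a client state asking for a key. */
static int query_owner(const char *key, char *val, size_t vlen) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                          /* owner process */
        char req[64];
        ssize_t n = read(sv[1], req, sizeof req - 1);
        req[n > 0 ? n : 0] = '\0';
        /* toy lookup: the only key this owner knows is "greeting" */
        const char *answer = strcmp(req, "greeting") == 0 ? "hello" : "";
        write(sv[1], answer, strlen(answer));
        _exit(0);
    }
    write(sv[0], key, strlen(key));          /* client: send the key... */
    ssize_t n = read(sv[0], val, vlen - 1);  /* ...and wait for the value */
    val[n > 0 ? n : 0] = '\0';
    waitpid(pid, NULL, 0);
    return 0;
}
```

A real implementation would frame messages properly and loop in the owner; hidden behind __index/__newindex, script code would never see the socket.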



Re: Multi-process C calling Lua: sharing data between Lua scripts?

caco
Thanks all. I'm inclined to go for a database solution with Redis. My main reason is that I need to keep control logic (C side) separate from application logic (Lua side).
Thanks again.




Re: Multi-process C calling Lua: sharing data between Lua scripts?

John Logsdon
The database solution is not immediately useful where there are Lua tables
etc. SQLite can store strings, binary blobs, floats and integers, but not a
structure: anything that needs to be stored as a Lua table has to go in as a
string to be parsed. You could use a number of SQL tables to store the
individual Lua tables, but then that means a number of synchronized reads.
Lua (I am using the latest LuaJIT, and the code is pure Lua, no C) is pretty
fast at interpreting a line of data.

Userdata storage has much the same structure problem; there is a Lua
routine to write userdata, so I don't need to get my hands dirty!

Memory mapping is possible, but you have to map it into each instantiation,
i.e. each thread. I am using Lanes, which generates threads on the fly, so
that would be expensive.

As mine is a write-once, read-many situation, I could use zlib to compress
each line first, but I haven't bothered with that yet.

I eventually concluded that simplest is best and used tmpfs (on Linux) to
store the data as a file in memory, read line by line and parsed as a
mixture of numbers, strings and tables of numbers. There was very little
performance hit from not gulping the whole file in one go, and as the data
are processed line by line, little point in doing so.

Just ensure you don't fill the tmpfs (generally about 10% of physical
memory).
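The write-once, read-many scheme above (one record per line, parsed as a mix of numbers and strings) looks roughly like this in plain C; the field layout and file path are invented for the sketch, and on Linux a tmpfs path such as /dev/shm would give the in-memory behavior described.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write two one-line "records" (number then string), read them back
   line by line, and parse each with sscanf. */
static int parse_lines(const char *path, double *sum, char *last_name) {
    FILE *f = fopen(path, "w");
    if (!f) return -1;
    fputs("1.5 alpha\n2.5 beta\n", f);
    fclose(f);

    f = fopen(path, "r");
    if (!f) return -1;
    char line[128];
    *sum = 0.0;
    while (fgets(line, sizeof line, f)) {
        double x;
        char name[64];
        if (sscanf(line, "%lf %63s", &x, name) == 2) {
            *sum += x;               /* accumulate the numeric field */
            strcpy(last_name, name); /* remember the last string field */
        }
    }
    fclose(f);
    remove(path);
    return 0;
}
```

The Lua side would do the same with io.lines() and string.match, which is what makes the line-by-line format so cheap to consume.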

HTH

> Thanks all. I'm inclined to go for a database solution with redis. My main
> reason is that I need to keep control logic (C side) separate from
> application logic (Lua side).
> Thanks again.
>
> Sent with [ProtonMail](https://protonmail.com) Secure Email.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Saturday, 27 February 2021 19:00, bel <[hidden email]> wrote:
>
>> I can share some experience with communicating Lua states:
>> "shared"
>> In a program where I have several Lua states within one process, I have
>> a "table" called "shared" - in fact it is "userdata" with __index and
>> __newindex metamethods, but for the Lua programmer it behaves similar to
>> a table. You can store strings, numbers and booleans there. You cannot
>> store threads/coroutines, functions or userdata - they cannot be
>> transferred to another state anyway. Currently it cannot store tables,
>> but it would be possible to extend it to recursively store non-cyclic
>> tables containing only strings, numbers and booleans (maybe I will do
>> this some time). Currently users can serialize such tables themself
>> (e.g. to a JSON string or whatever). All read/write operations are
>> protected by a mutex, and the data is stored in another Lua state that
>> basically only acts as a hashmap here. The code is in production use in
>> a webserver using multiple Lua states in the same process, so it might
>> not fit to your application without adaptations. However, it is MIT
>> licensed, so you can take it from there and make whatever modifications you like:
>> https://github.com/civetweb/civetweb/blob/master/src/mod_lua_shared.inl
>> "lsqlite"
>> Another option to share data is through a database used by several
>> processes. This can be done from Lua using "lsqlite3" ... or any other
>> database binding.
>> Again you can share string, numbers and booleans. You can use
>> transaction based read/write operations on complete records (tables with
>> proper elements).
>> It will work with different processes, and also offer persistency: when
>> all your Lua processes are shut down, the data is still in the database.
>> Depending on the database binding, you may even exchange data between
>> processes running on a different host.
>> From my experience, it depends on the background of the users whether or
>> not they will like this. A database is a "strange" element if you are
>> programming pure Lua - you can wrap these operations somehow, but they
>> do not "feel" as natural to a Lua programmer as a table (as in
>> "shared" above).
>>
>> "lsh"
>> I also created a module to access shared memories in Lua (Linux only,
>> but it would be easy to add Windows support). A shared memory can be
>> used to exchange data between Lua states in different processes on the
>> same machine - when sharing between processes owned by different users
>> on the same machine, you need to take care to set the user access rights
>> correctly - this might be bothersome but doable. I used shared memory to
>> exchange data between Lua processes and C/C++ processes - between a
>> language with dynamic typing (Lua) and static typing (C). For a shared
>> memory, you need to define a static memory layout - you need to work
>> with address offsets in this shared memory. You also need to take into
>> consideration that C does not have a "string" in the same sense as Lua
>> does - instead it uses a character array with a fixed size, and the
>> string can never grow larger than that.
>> If you only share between Lua and Lua, you would still need to know
>> where (what memory offset) to put what element - you still have to use a
>> fixed, static memory layout.
>> You cannot store "whatever you like", but only what has been provided in
>> the shared memory layout - it is not like a "table" where you can add
>> new elements as you like. You cannot do any duck typing with this
>> solution. Using low level shared memory addressing functions directly
>> requires some additional training for a Lua programmer.
>> I did not use it for "Lua to Lua", but only for "Lua to C", with a
>> static memory layout predefined as a C structure.
>>
>> files
>> So unspectacular, I almost forgot about it: Of course you can use files
>> to share data between Lua states.
>> Not really a "high performance" solution, but works out of the box
>> without any additional C library.
>>
>> Combining stuff:
>> From my experience, an important criterion is the type system you need
>> to support.
>> "shared" behaves like a Lua table.
>> "lsqlite" behaves like a database - you define a table structure and add
>> rows.
>> "lsh" (and probably any other shared memory solution) behaves like a
>> fixed C data structure.
>> If you are fine with more or less fixed data structures, you can go with
>> a database or a shared memory.
>> Variable data structure works better with an approach similar to
>> "shared" - it's currently limited to Lua states in the same process, but
>> that could be adapted to work with multiple processes by combining it
>> with some interprocess communication mechanism. All the "read" and
>> "write" operations in "shared" could be sent through domain sockets
>> (Linux) or a named pipe (Windows) - or any other IPC mechanism, to a
>> process holding all data (the "shared" state). This will keep the "look
>> and feel" of a Lua table without any need to predefine any table
>> structure.
>>
>> On Sat, Feb 27, 2021 at 3:48 PM Philippe Verdy <[hidden email]> wrote:
>>
>>> There are also memory-mapped files. The access control and
>>> synchronization across processes being offered by the hosting
>>> file-system, each process or thread can get a consistent view. But you
>>> have to use filesystem's locking mechanisms for atomic operations.
>>>
>>> Mmap'ed memory is very fast (much more than conventional file I/O as
>>> they are implicitly buffered for at least the size of your mapped file
>>> segment).
>>> Caveats:
>>> * if you have to work on very large datafiles, moving the window mapped
>>> in memory to another location of the file would destroy your buffer and
>>> would consume a lot of I/O (or the default shared cache of the
>>> filesystem) just to refill it, and would require adding large VM space
>>> to the process. If you have many threads doing this in the same
>>> process, the process memory may explode.
>>> * exclusive file locking (with the filesystem calls/API) does not work
>>> across threads, unless the OS provides isolation level with calls/API
>>> at thread level (and Lua states are not necessarily mapped to a native
>>> thread); inside the same Lua app, you will need other locking
>>> mechanisms from the Lua machine itself (across its "light" threads).
>>> Using data serialization is still the way to go to avoid dead
>>> interlocking situations for atomic operations using locks in random
>>> order
>>> * the last alternative is to use an external database (or a mcached
>>> store for its speed). You just need a connector library to connect to
>>> the "remote" database or store.
>>>
>>> And be aware of possible breaches of privacy or security on caches
>>> (i.e. implement a cache eviction policy, using segregated pools,
>>> instead of using simple LRU-based eviction; this is true for all sorts
>>> of caches, including DNS client caches, web caches in browsers, or in
>>> routers; note also that the file system caches are NOT secure by default
>>> as they rarely provide a cache eviction policy with the segregated pools
>>> you'd want; not doing this exposes your online services to data leaks,
>>> without knowing secrets in advance).
>>>
>>> Unfortunately all modern computing devices, OSes, drivers, application
>>> software, and most websites you visit use many levels of caches that are
>>> not secured at all (and not in the control of the clients using them),
>>> most of them with basic LRU eviction policies (there may be some
>>> segregation into multiple pools, but no way to segregate them into
>>> application-controlled domains, as the subdivision is most often
>>> arbitrary, only optimistic, and tuned only for best average global
>>> performance, not at all for security). Those breaches are massively
>>> harnessed by advertisers (to abuse our privacy) and by bad hackers to
>>> steal secrets and then money, or to gain access to sites even when they
>>> are secured by the best firewalls, the best
>>> encryption/authentication/quota mechanisms or other isolation mechanisms
>>> (threads, processes, process groups, containers, virtual machines...) of
>>> the OS (possibly implemented by hardware in CPU/GPU/bus controllers or
>>> SSDs/HDDs, all of which have caching mechanisms with too-basic eviction
>>> policies, as they are clearly optimized optimistically, only for speed
>>> and global average performance).
>>>
>>> For now the best solution is to use multifactor authentication, but
>>> it's not enough, as attacks also exist across authorized users of the
>>> system who are insufficiently sandboxed.
>>>
>>> Caches are the worst nightmare in all modern architectures: we depend
>>> heavily on them for modern performance, so it's very hard to isolate
>>> them all and to define and implement the correct eviction policies
>>> without sacrificing a lot of performance or adding a lot of "idle"
>>> redundancy to the system being secured. Even if you do that, you'll pay
>>> a huge price in energy, and power-saving strategies will ruin all your
>>> efforts, because you'll reintroduce variable latency for conditional
>>> on-demand wake-ups, which are also a form of cache (except that there's
>>> little or no segregation at all; the system is sleeping or awake and
>>> offers no application-controlled separation of domains)! And even
>>> today, we continue to train people with the basic LRU mechanism and
>>> never teach them to be constantly aware of the risks of ALL caches.
>>>
>>> Le sam. 27 févr. 2021 à 14:22, Viacheslav Usov <[hidden email]> a
>>> écrit :
>>>
>>>> On Sat, Feb 27, 2021 at 5:29 AM caco <[hidden email]>
>>>> wrote:
>>>>>
>>>>> From C I have a master process in parallel with a number of agent
>>>>> processes, each binding a lua_State. I want the agents' Lua scripts
>>>>> to communicate through Lua-supervised (i.e. gc'd) data. How can I do
>>>>> this?
>>>>
>>>> Since you said "process", not "thread", it should be said that in a
>>>> modern OS such as (a recent version of) Linux, MacOS and Windows,
>>>> distinct processes have isolated memory spaces and cannot touch memory
>>>> in another process, with one exception.
>>>>
>>>> Without the exception, your only option is to serialize and
>>>> deserialize data as byte chunks and use some IPC to transport the
>>>> chunks between processes.
>>>>
>>>> The exception is shared memory. With it, you get all the complications
>>>> of multiple threads and then some more. Depending on how badly you
>>>> want your thing, this might be something to consider.
>>>>
>>>> Cheers,
>>>> V.


Re: Multi-process C calling Lua: sharing data between Lua scripts?

Oliver Kroth
In reply to this post by caco
You may like to have a look at Luaproxy by Marc Balmer. It works similarly to "shared" by letting a userdata look like a table.

Oliver
