Is it possible to run Lua threads in OS threads safely?


Re: Why one would need mtstates/mtmsg?

Oliver Schmidt
A minor correction:

On 18.06.20 11:34, Oliver wrote:
> mtstates.newstate() gives a state referencing object that owns the mtstate
> object. Another system thread or mtstate object (more precisely: Lua main state)
> can only access this mtstate object by using mtstates.state() using the state's
> name or id.

This should better be:

mtstates.newstate() gives a state referencing object that owns the mtstate
object. Another system thread or mtstate object (more precisely: Lua main state)
should only access this mtstate object by using mtstates.state() using the
state's name or id.

(however it would be possible to get an owning reference to this state by
invoking mtstates.singleton() with the state's name).

Best regards,
Oliver

Re: Why one would need mtstates/mtmsg?

Andrea-2
In reply to this post by Oliver Schmidt
Thank you very  much for your clarifications!

==== for mtmsg

Let me double check something:

and one or more buffers where old messages are replaced (using buffer:setmsg()).

Does setmsg replace the latest/newest or the oldest msg in the buffer? 
(from the example on GitHub I would say the latest/newest) 

buffer and displays a volume meter. Old volume values are not needed in this use

This is really a great example. I would include it in the documentation. It completely justifies the concept of a "listener" for multiple buffers, because buffers can be fed in different ways by writers. Plus what you mention in your reply: the possibility to abort operations on a specific buffer, interrupting the corresponding worker/writer while the others continue undisturbed.

world you have buffer referencing objects. The Blocking/NonBlocking flag is
stored in the buffer referencing object and is evaluated  and considered if you

Thank you for confirming. If possible, I would suggest a guide for dummies (like me) in which you simplify the concepts. As an example: instead of "buffer referencing object" one could write "handle" (as for files in C) or "reference".

So the guide for dummies could say: "the 'blocking' flag is associated with the handle, not with the buffer. Handles to the same buffer will have their own flag to modify the behavior of the invoked method".
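To make the dummies-guide point concrete, here is a minimal sketch in Lua, assuming the lua-mtmsg API (mtmsg.newbuffer, mtmsg.buffer, buffer:nonblock) as I read it from its documentation:

```lua
local mtmsg = require("mtmsg")

-- One underlying buffer, two referencing objects ("handles") to it.
local b1 = mtmsg.newbuffer("jobs")   -- owning handle
local b2 = mtmsg.buffer(b1:id())     -- second handle to the same buffer

b2:nonblock(true)  -- the flag lives in the handle: only b2 becomes non-blocking

b1:addmsg("hello")
b2:nextmsg()       -- returns the pending message immediately
b2:nextmsg()       -- buffer empty: returns immediately instead of blocking
-- b1:nextmsg() would block here, since b1's own flag is still "blocking"
```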

The functions mtmsg.abort() / buffer:abort() / listener:abort() can be used to
interrupt/cancel operations from other threads. These methods are setting an

Now it's clear, thank you very much. 

That raises a question: after I issue an 'abort' operation, is it possible to remove all the messages from the buffer so that the listener/reader will never process them when I resume regular operation by setting abort to false?
 
referencing this listener behind the scene. So the underlying listener object
can only be destructed if the last buffer is destructed.

Thank you for clarifying. It seems correct, given that one cannot have multiple listeners listening to the same buffer, right? I realized this only on re-reading the documentation.

I incorrectly assumed that not only can a given listener have multiple buffers, but also that a given buffer can have multiple listeners. If that were the case, then the buffer would have to keep a reference to the listener so that the listener can be destructed only after all referencing buffers, and the listener would have to keep references to its buffers so that a buffer can be destructed only once all attached listeners are destructed. The latter is not possible in the current implementation: a given buffer can be attached to only one listener, therefore the listener does not need to store references to its buffers. (A little convoluted, but I hope my reasoning is clear.)
 
I'm going to rethink the current solution since I now have the impression that
this could be improved (but I'm not sure). Again thank you for your feedback!

I think the existing implementation is good! Keeping things simple is good.

If I look at buffers as if they were queues/lists, I would want some method to put a message in front of the others that may already be present (this could be useful for 'urgent' messages with some higher priority - but one could use independent buffers for that). Also, removing messages already added could be useful.
 
by intention does not serialize tables. There are several libraries that can

I totally agree. What I meant is that this is worth mentioning in the documentation, so that if one wonders how to send a table, the answer is "serialize it", or just write a constructor in a string and send the string.
 
incrementally build up a larger message without locking. Then if the message is

Do you mean that instead of sending a tuple of arguments, one builds the tuple one element at a time?
(Especially because of the efficiency gain in lock time.)

Will it be easy to manage the situation where multiple writers are building a long message in the same buffer, simultaneously?
 
==== now for mtstates

- mtstates.newstate() always returns a state referencing object to a *new* state
even if there is already a state with the same name. So it's valid to have more
:
- mtstates.singleton() returns a state referencing object to a new state if the
state by this name does not exist and returns a state referencing object to an
existing state if the state by this name does exist.
some code in a larger framework or whatsoever where you cannot (or do not want
to) control the setup of everything and your code runs in different threads but
you want to have shared data/state you could use mtstates.singleton() from these

This is really clear. Thank you very much. I hope you can add this to the documentation on GitHub. I think this could really help.

is constructed by the invocation of mtstates.singleton() with the name "foo".
Then within this singleton state mtstates.singleton() is invoked with the name

Let me double check to see if I understand correctly: one can use singleton to be sure to own a reference and be sure the state is not garbage collected.
The thing to avoid is to call singleton twice with the same name from the same thread, right? 
But it should be safe to call singleton once from each thread, right?

(I hope I got it right; otherwise I may be missing something important. I am new to Lua and I am not sure I fully grasp what a "state" is. For now I am thinking it is something that holds the global environment, but I am sure there is more to it.)
 
mtstates.interrupt() is easier whereas mtint provides more functionality.

Thank you very much. This could be added to the documentation so that future readers/users (like me) can benefit from these notes.

--
Andrea Vitali







Re: Is it possible to run Lua threads in OS threads safely?

pocomane
In reply to this post by Andrea-2
On Wed, Jun 17, 2020 at 10:07 PM Norman Ramsey wrote:
>   - To pass table values between threads, I wound up writing my own
>     serializer/deserializer.

On Thu, Jun 18, 2020 at 12:09 AM Andrea wrote:
> Yes I guess serialization (of tables) is one of the most required functionalities :) It would be useful to have such a method in the table library for that.... I guess it is not there because of the choices one can make (shallow vs deep copies, etc - the mesh of references can be really complex)

In a recent discussion I discovered a LuaProc extension that lets you
pass tables between threads. Maybe you can give it a try:
https://github.com/lmillanfdez/luaproc-master.

Re: Is it possible to run Lua threads in OS threads safely?

Andrea-2

In a recent discussion I discovered a LuaProc extension that lets you
pass tables between threads. Maybe you can give it a try:
https://github.com/lmillanfdez/luaproc-master.

It seems there is no documentation of this additional feature, at least in the Readme.md.

How did you discover this? Have you tried it? Features and limitations?

    Andrea

--
Andrea Vitali







Re: Is it possible to run Lua threads in OS threads safely?

Norman Ramsey
In reply to this post by Andrea-2
 > I have noticed that you write "mostly reliable" followed by the scary note
 > "occasionally computations lock up in mysterious ways"... do you mean
 > dead-locks?

Yes.  Computation halts.  I'm running on Linux, and I have to
terminate the computation by sending INT to the process.  To get it to
stop properly, I have to send INT twice.

 > I also noticed that you mention upvalues as a way to pass data between
 > threads, but based on the documentation one could also use the send/receive
 > methods, right?

Yes.  I am using send and receive extensively.

 > (searching the internet, I found that someone mention the fact that if
 > there is a table as an upvalue then newproc fails, and this may not be
 > immediately clear to the developer, is this correct?)

It definitely fails.  I seem to remember that the error was relatively
easy to diagnose.

 > Why do you write that the concurrency model is at the same time "simple"
 > but "it took a while to figure out how to use it effectively"? It seems
 > it is not simple after all, or maybe it is "too simple" and you bumped
 > into some unintended behavior?

I am accustomed to a richer concurrency model somewhat in the style of
Tony Hoare's CSP or John Reppy's Concurrent ML.  What took time was
figuring out how to use luaproc's model effectively even though it
lacks features that I was accustomed to use.

 > Yes I guess serialization (of tables) is one of the most required
 > functionalities :) It would be useful to have such a method in the table
 > library for that.... I guess it is not there because of the choices one can
 > make (shallow vs deep copies, etc - the mesh of references can be really
 > complex)

I've created a github gist that has my implementation of serialization
and also of a "work crew" abstraction.  There are undoubtedly missing
dependencies, but some of it may be useful.  Documentation below.

https://gist.github.com/nrnrnr/0cc1d8848ffae7f99f8ed2b332f19124


Norman


=============== Overview of module luaproc.workcrew ===============

Completes a job of 'work', where work is an abstract type.
A job is represented by a table containing these fields:

  { work      : work list
  , unpack    : function(work) returns scalar, ...  -- defaults to table.unpack
  , state     : a value pickle option -- defaults to empty table

  , worker    : function pickled   -- take result of unpack, return scalar list
  , collector : function pickled   -- merge scalar list from worker into state

  , nworkers  : int option         -- how many worker threads to create
  , uid       : string             -- unique identifier
  , status    : function(work)     -- optional, for side effect
  }


Function types:

  worker    : function (work) returns results
  collector : function(state, results)
  status    : function(work)

If I write the unpickling operation with vertical bars,
workcrew.run(job) is equivalent to the following

  local state = |job.state|
  for _, w in ipairs(job.work) do -- parallel
    job.status(w)                                    -- atomic
    local results = { |job.worker|(job.unpack(w)) }  -- not atomic
    |job.collector|(state, table.unpack(results))    -- atomic
  end
  return unpickle(pickle(state))

The first result returned by job.unpack must not be `false`.

The job and state tables will be copied multiple times, but only one
thing is mutated: the state table associated with the collector
thread.  

Picklers and unpickler can be found in luaproc.serialize.
===================================================================

luaproc.workcrew.run = function(job) returns state
  As described in the overview, create a work crew, run the job's work,
  tear down the crew, and return the final state:
   
    local state = |job.state|
    for _, w in ipairs(job.work) do -- parallel
      job.status(w)                                    -- atomic
      local results = { |job.worker|(job.unpack(w)) }  -- not atomic
      |job.collector|(state, table.unpack(results))    -- atomic
    end
    return unpickle(pickle(state))
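As an illustration only, here is a hypothetical job table for this API (the field names follow the overview above, but the pickled strings and the exact calling convention are my guess and untested):

```lua
local workcrew = require("luaproc.workcrew")

-- Hypothetical job: square each work item in the workers, sum in the collector.
local state = workcrew.run {
  work      = { {1}, {2}, {3}, {4} },  -- unpacked by the default table.unpack
  state     = { sum = 0 },
  worker    = "return function (n) return n * n end",
  collector = "return function (state, sq) state.sum = state.sum + sq end",
  nworkers  = 2,
  uid       = "square-sum-demo",
}
-- If the sketch is right, state.sum ends up as 1 + 4 + 9 + 16 = 30.
```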


================================================================


luaproc.serialize.ofstring = function(string) returns value
  Undoes tostring, so ofstring(tostring(v)) is isomorphic to v.

luaproc.serialize.pickle = synonym for tostring

luaproc.serialize.picklechunk = function(string) returns pickle
  Pickles a sequence of statements ending in return.
  When unpickled, produces the value returned.

luaproc.serialize.picklemod = function(string) returns pickled value
  Given string x.y.z, returns "(require 'x.y')['z']", used
  to pickle a function that lives in a module.
   
  Also accepts arguments function(modname, membername)

luaproc.serialize.tostring = function(value) returns string
  Return a string that, when fed to ofstring,
  reconstructs an isomorphic value.
  Input must be composed of scalars and tables,
  no functions or userdata.

luaproc.serialize.unpickle = synonym for ofstring

Re: Is it possible to run Lua threads in OS threads safely?

Steven Johnson

>  > I have noticed that you write "mostly reliable" followed by the scary note
>  > "occasionally computations lock up in mysterious ways"... do you mean
>  > dead-locks?
>
> Yes.  Computation halts.  I'm running on Linux, and I have to
> terminate the computation by sending INT to the process.  To get it to
> stop properly, I have to send INT twice.

At some point I had to switch the if statement at https://github.com/askyrme/luaproc/blob/master/src/lpsched.c#L341 to a loop.
It was long ago, but I wonder if it's not related; your problem rings a bell, anyway.

Re: Is it possible to run Lua threads in OS threads safely?

pocomane
In reply to this post by Andrea-2
On Fri, Jun 19, 2020 at 5:58 PM Andrea wrote:
> It seems there is no documentation of this additional feature... at least on the Readme.md
> How did you discover this? Have you tried it? Features and limitations?

Oops, sorry, I messed up the links. However, I discovered it in a
recent thread of this mailing list, so maybe it is better to refer
directly to it: https://marc.info/?t=158506769500001&r=1&w=2 . I still
cannot find the time to properly test it.

Re: Is it possible to run Lua threads in OS threads safely?

pocomane
On Fri, Jun 19, 2020 at 7:05 PM pocomane <[hidden email]> wrote:
> Oops, sorry, I messed up the links. However, I discovered it in a
> recent thread of this mailing list, so maybe it is better to refer
> directly to it: https://marc.info/?t=158506769500001&r=1&w=2 . I still
> cannot find the time to properly test it.

Oops again. And sorry again for the double post, but I have to report a
severe failure in my memory (the one in my brain, I mean).

The errata:
1) The first link I posted [1] does actually contain the code for
passing tables, but, as noted by Andrea, there is no reference to
the changes in the Readme.
2) I fought with it some days ago and ended up forking the code [2].
3) I performed some VERY minimal tests on table passing, which seem to
work. But I would NOT use it in production without further extensive
tests.

If you want to give it a quick try, I have a static lua release that
includes this LuaProc extension [3].

---
[1] https://github.com/lmillanfdez/luaproc-master
[2] https://github.com/pocomane/luaproc-extended . It is forked from
[1]; I just added a reference to the L. Millan thesis and a function
I needed for some tests: it reports the number of free/running threads.
[3] Single binary containing lua and a handful of libraries; linux,
windows and mac packages in the Release section:
https://github.com/pocomane/lua_static_battery

Re: Why one would need mtstates/mtmsg?

Oliver Schmidt
In reply to this post by Andrea-2
Hi,

On 19.06.20 00:16, Andrea wrote:
> ==== for mtmsg
>
> Does setmsg replace the latest/newest or the oldest msg in the buffer? 
> (from the example on GitHub I would say the latest/newest)

As the documentation says: "Sets the arguments together as one message into the
buffer. All other messages in this buffer are discarded." So all messages in the
buffer, including both the newest and the oldest, are discarded.

> This is really a great example. I would include it in the documentation.

Perhaps examples and things like this could be put into another document to keep
the reference documentation as compact as possible.

> Thank you for confirming. If possible, I would write a guide for dummies (like
> me) in which you simplify the concept. As an example: instead of "buffer
> referencing object" one can write "handle" (as for files in C) or "reference".

I'm not sure which wording is best; also, I'm not a native English speaker...
"handle" is perhaps also correct, although with a handle I would associate
something more basic, like an integer id that doesn't have methods. The buffer
referencing objects are at least some kind of objects, and in daily usage of
mtmsg you need not always strictly distinguish between a reference to an object
and the object itself. buffer:id() gives the integer id of the buffer and
mtmsg.buffer(id) gives an object for using the buffer.
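For example (a sketch, using only the functions named above):

```lua
local mtmsg = require("mtmsg")

local b  = mtmsg.newbuffer("log")
local id = b:id()            -- a plain integer: easy to hand to another thread

-- Elsewhere (e.g. in another Lua state), turn the id back into an object:
local b2 = mtmsg.buffer(id)
b2:addmsg("started")         -- goes into the same underlying buffer as b
```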

> That raises a question: after I issue an 'abort' operation, is it possible to
> remove all the messages from the buffer so that the listener/reader will never
> process them when I resume regular operation by setting abort to false?

Yes, you can simply invoke buffer:clear() or listener:clear() even if the buffer
or listener is aborted.
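So the abort-then-flush sequence from the question could look like this (a sketch, assuming buffer:abort() takes a boolean flag as in the mtmsg documentation):

```lua
local mtmsg  = require("mtmsg")
local buffer = mtmsg.newbuffer("work")

buffer:addmsg("stale1")
buffer:addmsg("stale2")

buffer:abort(true)    -- interrupt readers/writers blocked on this buffer
buffer:clear()        -- drop the stale messages while still aborted
buffer:abort(false)   -- resume regular operation with an empty buffer
```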

>     I'm going to rethink the current solution since I now have the impression that
>     this could be improved (but I'm not sure). Again thank you for your feedback!
> I think the existing implementation is good! Keeping things simple is good.

I already improved something, but yes I'm trying to keep it simple. The listener
still does not own the buffer (and I'm not planning to change this) but if a
buffer is garbage collected the remaining messages of this buffer are still
delivered to the listener. This is already pushed to the github repo ;-)

> If I look at buffers as if they were queues/lists, I would want some method to
> put a message in front of the others that may already be present (this could be
> useful for 'urgent' messages with some higher priority - but one could use
> independent buffers for that). Also, removing messages already added could be
> useful.

IMHO this would make things too complex for a simple library. You can easily
build something like this on top of mtmsg and mtstates using pure Lua. You
could, in one mtstates object, organize messages in maps and lists and re-order
them or anything else. This would be similar to Lanes' Linda objects: as far as
I understand the documentation, the Lindas have a hidden Lua state in the
background.
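A minimal pure-Lua sketch of the "urgent messages" idea on top of mtmsg, using two buffers instead of one prioritized queue (the nonblock call and the timeout argument to nextmsg are assumptions based on the mtmsg documentation):

```lua
local mtmsg  = require("mtmsg")
local urgent = mtmsg.newbuffer("urgent")
local normal = mtmsg.newbuffer("normal")
urgent:nonblock(true)  -- so we can poll it without waiting

-- Reader side: always drain urgent messages before normal ones.
local function nextPrioritized()
  local msg = urgent:nextmsg()   -- returns immediately if nothing is pending
  if msg ~= nil then
    return msg
  end
  return normal:nextmsg(0.1)     -- otherwise wait briefly for normal work
end
```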

>     incrementally build up a larger message without locking. Then if the message is
> Do you mean instead of sending a tuple of arguments, build a tuple one element
> at a time?

I mean that you could incrementally add elements to the message. Once you are
finished, the message is sent. The receiver could receive the message in one
call like now and get all arguments on the stack (there is a limit in Lua for
doing this), or the receiver could also read the message elements incrementally
(by providing an optional integer argument for how many elements are retrieved
from the message in one call).

For example:
   buffer:addmsg("foo1", "foo2", "foo3")

would be equivalent to:
   writer:add("foo1", "foo2"); writer:add("foo3"); writer:addmsg(buffer);

And:
   local a, b, c = buffer:nextmsg()

would be equivalent to:
   reader:nextmsg(buffer)
   local a, b = reader:next(2)
   local c    = reader:next()


> Will it be easy to manage the situation where multiple writers are building a
> long message in the same buffer, simultaneously?

Yes of course, this is the whole point of it.

> ==== now for mtstates
>
> Let me double check to see if I understand correctly: one can use singleton to
> be sure to own a reference and be sure the state is not garbage collected.
> The thing to avoid is to call singleton twice with the same name from the same
> thread, right?

No, this does not cause any problems: the two calls would lead to two state
referencing Lua objects that reference the same state.

As you wrote:

  local s = mtstates.singleton("foo", setupFunction)

is nearly the same as:

  local ok, s = pcall(function() return mtstates.state("foo") end)
  if not ok then s = mtstates.newstate("foo", setupFunction) end

BUT the difference is that the second version can cause race conditions if two
different threads try to get the same state "foo" at the same time. So the main
purpose of mtstates.singleton() is to make this an atomic operation, such that
both threads always end up with the same underlying mtstate object (they
should, of course, both supply the same setupFunction code).

The problem with "owning" or "not owning" and reference cycles is just a side
issue. To avoid reference cycles I decided to have a strict concept of
ownership: the one who created the state via mtstates.newstate() owns the state.
All other threads that get the state using mtstates.state() get a weak
reference ("not owning"). This ensures that no reference cycles can be built in
normal usage. This strict ownership cannot be applied to mtstates.singleton(),
since it is not clear who owns the state there.


> new to Lua and I am not sure I fully grasp what a "state" is, for now I am
> thinking it is something that holds the global environment... but I am sure
> there is more - I am sorry if I am missing something important here)

In the Lua C API you can find the datatype "lua_State". This datatype is used
for the Lua main state (which also holds the global environment) but also for
coroutines (which refer to the same main state and share the global environment
with it). In mtstates a state object is a Lua main state, i.e. it holds its own
environment. In the typical Lua multithreading libraries (Lanes, luaproc,
llthreads), each native system thread contains a Lua main state, and the
threaded code runs in the thread's main state.
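A small sketch of that isolation, assuming the lua-mtstates calling convention (the setup function runs inside the new state and returns the callback invoked by state:call()):

```lua
local mtstates = require("mtstates")

-- The new state is a separate Lua main state with its own globals.
local s = mtstates.newstate("counter", function()
  count = 0                       -- global inside *this* state only
  return function(delta)          -- callback invoked via s:call(...)
    count = count + delta
    return count
  end
end)

s:call(1)        -- increments the counter inside the state
s:call(2)
print(count)     -- nil: the caller's global environment is untouched
```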

Best regards,
Oliver

Re: Why one would need mtstates/mtmsg?

Andrea-2


As the documentation says: "Sets the arguments together as one message into the
buffer. All other messages in this buffer are discarded."

Yes, you can simply invoke buffer:clear() or listener:clear() even if the buffer
or listener is aborted.

Thank you for your patience. It is evident I missed some key parts of the documentation. My mistake, I am sorry.

buffer is garbage collected the remaining messages of this buffer are still
delivered to the listener. This is already pushed to the github repo ;-)

Thank you very much! I downloaded the library again today!! 

 
  local s = mtstates.singleton("foo", setupFunction)
is nearly the same as:
  local ok, s = pcall(function() return mtstates.state("foo") end)
  if not ok then s = mtstates.newstate("foo", setupFunction) end

BUT: the difference is: the second version will cause race conditions if two
different threads are trying to get the same state "foo" at the same time. So
the main purpose of mtstates.singleton() is to make this an atomic operation

Ok. Now I've got it. Thank you very much for your patience. I am sorry it took me so long to grasp the concept.
    
    Andrea

--
Andrea Vitali





