problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume


subatomic
Hi, I have a small game engine that uses coroutines to run each game "actor" (NPC) as a Lua class which runs concurrently with many other game actors (think 200+ Lua threads)...

My problem is that my threads in 5.1 are being garbage collected, while in 5.0.3 they are not.
I am using luaL_ref to "save" the thread from being collected until I want it to be, then I use luaL_unref.
This doesn't seem to be working in 5.1, but worked ok in 5.0.3.
I was wondering if there's a new way to do this now in 5.1,
or if there is something new about the GC in 5.1 that I don't understand?

My code goes sort of like this:

t = lua_newthread( L );
thread_reference = luaL_ref( L, LUA_REGISTRYINDEX );
     
// then some luabind stuff to construct the new actor script
luabind::object class_constructor_function = luabind::globals(t)[classname];
mMyActor = luabind::call_function<Actor*>( class_constructor_function );
mActorObj = luabind::object( t, mMyActor );

// then push the function and resume the thread for the first time:
mActorObj.push(t);          
lua_pushstring(t, "__call");
lua_gettable(t, -2);  
lua_remove(t, -2);   
mActorObj.push(t);     
resume_status = lua_resume(t, 1);

// then garbage collect (we want performance in the game to be predictable as possible)
lua_gc(L,LUA_GCCOLLECT,0);

// then next frame, resume again
// (here it crashes because 't' has been deleted by lua_gc)
resume_status = lua_resume( t, 0 );


Here 't' has been deleted, I know because I can see 0xfeeefeee's appear inside 't', which happen when I step over the lua_gc statement...

This did not happen in 5.0.3.  But does happen in 5.1

I don't want to leave the (many) threads on the stack, in case I overflow the stack size (my code does work when leaving the thread on the stack, so i know everything else is ok...)...
Any ideas?
thanks!!


(want to play with my code?  see here: http://www.subatomicglue.com/download/luacoroutines.tgz   note: this version has luaL_ref commented out, so you'd need to put it back to play with it...)

--
http://www.subatomicglue.com
Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

subatomic
Thumbing through the 5.1 ref manual, I tried this instead of luaL_ref:

      lua_setfield(L, LUA_GLOBALSINDEX, "bok");

and it seems to work (my thread is not garbage collected).

I don't like the string lookup, and I also wonder about the best way to generate a unique string-per-thread (just convert the thread ptr to string??).  seems hacky, I really liked the luaL_ref way of doing this...   I'd still be very interested to hear  what people know about all this, and what way (other than the stack) should I ref my threads to save them from the perils of the GC.

thanks. :-)

--
Kevin Meinert
http://www.subatomicglue.com
Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

Rici Lake-2

On 31-Oct-06, at 3:56 AM, subatomic wrote:

> Thumbing through the 5.1 ref manual, I tried this instead of luaL_ref:
>
>       lua_setfield(L, LUA_GLOBALSINDEX, "bok");
>
> and it seems to work (my thread is not garbage collected).
>
> I don't like the string lookup, and I also wonder about the best way to generate a unique string-per-thread (just convert the thread ptr to string??).  seems hacky, I really liked the luaL_ref way of doing this...   I'd still be very interested to hear  what people know about all this, and what way (other than the stack) should I ref my threads to save them from the perils of the GC.

I took a glance at your code, but it's not clear to me what might be going wrong. I'd check to make sure that the thread is still in the registry where you put it, in case perhaps luabind is playing some games with the registry. (lua_rawgeti(L, LUA_REGISTRYINDEX, thread_reference) will push it onto the stack.)

A couple of notes, though.

First, I think you should be less nervous about just keeping your threads on the Lua stack. It's not the C stack, and its capacity is pretty well what you want it to be (it's just a malloc()'d array; use lua_checkstack or luaL_checkstack to expand it to an appropriate size). You could call lua_gettop(L) after you push the thread onto the stack in order to get the stack index, which would work like your thread_reference, and you can delete a single element from the stack with something like:

  lua_pushnil(L);
  lua_replace(L, index);

That will preserve stack indexes. You could also thread a free list through the deleted entries, luaL_ref() style, by pushing an integer instead of nil:

  lua_pushinteger(L, mFree);
  lua_replace(L, index);
  mFree = index;

  // To make a new thread:
  lua_newthread(L);
  if (mFree != 0) {
    int tmp = mFree;
    mFree = lua_tointeger(L, mFree);
    lua_replace(L, tmp);
  }
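
The free-list idea above can be tried out without Lua at all. What follows is only a plain-C model of it, not Lua API code: a fixed array stands in for the Lua stack, index 0 means "no free slot" (as mFree == 0 does above), and the names SlotTable, slot_acquire and slot_release are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_CAPACITY 8

/* A slot holds either a live handle or, when free, the index of the
   next free slot (0 terminates the list). */
typedef struct {
    int   in_use;
    void *handle;     /* stands in for the thread value */
    int   next_free;  /* meaningful only while the slot is free */
} Slot;

typedef struct {
    Slot slots[SLOT_CAPACITY + 1]; /* index 0 unused, like Lua stack indexes */
    int  top;                      /* highest index handed out so far */
    int  free_head;                /* plays the role of mFree */
} SlotTable;

void slot_table_init(SlotTable *t) {
    t->top = 0;
    t->free_head = 0;
}

/* Store a handle, reusing a released slot when one exists.
   Returns the slot index, or 0 if the table is full. */
int slot_acquire(SlotTable *t, void *handle) {
    int idx;
    assert(handle != NULL);
    if (t->free_head != 0) {
        idx = t->free_head;                     /* pop the free list */
        t->free_head = t->slots[idx].next_free;
    } else if (t->top < SLOT_CAPACITY) {
        idx = ++t->top;                         /* grow, like pushing */
    } else {
        return 0;
    }
    t->slots[idx].in_use = 1;
    t->slots[idx].handle = handle;
    t->slots[idx].next_free = 0;
    return idx;
}

/* Release a slot: other indexes stay valid, and the freed slot
   becomes the new head of the free list. */
void slot_release(SlotTable *t, int idx) {
    assert(idx >= 1 && idx <= t->top && t->slots[idx].in_use);
    t->slots[idx].in_use = 0;
    t->slots[idx].handle = NULL;
    t->slots[idx].next_free = t->free_head;
    t->free_head = idx;
}
```

Releasing slot 1 and then acquiring again hands slot 1 back, so the table never grows past the peak number of live threads.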

The other thing is that there is only one garbage collector. You cannot garbage collect threads independently. When you call:

  lua_gc(some_state, LUA_GCCOLLECT, 0);

it will garbage collect the entire Lua universe to which some_state belongs, using some_state to call finalizers. (It also will guarantee that some_state is not itself collected, in case there is no other reference to it; other than that, it doesn't make that much difference which state you use, although I'd tend to use the main state.)

The comment about having to collect actors before their thread is deleted suggests that something is wrong with your finalizer model, if you actually experienced the crash to which you refer.


Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

Leo Razoumov
In reply to this post by subatomic
Can you reproduce your problem without using luabind?
If integer keys in LUA_REGISTRYINDEX are accessed ONLY via references,
your threads should still be in the registry. If, on the other hand,
luabind or something else overwrites integer keys directly, all bets are
off. As a simple workaround, give each thread a unique string name and
use it as a key.

--Leo--

Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

subatomic
In reply to this post by Rici Lake-2
Thanks Rici,

I did get it working using lua_setfield (basically the same as lua_setglobal) as a line-by-line replacement for luaL_ref.   I use sprintf( name, "%x", threadptr ) to create a unique name for it.  This seems to create the reference count increment to keep the thread from being GC'd...   Really strange. 

Any ideas what changed between 5.0 and 5.1 to cause luaL_ref not to work?  Seems funny that luabind would cause the issue, I'm using the same version (cvshead) of luabind as I was using in 5.0.3...  So if luabind is somehow stomping the luaL_ref results now, it should have stomped them before right?

>> The other thing is that there is only one garbage collector.
makes sense, thanks for pointing it out.

>> The comment about having to collect actors before their thread is
>> deleted suggests that something is wrong with your finalizer model, if
>> you actually experienced the crash to which you refer.

Could you point me to some information about "finalizer models"? I'm not sure what to look for on this, and searching Google doesn't turn anything up. Are you referring to the way I close down the app? I do get a crash 50% of the time in lua_close, depending on which scripts are run. I can't narrow it to one script; it's always a combination of scripts that results in a crash inside lua_close, where the 'L' lua_State is 0xfeeefeee (garbage collected)... I haven't traced it yet (will tonight), but it seems like there is a luabind object on the stack still referencing one of the dead threads that I lua_removed, and which apparently was garbage collected soon after. I suspect I just have to find the luabind object and kill it before forcing the removal of its thread...


thanks

--
Kevin Meinert
http://www.subatomicglue.com
Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

Rici Lake-2

On 31-Oct-06, at 12:47 PM, subatomic wrote:

> Thanks Rici,
>
> I did get it working using lua_setfield (basically the same as lua_setglobal) as a line-by-line replacement for luaL_ref.   I use sprintf( name, "%x", threadptr ) to create a unique name for it.  This seems to create the reference count increment to keep the thread from being GC'd...   Really strange.

Actually, Lua uses a tracing garbage collector, not reference counts.

You'd probably find it faster to use a lightuserdata as the key:

lua_State *t = lua_newthread(L);
lua_pushlightuserdata(L, t);
lua_insert(L, -2); /* flip the top two stack elements */
lua_rawset(L, -3);

But keeping them all on the main stack strikes me as a better solution.

Alternatively, you could put your own reference table on the stack at a known slot (perhaps 1), and then use luaL_ref with that table instead of the registry.


> Any ideas what changed between 5.0 and 5.1 to cause luaL_ref not to work?  Seems funny that luabind would cause the issue, I'm using the same version (cvshead) of luabind as I was using in 5.0.3...  So if luabind is somehow stomping the luaL_ref results now, it should have stomped them before right?

I think the indexes changed, I'm not sure. If luabind is reproducing the behaviour of luaL_ref instead of just using the interface (for "efficiency"), it's possible that it is not getting it right. I don't know much about luabind.

>> The comment about having to collect actors before their thread is
>> deleted suggests that something is wrong with your finalizer model, if
>> you actually experienced the crash to which you refer.

> Could you point me to some information about "finalizer models" ?   I'm not sure what to look for on this, and searching google doesn't turn anything up.

Sorry, that was a bit telegraphic.

The problem with the luaL_ref model is that objects don't have references to other objects. They have indexes into the ref registry, which are opaque to the garbage collector. Consequently, the garbage collector will be more conservative than necessary; it often won't be able to collect cyclic pseudo-references.

The issue with finalizers is that they have to be run in the right order; if object A references object B, then B cannot be finalized while A is live. If B also references A then there is a cycle; that's not a problem for the garbage collector (except as noted above, when the references are opaque), but it is a problem for running the finalizer. It's not obviously safe to run the finalizers in either order.

In practice, it is probably the case that some order will work; even though A holds a reference to B, for example, its finalizer may not actually use that reference. There is no way the garbage collector can know this, though, and the fact that the references are not real makes the situation worse. Often the finalizers are not run at all until you close the lua state; at that point, Lua will run all the finalizers even if the objects appear to be live. It runs them in reverse order of creation (newest object first). That can cause a crash if the newer object's finalizer expects the older object to still be around, and that's what I meant by your "finalizer model".

Since all of that is probably hidden inside luabind, I don't know what to suggest to fix it, though.

Lua 5.1 userdata have an environment table which can be used to hold real references to dependent objects; I've found that works a lot better than the luaL_ref model, in many cases.

Many people use complicated schemes involving weak pointers, etc., to work around the finalizer issue; it's generally not pretty. However, in a lot of real applications, the problem can be avoided by removing the dependencies. For example, if an object A includes a FILE* which needs to be closed when the object is garbage collected, instead of using a finalizer attached to A itself, it is possible (with Lua 5.1 environment tables) to create a dependent object A' which does not refer to anything, but which includes the FILE*. If A holds the only reference to A', then A' will be garbage collectable whenever A is, but A' can be finalized without respect to ordering considerations. This strategy can at least make finalization dependencies clearer, and that is often enough to solve finalizer cycles.
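
To make the A / A' split above concrete, here is a rough sketch in plain C rather than Lua API code. It only models the ownership shape: FileHolder plays the role of A' (in Lua 5.1 it would be a userdata with a __gc metamethod, referenced from A's environment table), and Actor plays the role of A. All type and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* A': owns the FILE* and refers to nothing else, so its finalizer
   can run at any point relative to other objects. */
typedef struct {
    FILE *fp;
} FileHolder;

/* A: may point at other objects (even cyclically) but has no
   finalizer of its own; cleanup lives entirely in the holder. */
typedef struct Actor {
    struct Actor *peer;
    FileHolder   *holder;
} Actor;

/* Allocate a holder that takes ownership of fp. Returns NULL on
   allocation failure. */
FileHolder *file_holder_new(FILE *fp) {
    FileHolder *h = malloc(sizeof *h);
    if (h != NULL) {
        h->fp = fp;
    }
    return h;
}

/* The finalizer touches only the holder's own field and is safe to
   call more than once. Returns 1 if it closed a file, 0 otherwise. */
int file_holder_finalize(FileHolder *h) {
    if (h == NULL || h->fp == NULL) {
        return 0;
    }
    fclose(h->fp);
    h->fp = NULL;
    return 1;
}
```

Because file_holder_finalize never follows a pointer to another object, it does not matter whether it runs before or after anything that happens to reference the Actor, which is the property the dependent-object approach is after.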


Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

subatomic
>> lua_State *t = lua_newthread(L);
>> lua_pushlightuserdata(L, t);
>> lua_insert(L, -2); /* flip the top two stack elements */
>> lua_rawset(L, -3);

So I was reading this, and was wondering what the -3 is referring to?
here's my stack, as you can see I have no -3 position
-- lua_newthread-----------------
thread
--------------------
-- lua_pushlightuserdata-----------------
thread
userdata
--------------------
-- lua_insert-----------------
userdata
thread
--------------------
then we crash with rawset of course, since there is no -3...


docs say this about settable (rawset docs says to look at settable):
void lua_settable (lua_State *L, int index);

Does the equivalent to t[k] = v, where t is the value at the given valid index index, v is the value at the top of the stack, and k is the value just below the top.

This function pops both the key and the value from the stack.

so here, if I'm reading correctly, v == thread, t == ??, and k == userdata... 
seems like I need some kind of table pushed into address -3 in the stack.

being the lazy guy I am (i.e. not creating my own table):...
looks like maybe I can do this (instead of -3):
   lua_settable(L, LUA_GLOBALSINDEX);


so... testing in my code, it appears to work (yes, like a charm),
here's what I did, so other people can find it in the search engines...:

// create thread (lua_newthread leaves the new thread on top of L's stack):
t = lua_newthread( L );

// save off the thread, to prevent it from being GC'd:
// globals[ lightuserdata(t) ] = t
lua_pushlightuserdata(L, t);        //< stack is now: thread, userdata
lua_insert(L, -2);                  //< swap them, so the thread (the value) is on top
lua_settable(L, LUA_GLOBALSINDEX);  //< globals[ userdata ] = thread; pops both


// later: clear out the thread we stored in globals, which allows it to be GC'd:
// globals[ lightuserdata(t) ] = nil
lua_pushlightuserdata(L, t);
lua_pushnil( L );
lua_settable(L, LUA_GLOBALSINDEX);



thanks, this looks like a good option for keeping the thread off the stack,
- kevin

--
Kevin Meinert
http://www.subatomicglue.com
Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

Rici Lake-2

On 31-Oct-06, at 11:28 PM, subatomic wrote:

>> lua_State *t = lua_newthread(L);
>> lua_pushlightuserdata(L, t);
>> lua_insert(L, -2); /* flip the top two stack elements */
>> lua_rawset(L, -3);

> So I was reading this, and was wondering what the -3 is referring to?

Oops. I was thinking of having your own registry table.

s/-3/LUA_REGISTRYINDEX/



Re: problems upgrading from 5.0.3 - 5.1 with lua_newthread luaL_ref lua_gc and lua_resume

subatomic
In reply to this post by subatomic

>> though I can't narrow it to one script, it's always a combination of scripts resulting in a
>> crash inside lua_close where the 'L' lua_State is 0xfeeefeee (garbage collected)...   I
>> haven't traced it yet (will tonight) but it seems like there is a luabind object on the stack
>> still referencing one of the dead threads that I lua_removed, and which apparently was
>> soon after garbage collected. 

I found this problem, so am posting here in case anyone else has it as well...
The problem was that sometimes the app would crash on lua_close, inside of some luabind stuff referencing a lua_State (the thread ptr returned by lua_newthread) that was garbage collected already.

My problem was that I was calling luabind::call_function and luabind::globals with my thread state (t), instead of with the interpreter's state (L)

// breaks on lua_close (uses the thread state t):
t = lua_newthread( L );
luabind::object class_constructor_function = luabind::globals(t)[classname];
mActorPtr = luabind::call_function<Actor*>( class_constructor_function );
mActorObj = luabind::object( t, mActorPtr );

// runs without crashes
t = lua_newthread( L );
luabind::object class_constructor_function = luabind::globals(L)[classname];
mActorPtr = luabind::call_function<Actor*>( class_constructor_function );
mActorObj = luabind::object( L, mActorPtr );


sorry for the luabind post here...  though I suppose the _idea_ applies to functions in lua as well, i.e. don't use the thread to access globals, and don't store the thread in with objects that will be GC'd after you GC your thread...  since during deletion those objects could access the invalid memory inside the already deleted thread at that point...

just wanted to tie up my original post.. all problems have been solved, thanks everyone.

--
Kevin Meinert
http://www.subatomicglue.com