LuaLanes on Ubuntu 11.10, kernel 3.0

classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Benoit Germain-2
2011/11/10 Philipp Janda <[hidden email]>:
>
> A proper solution would probably involve using pthread_cleanup_push to free
> the Lane's memory on actual cancellation.

I am not sure I get the usage of pthread_cancel_push right. What I
could do is this:
Add a pthread_cancel_push()/pthread_cancel_pop() pair at the beginning
and end of lane_main.
The callback I provide sets some flag in the s_lane structure.
In selfdestruct_atexit(), after the call to THREAD_KILL(), I
block-wait for this flag to be set in the structure, at which point I
can free it.
Would it be ok that way? When the thread actually terminates because
of the thread_kill() call, am I supposed to call pthread_cancel_pop()
(which I suppose might not be the case if execution doesn't reach the
end of lane_main)?


--
Benoit.

Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Philipp Janda
On 10.11.2011 12:04, Benoit Germain wrote:
> 2011/11/10 Philipp Janda<[hidden email]>:
>>
>> A proper solution would probably involve using pthread_cleanup_push to free
>> the Lane's memory on actual cancellation.
>
> I am not sure I get the usage of pthread_cancel_push right. What I
> could do is this:
> Add a pthread_cancel_push()/pthread_cancel_pop() pair at the beginning
> and end of lane_main.

So far so good.

> The callback I provide sets some flag in the s_lane structure.
> In selfdestruct_atexit(), after the call to THREAD_KILL(), I
> block-wait for this flag to be set in the structure, at which point I
> can free it.

That would probably work, but an easier way would be:
static void* lane_main( void* vs ) {
   pthread_cleanup_push( free, vs ); // handles cancels and exits

   // code ...

   pthread_cleanup_pop( TRUE ); // handles reaching end of lane_main
   return 0;
}

That would pop and call 'free( vs )' automatically when/if the thread is
cancelled or calls pthread_exit. 'pthread_cleanup_pop( TRUE )' pops and
calls 'free( vs )' manually if the execution reaches the end of lane_main.
You could also replace 'free' with a function that additionally removes
its argument from the selfdestruct chain. Those functions could also be
used to release locks if you are cancelled while holding one, etc.

> Would it be ok that way? When the thread actually terminates because
> of the thread_kill() call, am I supposed to call pthread_cancel_pop()
> (which I suppose might not be the case if execution doesn't reach the
> end of lane_main)?

No, pthread_cleanup_(push|pop) only work for the calling thread and the
pop happens automatically if the thread is cancelled.

Btw. I have some leads about the SIGSEGV in tests/atexit.lua, but I
don't know how to proceed from here:
[Thread debugging using libthread_db enabled]
Core was generated by `/home/siffiejoe/downloads/lua-5.1.4/src/lua
tests/atexit.lua'.
Program terminated with signal 11, Segmentation fault.
#0  0x4067bb19 in ?? () from /lib/i386-linux-gnu/libgcc_s.so.1
(gdb) bt
#0  0x4067bb19 in ?? () from /lib/i386-linux-gnu/libgcc_s.so.1
#1  0x4067c078 in ?? () from /lib/i386-linux-gnu/libgcc_s.so.1
#2  0x4067c528 in _Unwind_Resume () from /lib/i386-linux-gnu/libgcc_s.so.1
#3  0x400d53f2 in _Unwind_Resume (exc=0x402fddc0) at
../sysdeps/gnu/unwind-resume.c:52
#4  0x4011ddeb in _IO_acquire_lock_fct (p=<optimized out>) at libioP.h:985
#5  fputc (c=10, fp=0x40234500) at fputc.c:38
#6  0x0806882b in luaB_print (L=0x40302f80) at lbaselib.c:48
#7  0x08051232 in luaD_precall (L=0x40302f80, func=0x40301990,
nresults=0) at ldo.c:319
#8  0x0805d7c0 in luaV_execute (L=0x40302f80, nexeccalls=1) at lvm.c:587
#9  0x0805148a in luaD_call (L=0x40302f80, func=0x40301978, nResults=-1)
at ldo.c:377
#10 0x0804e3ca in f_call (L=0x40302f80, ud=0x402fd33c) at lapi.c:800
#11 0x08050673 in luaD_rawrunprotected (L=0x40302f80, f=0x804e3a0
<f_call>, ud=0x402fd33c) at ldo.c:116
#12 0x080517f0 in luaD_pcall (L=0x40302f80, func=0x804e3a0 <f_call>,
u=0x402fd33c, old_top=24, ef=12) at ldo.c:463
#13 0x0804e463 in lua_pcall (L=0x40302f80, nargs=0, nresults=-1,
errfunc=1) at lapi.c:821
#14 0x40027a11 in ?? ()
#15 0x40302f80 in ?? ()
#16 0x400a7d31 in start_thread (arg=0x402fdb70) at pthread_create.c:304
#17 0x4018e0ce in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
Backtrace stopped: Not enough registers or memory available to unwind
further

The crash happens in one of the print()s. If I comment them out the
crash goes away. I think one cancelled thread leaves stdio in an invalid
state.

Philipp


Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Benoit Germain-2
2011/11/10 Philipp Janda <[hidden email]>:

> On 10.11.2011 12:04, Benoit Germain wrote:
>
> That would probably work, but an easier way would be:
> static void* lane_main( void* vs ) {
>  pthread_cleanup_push( free, vs ); // handles cancels and exits
>
>  // code ...
>
>  pthread_cleanup_pop( TRUE ); // handles reaching end of lane_main
>  return 0;
> }
>
> That would pop and call 'free( vs )' automatically when/if the thread is
> cancelled or calls pthread_exit. 'pthread_cleanup_pop( TRUE )' pops and
> calls 'free( vs )' manually if the execution reaches the end of lane_main.

We don't want to do this (freeing the structure as soon as the thread
terminates). We have to do this only after the lane userdata is GC'ed,
else we can't be sure another Lua state won't want to retrieve the
return values of the lanes's function. THREAD_KILL at main state's GC
is a last attempt to do things right (mainly freeing the lane
structure, which involves stopping the execution of the associated
thread).

> The crash happens in one of the print()s. If I comment them out the crash goes away. I think one cancelled thread leaves stdio in an invalid state.

I'd say: let's fix the first bug, just in case this one is a side effect.


--
Benoit.

Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Benoit Germain-2
In reply to this post by Philipp Janda
2011/11/10 Philipp Janda <[hidden email]>:

> On 09.11.2011 17:31, Benoit Germain wrote:
>>> The glibc memory corruption is gone, but now I get an assertion failure
>>> on
>>> 'make test' (not always on the first test, but I don't get far):
>>> lua5.1: lanes.c:1018: selfdestruct_remove: Assertion `found' failed.
>>> Aborted (core dumped)
>
> A proper solution would probably involve using pthread_cleanup_push to free
> the Lane's memory on actual cancellation.
>
> Now I get a SIGSEGV in tests/atexit.lua :-)
>


Just wondering. I changed the desinit sequence a bit. Before,
selfdestruct_atexit was an atexit() handler. Now it is called when the
main application's Lua state is closed, causing garbage collection of
a full userdata that triggers its __gc metamethod.
I changed to the new behavior because I observed when debugging the
win32 build that the threads were no longer registered in the debugger
when tracing inside the atexit sequence.

I haven't had reports of this crash with the old method. So maybe the
threads no longer run when atexit handlers are called with the pthread
build too?

In that case, maybe moving again the s_lane structures liberation
still present in the selfdestruct chain in an atexit handler would fix
the crash?


--
Benoit.

Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Benoit Germain-2
2011/11/14 Benoit Germain <[hidden email]>:
>
> In that case, maybe moving again the s_lane structures liberation
> still present in the selfdestruct chain in an atexit handler would fix
> the crash?


Another option: add a THREAD_WAIT call after the THREAD_KILL, like
what is done in thread_gc(). This is supposed to ensure the thread is
not running anymore.


--
Benoit.

Reply | Threaded
Open this post in threaded view
|

Re: LuaLanes on Ubuntu 11.10, kernel 3.0

Benoit Germain-2
2011/11/14 Benoit Germain <[hidden email]>:
>
> Another option: add a THREAD_WAIT call after the THREAD_KILL, like
> what is done in thread_gc(). This is supposed to ensure the thread is
> not running anymore.
>

I've just submitted this change. Can you get git HEAD and tell me if
it changes anything?


--
Benoit.

12