Issue with my patch to add ARM-uclibc support to LuaCoco

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Issue with my patch to add ARM-uclibc support to LuaCoco

Zachary P. Landau-4
Hello,

I recently created a patch to LuaCoco to add ARM support under uClibc.
I'm having a problem and I was hoping someone might be able to help
get me on the right track.

With my patch, all of the tests in the cotest directory pass.  I then
tried using the Coco version of Lua with a larger codebase I am
working on.  Most of the yield/resumes work without any issues.  But
when I get to a particular yield, I get an error about crossing a
metamethod/C-call boundary.  This yield should have been in a coco
coroutine, but it doesn't think it is.

I looked at every case where I call coroutine.create and
coroutine.wrap, and in none of those cases do I pass -1 for a
cstacksize, so I would expect every coroutine to be a coco one.  Also,
right before I call the offending yield I print the value of
coroutine.cstacksize() and it gives me 61440, which I would expect.
So at this point it thinks it is a coco coroutine as well.  But when
the code gets to luai_userstateyield() in lua_yield(), the LHASCOCO
macro returns false and so we try to do a non-coco yield instead.

Most of the calls to LHASCOCO do pass as I would expect, and I can't
find anything special about this particular coroutine.  Currently I
don't have a small test to exploit this issue, given the nature of it.
But I can try to create one.  I did make sure to recompile all of the
Lua C modules I am using, although I'm not sure if that would have
mattered.

Here is the code I used for adding ARM/uclibc support.  Perhaps the
issue lies in there.  Specifically, I am not sure why COCO_STACKADJUST
is needed in some cases and if I need it.

#define COCO_MAIN_PARAM     int _a, int _b, int _c, int _d, lua_State *L

#define COCO_PATCHCTX(coco, buf, func, stack, a0) \
 buf->__jmpbuf[21] = (int)(func); \
 buf->__jmpbuf[20] = (int)(stack); \
 buf->__jmpbuf[19] = (int)0; /* ebp */ \
 stack[0] = (size_t)(a0);

Any debugging suggestions would be appreciated.  Thanks.

--
Zachary P. Landau <[hidden email]>

Reply | Threaded
Open this post in threaded view
|

Re: Issue with my patch to add ARM-uclibc support to LuaCoco

Mike Pall-72
Hi,

Zachary P. Landau wrote:
> I recently created a patch to LuaCoco to add ARM support under uClibc.
> I'm having a problem and I was hoping someone might be able to help
> get me on the right track.

Oh, nice! Be sure to send it to me when it's finished. I'll
probably release an upgrade to Coco soon, so it patches cleanly
against Lua 5.1.2 (otherwise no functional changes).

> I looked at every case where I call coroutine.create and
> coroutine.wrap, and in none of those cases do I pass -1 for a
> cstacksize, so I would expect every coroutine to be a coco one.

Do you create any coros from C? You need to use lua_newcthread()
and not lua_newthread().

> Also, right before I call the offending yield I print the value
> of coroutine.cstacksize() and it gives me 61440, which I would
> expect.

Err, that function returns the _default_ stacksize for newly
created coros (and not any info about a particular coro).

> So at this point it thinks it is a coco coroutine as well.  But when
> the code gets to luai_userstateyield() in lua_yield(), the LHASCOCO
> macro returns false and so we try to do a non-coco yield instead.

Well, that part of the code is a one-liner. So I don't think it's
the culprit. Either the coroutine does not have a C stack to
begin with (see above) or the particular field holding the status
has been corrupted (trace modifications to it with a debugger).

> Here is the code I used for adding ARM/uclibc support.  Perhaps the
> issue lies in there.  Specifically, I am not sure why COCO_STACKADJUST
> is needed in some cases and if I need it.

Have a look at the COCO_FILL() macro. Basically you want the
stack pointer to start at the right slot below the aligned end of
the stack. This depends on the way the setjmp() function is
implemented. E.g. note the subtle differences between the Linux
and the BSD implementations of setjmp() on x86. You also need to
keep clear of any save-areas which are located above the stack
pointer and may be used by the callee (usually only if compiled
_without_ optimization -- so, test that, too).

I guess the only way to find out is to carefully read the
relevant ABI, the assembler code for setjmp() and to step through
some example function epilogues. Ensure that the stack pointer is
properly aligned when calling successive C functions, too.

Note that it may be easier (and more portable across library
versions) to add some inline assembler code than patching
setjmp() buffers. See the x86 and the MIPS entries.

Bye,
     Mike

Reply | Threaded
Open this post in threaded view
|

Re: Issue with my patch to add ARM-uclibc support to LuaCoco

Zachary P. Landau-4
Oh, nice! Be sure to send it to me when it's finished. I'll
probably release an upgrade to Coco soon, so it patches cleanly
against Lua 5.1.2 (otherwise no functional changes).

No problem.  I'll send it along once I work this issue out.

Do you create any coros from C? You need to use lua_newcthread()
and not lua_newthread().

No, I don't.  I even verified this within the code, to report to me if
anyone other than coco itself called newthread, and nobody does.

Err, that function returns the _default_ stacksize for newly
created coros (and not any info about a particular coro).

Whoops, so it does.

> So at this point it thinks it is a coco coroutine as well.  But when
> the code gets to luai_userstateyield() in lua_yield(), the LHASCOCO
> macro returns false and so we try to do a non-coco yield instead.

Well, that part of the code is a one-liner. So I don't think it's
the culprit. Either the coroutine does not have a C stack to
begin with (see above) or the particular field holding the status
has been corrupted (trace modifications to it with a debugger).

[snip]

Note that it may be easier (and more portable across library
versions) to add some inline assembler code than patching
setjmp() buffers. See the x86 and the MIPS entries.

I took your advice on this and wrote a version using inline assembler
instead.  Once I got that working, the same issue showed up.  That is,
at some point L at the end of the stack comes back as 0.  The fact
that both my setjmp and assembler versions both show the same error
probably tells me something significant.  I'm just too burned out
today to process what it is.  I'll get back on that tomorrow.

Some possibly interesting helpful I learned while debugging:

* I verified that I am always creating a C stack, and that L and
coco_State are always setup properly at that point
* Up until the point where L points to 0, L is always the correct
value.  That is, there doesn't seem to be any sort of corruption of
the L value (at least not until the end)
* The problem always shows up during the same two calls to yield: the
calls in copas.lua in the send function.  I'm not sure if/how these
are different.  I am going to try to get together a demo and see if I
can get that to fail in the same way.

Looking through the generated luaCOCO_resume and luaCOCO_yield code, I
don't see any obvious things that would cause corruption.  The first
time I notice L pointing to 0 is when we jump back to the
luaCOCO_resume function.  But jumping back to luaCOCO_resume USUALLY
works, just not in this weird case.

Anyway, I spent most of my time just getting the assembler
implementation done, so I haven't yet spent a whole lot of time
debugging it.  I'm mostly emailed you in case my recent finds help you
spot an obvious problem.

Thanks for your suggestions, they've been helpful.

--
Zachary P. Landau <[hidden email]>

Reply | Threaded
Open this post in threaded view
|

Re: Issue with my patch to add ARM-uclibc support to LuaCoco

Zachary P. Landau-4
* The problem always shows up during the same two calls to yield: the
calls in copas.lua in the send function.  I'm not sure if/how these
are different.  I am going to try to get together a demo and see if I
can get that to fail in the same way.

Guh, I think I found the issue, and I don't think it is a Coco problem
at all.  I checked all new coroutine threads to make sure they were
always created with a stack.  But I forgot about the main thread of
execution.  It seems like my copy of copas was calling
coroutine.yield() from the main thread instead of in a coroutine.  I'm
not sure why, but I'll look into that issue separately.

So I think my original setjmp implementation is fine.  The
implementation I posted earlier hasn't changed, but I can make a
proper patch of it for you.

As for my assembler implementation, it isn't quite there yet.  When
removing one of my debugging printfs I started getting a crash, so
there is some flaw somewhere.  I'll continue tracking it down when I
get a chance.  The assembler version will be more limited anyway,
depending on which ARM processor you are using, so I think it's a good
idea to have both implementations available.

--
Zachary P. Landau <[hidden email]>