Question regarding handling lua strings inside C API

raksoras lua
I understand that C code generally should not store the pointer to a
string returned by lua_tostring beyond the lifetime of the C function
call. (PIL says "The lua_tostring function returns a pointer to an
internal copy of the string. Lua ensures that this pointer is valid as
long as the corresponding value is in the stack. When a C function
returns, Lua clears its stack; therefore, as a rule, you should never
store pointers to Lua strings outside the function that got them.")

However, is this restriction strictly true if I can guarantee, from
my code flow, that the original string in Lua will remain reachable
(that is, it is not garbage collected) after my C function call
returns?

Here is my actual scenario: the Luaw HTTP server uses Lua coroutines to
handle multiple connections simultaneously. While generating a response,
a Lua coroutine calls a C function with a Lua string that the C function
is eventually supposed to write to a socket. Since the socket write
may block, the C function uses an async write API (libuv, to be
specific) to write to the socket. The function then returns
immediately, and the Lua coroutine that called it yields to
suspend itself. When the socket is ready for the actual write, libuv's
async API invokes a C callback that writes the string to the socket
and then resumes the suspended coroutine.

In the classical model I should make a copy of the string passed into
the first C function, since that function returns before the callback
has a chance to write the string to the socket. However, I want to
avoid the data copy (memcpy) for performance reasons if I can get away
with it. My rationale is that even though the original C function has
returned, the Lua coroutine that called it is suspended as soon as the
C function call returns. So the original string that was passed to the
C call is still referenceable from the Lua coroutine and hence should
not be garbage collected. So I should be able to just store the pointer
to the internal Lua string returned by lua_tostring() without making a
copy of it. When the write callback is called, it first calls a C
function (a kind of continuation of the first C call) that writes the
string to the socket and then resumes the Lua coroutine. This way I can
guarantee that the original Lua string outlives both C calls involved:
the original call and the callback invoked by libuv.

Is it safe not to copy a Lua string in the C function in this case?

Thanks,

- Susheel

Re: Question regarding handling lua strings inside C API

William Ahern
On Sat, Apr 04, 2015 at 04:55:25PM -0700, raksoras lua wrote:

> I understand that C code generally should not store the pointer to a
> string returned by lua_tostring beyond the lifetime of the C function
> call. (PIL says "The lua_tostring function returns a pointer to an
> internal copy of the string. Lua ensures that this pointer is valid as
> long as the corresponding value is in the stack. When a C function
> returns, Lua clears its stack; therefore, as a rule, you should never
> store pointers to Lua strings outside the function that got them.")
>
> However, is this restriction strictly true if I can guarantee, from
> my code flow, that the original string in Lua will remain reachable
> (that is, it is not garbage collected) after my C function call
> returns?

AFAIK, yes, as long as you can guarantee that the string is anchored.
 

> Here is my actual scenario: the Luaw HTTP server uses Lua coroutines to
> handle multiple connections simultaneously. While generating a response,
> a Lua coroutine calls a C function with a Lua string that the C function
> is eventually supposed to write to a socket. Since the socket write
> may block, the C function uses an async write API (libuv, to be
> specific) to write to the socket. The function then returns
> immediately, and the Lua coroutine that called it yields to
> suspend itself. When the socket is ready for the actual write, libuv's
> async API invokes a C callback that writes the string to the socket
> and then resumes the suspended coroutine.

> In the classical model I should make a copy of the string passed into
> the first C function, since that function returns before the callback
> has a chance to write the string to the socket. However, I want to
> avoid the data copy (memcpy) for performance reasons if I can get away
> with it. My rationale is that even though the original C function has
> returned, the Lua coroutine that called it is suspended as soon as the
> C function call returns. So the original string that was passed to the
> C call is still referenceable from the Lua coroutine and hence should
> not be garbage collected.

This is only true if the string is stored in or through a local variable.
But what if I do something like this:

        socket:write(myobject:createstring())

The string isn't stored in any intermediate variable. The stack frame is
cleared when you yield (see luaD_poscall in Lua 5.3). The GC can and very
well might collect the string because it's not anchored anywhere.

> So I should be able to just store the pointer to the internal Lua string
> returned by lua_tostring() without making a copy of it. When the write
> callback is called, it first calls a C function (a kind of continuation
> of the first C call) that writes the string to the socket and then
> resumes the Lua coroutine. This way I can guarantee that the original
> Lua string outlives both C calls involved: the original call and the
> callback invoked by libuv.
>
> Is it safe not to copy a Lua string in the C function in this case?

In order to make your optimization work, you need to be sure that the string
is stored in a local for the duration of the yield. The only way to
guarantee that is to add an extra call with a function written by yourself
that you know stores the string in a local variable. For example:

        function lib:write(str)
                return self:uvwrite(str)
        end

The problem is that Lua will turn this into a tail call and optimize away
our str local. So we have to do something like:

        function lib:write(str)
                local retval = self:uvwrite(str)
                return retval
        end

It's possible Lua might optimize away the str local here, too. Certainly
LuaJIT is likely to optimize it away. You might need to add some function
call or other crazy call after :uvwrite and before the return statement.

If you're using Lua 5.2 or 5.3 you can use an intermediate C function to
preserve the stack frame. C function code and its stack can't be optimized
away. Using the 5.3 API you can do roughly something like the following:

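        /* placeholder: number of results lib_uvwrite is expected to return */
        #define NRET 1

        static int lib_uvwrite(lua_State *L);
        static int lib_poswrite(lua_State *L, int status, lua_KContext ctx);

        static int lib_write(lua_State *L) {
                int nargs = lua_gettop(L);
                int i;

                /*
                 * push our real libuv write binding. this would be quicker
                 * if it were cached as an upvalue.
                 */
                lua_pushcfunction(L, &lib_uvwrite);

                /*
                 * copy arguments so originals remain alive on our stack
                 * frame for the duration of the call.
                 */
                for (i = 1; i <= nargs; i++)
                        lua_pushvalue(L, i);

                /*
                 * lua_callk returns void. if lib_uvwrite yields, Lua calls
                 * lib_poswrite on resume; if it returns normally, we call
                 * the continuation ourselves.
                 */
                lua_callk(L, nargs, NRET, 0, &lib_poswrite);
                return lib_poswrite(L, LUA_OK, 0);
        }

        /* return from our call to lib_uvwrite */
        static int lib_poswrite(lua_State *L, int status, lua_KContext ctx) {
                (void)ctx;
                assert(status == LUA_OK || status == LUA_YIELD);
                assert(lua_gettop(L) >= NRET);
                return NRET;
        }

        static int lib_uvwrite(lua_State *L) {
                /*
                 * here's your original binding that called into libuv.
                 */
        }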


Re: Question regarding handling lua strings inside C API

Alexey Melnichuk-2
Hello, William.

You wrote on 5 April 2015, 7:34:01:

--
Best regards,
 Alexey                          mailto:[hidden email]



Re: Question regarding handling lua strings inside C API

Alexey Melnichuk-2
In reply to this post by raksoras lua
Hello, raksoras.

> I understand that C code generally should not store the pointer to a
> string returned by lua_tostring beyond the lifetime of the C function
> call. (PIL says "The lua_tostring function returns a pointer to an
> internal copy of the string. Lua ensures that this pointer is valid as
> long as the corresponding value is in the stack. When a C function
> returns, Lua clears its stack; therefore, as a rule, you should never
> store pointers to Lua strings outside the function that got them.")

> However, is this restriction strictly true if I can guarantee, from
> my code flow, that the original string in Lua will remain reachable
> (that is, it is not garbage collected) after my C function call
> returns?

> Here is my actual scenario: the Luaw HTTP server uses Lua coroutines to
> handle multiple connections simultaneously. While generating a response,
> a Lua coroutine calls a C function with a Lua string that the C function
> is eventually supposed to write to a socket. Since the socket write
> may block, the C function uses an async write API (libuv, to be
> specific) to write to the socket. The function then returns
> immediately, and the Lua coroutine that called it yields to
> suspend itself. When the socket is ready for the actual write, libuv's
> async API invokes a C callback that writes the string to the socket
> and then resumes the suspended coroutine.

> Is it safe not to copy a Lua string in the C function in this case?

I really do not think this is safe. For example, in
`sok:write("HELLO " .. "WORLD")` the resulting string has no anchor.
In my binding I just create a Lua reference to the string and unref it
in the write callback [1]. The same thing works if you want to write an
array of strings: you do not copy the data but make a reference to the
array. This works with Lua >= 5.1. Also, with Lua 5.3, if you use a
userdata to store the write request, you can use the user value
associated with that request.


[1] https://github.com/moteus/lua-lluv/blob/master/src/lluv_stream.c#L395-L399


Re: Question regarding handling lua strings inside C API

Luiz Henrique de Figueiredo
In reply to this post by William Ahern
> > However, is this restriction strictly true if I can guarantee, from
> > my code flow, that the original string in Lua will remain reachable
> > (that is, it is not garbage collected) after my C function call
> > returns?
>
> AFAIK, yes, as long as you can guarantee that the string is anchored.

Strictly speaking, you cannot assume that the address of the internal
string does not change after garbage collection because Lua might use
a moving garbage collector. The standard implementation from lua.org
does not move data and there are no plans for changing this. Other
implementations of Lua may differ, but AFAIK they don't.

Re: Question regarding handling lua strings inside C API

raksoras lua
In reply to this post by raksoras lua
In my case all writes on the socket are guaranteed to go through the
following "funnel" function:

connMT.write = function(self, str, writeTimeout)
    local status, nwritten = writeInternal(self, Luaw.scheduler.tid(),
        str, writeTimeout or DEFAULT_WRITE_TIMEOUT)
    if status and (nwritten > 0) then
        -- there is something to write, yield for libuv callback
        status, nwritten = coroutine.yield(TS_BLOCKED_EVENT)
    end
    assert(status, nwritten)
    return nwritten
end


Since there is a coroutine.yield() after writeInternal(str), there
should be no tail call or LuaJIT optimization that could let the "str"
argument of the write() function be garbage collected, right?

If that is not guaranteed, I guess I can always anchor the string by
calling luaL_ref() in the writeInternal() C implementation, as suggested
by Alexey.

Thanks,

- Susheel

On Sun, Apr 5, 2015 at 4:00 AM,  <[hidden email]> wrote:

> Send lua-l mailing list submissions to
>         [hidden email]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/lua-l-lists.lua.org
>
> or, via email, send a message with subject or body 'help' to
>         [hidden email]
>
> You can reach the person managing the list at
>         [hidden email]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of lua-l digest..."
>
>
> Today's Topics:
>
>    1. Re: WARNING: glibc dlopen threading bug (Hisham)
>    2. Re: WARNING: glibc dlopen threading bug (Enrico Tassi)
>    3. Question regarding handling lua strings inside C API
>       (raksoras lua)
>    4. Re: Question regarding handling lua strings inside C API
>       (William Ahern)
>    5. Re: Question regarding handling lua strings inside C API
>       (Alexey Melnichuk)
>    6. Re: Question regarding handling lua strings inside C API
>       (Alexey Melnichuk)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 4 Apr 2015 16:13:56 -0300
> From: Hisham <[hidden email]>
> Subject: Re: WARNING: glibc dlopen threading bug
> To: Lua mailing list <[hidden email]>
> Message-ID:
>         <[hidden email]>
> Content-Type: text/plain; charset=UTF-8
>
> On 3 April 2015 at 19:31, Matthew Wild <[hidden email]> wrote:
>> On 3 April 2015 at 22:10, William Ahern <[hidden email]> wrote:
>>> Use libpthread.so.0 for Linux/glibc. See the getpth routine in the runlua
>>> script for the library name to load on other systems. Solaris, OS X, and
>>> Linux/musl have unified libc/libphread implementations.
>>>
>>> Long-term I think it would be better if Lua packagers linked the system
>>> interpreter with -lpthread. It's what Perl, Ruby, Python, and Node.js do.
>>
>> Which would also solve this age-old problem with gdb:
>> http://stackoverflow.com/q/2702628/15996
>
> Thanks for the heads up. Updated the GoboLinux recipe of Lua accordingly.
>
> -- Hisham
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sat, 4 Apr 2015 14:44:01 +0200
> From: Enrico Tassi <[hidden email]>
> Subject: Re: WARNING: glibc dlopen threading bug
> To: [hidden email]
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=us-ascii
>
> On Fri, Apr 03, 2015 at 02:10:12PM -0700, William Ahern wrote:
>> Long-term I think it would be better if Lua packagers linked the system
>> interpreter with -lpthread. It's what Perl, Ruby, Python, and Node.js do.
>
> Thanks for the heads up! Debian adds "-lpthread" for Hurd, since it is
> required, but does not add the flag on Linux (nor on FreeBSD).
>
> Too bad Jessie is almost released.  It will be for Jessie+1.
>
> Best,
> --
> Enrico Tassi
>
>
>
> ------------------------------
>
> Message: 3
> Date: Sat, 4 Apr 2015 16:55:25 -0700
> From: raksoras lua <[hidden email]>
> Subject: Question regarding handling lua strings inside C API
> To: [hidden email]
> Message-ID:
>         <CAO=[hidden email]>
> Content-Type: text/plain; charset=UTF-8
>
> I understand that C code generally should not store the pointer to a
> string returned by lua_tostring beyond the lifetime of the C function
> call.(PIL says "The lua_tostring function returns a pointer to an
> internal copy of the string. Lua ensures that this pointer is valid as
> long as the corresponding value is in the stack. When a C function
> returns, Lua clears its stack; therefore, as a rule, you should never
> store pointers to Lua strings outside the function that got them.")
>
> However, is this restriction strictly true if I can guarantee - from
> my code flow - that the original string in lua will continue to be in
> reachable (that is, it is not garbage collected) when my C function
> call returns?
>
> Here is my actual scenario: Luaw HTTP server uses Lua coroutines to
> handle multiple connections simultaneously. While generating response
> a Lua coroutine calls a C function with Lua string that the C function
> is eventually supposed to write to socket. But since the socket write
> may block, the C function actually uses async write API (libuv to be
> specific) to write to the socket. The function then returns
> immediately and the Lua coroutine that called the C function yields to
> suspend itself. When the socket is ready for the actual write, libuv's
> async API invokes a C callback that writes the string on the socket
> and then resumes the suspended coroutine.
>
> Now in classical model I should make a copy of the string passed into
> the first C function as the function will return before the callback
> has a chance to write the string onto the socket. However I want to
> avoid the data copy (memcpy) if I can get away with it for performance
> reasons. My rationale is even though the original C function has
> returned the Lua coroutine that called the C function gets suspended
> as soon as C function call returns. So the original string that was
> passed in the C call is still reference-able from the Lua coroutine
> and hence should not be garbage collected. So I should be able to just
> store the pointer to the internal Lua string returned by
> lua_tostring() without making a copy of it. When the write callback is
> called it first calls a C function - kind of a continuation of first C
> call - that writes the string onto the socket and then resumes Lua
> coroutine. This way I can guarantee that the original Lua string's
> lifetime is more than the two C calls - original call and then the
> callback invoked by libuv - involved.
>
> Is it safe to not make a copy of a lua string in C function in this case?
>
> Thanks,
>
> - Susheel
>
>
>
> ------------------------------
>
> Message: 4
> Date: Sat, 4 Apr 2015 20:34:01 -0700
> From: William Ahern <[hidden email]>
> Subject: Re: Question regarding handling lua strings inside C API
> To: Lua mailing list <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=us-ascii
>
> On Sat, Apr 04, 2015 at 04:55:25PM -0700, raksoras lua wrote:
>> I understand that C code generally should not store the pointer to a
>> string returned by lua_tostring beyond the lifetime of the C function
>> call.(PIL says "The lua_tostring function returns a pointer to an
>> internal copy of the string. Lua ensures that this pointer is valid as
>> long as the corresponding value is in the stack. When a C function
>> returns, Lua clears its stack; therefore, as a rule, you should never
>> store pointers to Lua strings outside the function that got them.")
>>
>> However, is this restriction strictly true if I can guarantee - from
>> my code flow - that the original string in lua will continue to be in
>> reachable (that is, it is not garbage collected) when my C function
>> call returns?
>
> AFAIK, yes, as long as you can guarantee that the string is anchored.
>
>> Here is my actual scenario: Luaw HTTP server uses Lua coroutines to
>> handle multiple connections simultaneously. While generating response
>> a Lua coroutine calls a C function with Lua string that the C function
>> is eventually supposed to write to socket. But since the socket write
>> may block, the C function actually uses async write API (libuv to be
>> specific) to write to the socket. The function then returns
>> immediately and the Lua coroutine that called the C function yields to
>> suspend itself. When the socket is ready for the actual write, libuv's
>> async API invokes a C callback that writes the string on the socket
>> and then resumes the suspended coroutine.
>
>> Now in classical model I should make a copy of the string passed into
>> the first C function as the function will return before the callback
>> has a chance to write the string onto the socket. However I want to
>> avoid the data copy (memcpy) if I can get away with it for performance
>> reasons. My rationale is even though the original C function has
>> returned the Lua coroutine that called the C function gets suspended
>> as soon as C function call returns. So the original string that was
>> passed in the C call is still reference-able from the Lua coroutine
>> and hence should not be garbage collected.
>
> This is only true if the string were stored in or through a local variable.
> But what if I do something like this:
>
>         socket:write(myobject:createstring())
>
> The string isn't stored in any intermediate variable. The stack frame is
> cleared when you yield (see luaD_poscall in Lua 5.3). The GC can and very
> well might collect the string because it's not anchored anywhere.
>
>> So I should be able to just store the pointer to the internal Lua string
>> returned by lua_tostring() without making a copy of it. When the write
>> callback is called it first calls a C function - kind of a continuation of
>> first C call - that writes the string onto the socket and then resumes Lua
>> coroutine. This way I can guarantee that the original Lua string's
>> lifetime is more than the two C calls - original call and then the
>> callback invoked by libuv - involved.
>>
>> Is it safe to not make a copy of a lua string in C function in this case?
>
> In order to make your optimization work, you need to be sure that the string
> is stored in a local for the duration of the yield. The only way to
> guarantee that is to add an extra call with a function written by yourself
> that you know stores the string in a local variable. For example:
>
>         function lib:write(str)
>                 return self:uvwrite(str)
>         end
>
> The problem is that Lua will turn this into a tail call and optimize away
> our str local. So we have to do something like:
>
>         function lib:write(str)
>                 local retval = self:uvwrite(str)
>                 return retval
>         end
>
> It's possible Lua might optimize away the str local here, too. Certainly
> LuaJIT is likely to optimize it away. You might need to add some function
> call or other crazy call after :uvwrite and before the return statement.
>
> If you're using Lua 5.2 or 5.3 you can use an intermediate C function to
> preserve the stack frame. C function code and its stack can't be optimized
> away. Using the 5.3 API you can do roughly something like the following:
>
>         static int lib_write(lua_State *L) {
>                 int nargs = lua_gettop(L);
>                 int i;
>
>                 /*
>                  * push our real libuv write binding. this would be quicker
>                  * if it were cached as an upvalue.
>                  */
>                 lua_pushcfunction(L, &lib_uvwrite);
>
>                 /*
>                  * copy arguments so originals remain alive on our stack
>                  * frame for the duration of the call.
>                  */
>                 for (i = 1; i <= nargs; i++)
>                         lua_pushvalue(L, i);
>
>                 return lua_callk(L, nargs, NRET, 0, &lib_poswrite);
>         }
>
>         /* return from our call to lib_uvwrite */
>         static int lib_poswrite(lua_State *L, int status, lua_KContext ctx) {
>                 assert(status == LUA_YIELD);
>                 assert(lua_gettop(L) >= NRET);
>                 return NRET;
>         }
>
>         static int lib_uvwrite(lua_State *L) {
>                 /*
>                  * here's your original binding that called into libuv.
>                  */
>         }
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Sun, 5 Apr 2015 10:54:34 +0400
> From: Alexey Melnichuk <[hidden email]>
> Subject: Re: Question regarding handling lua strings inside C API
> To: Lua mailing list <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=windows-1251
>
> Здравствуйте, William.
>
> Вы писали 5 апреля 2015 г., 7:34:01:
>
>> On Sat, Apr 04, 2015 at 04:55:25PM -0700, raksoras lua wrote:
>>> I understand that C code generally should not store the pointer to a
>>> string returned by lua_tostring beyond the lifetime of the C function
>>> call.(PIL says "The lua_tostring function returns a pointer to an
>>> internal copy of the string. Lua ensures that this pointer is valid as
>>> long as the corresponding value is in the stack. When a C function
>>> returns, Lua clears its stack; therefore, as a rule, you should never
>>> store pointers to Lua strings outside the function that got them.")
>>>
>>> However, is this restriction strictly true if I can guarantee - from
>>> my code flow - that the original string in lua will continue to be in
>>> reachable (that is, it is not garbage collected) when my C function
>>> call returns?
>
>> AFAIK, yes, as long as you can guarantee that the string is anchored.
>>
>>> Here is my actual scenario: Luaw HTTP server uses Lua coroutines to
>>> handle multiple connections simultaneously. While generating response
>>> a Lua coroutine calls a C function with Lua string that the C function
>>> is eventually supposed to write to socket. But since the socket write
>>> may block, the C function actually uses async write API (libuv to be
>>> specific) to write to the socket. The function then returns
>>> immediately and the Lua coroutine that called the C function yields to
>>> suspend itself. When the socket is ready for the actual write, libuv's
>>> async API invokes a C callback that writes the string on the socket
>>> and then resumes the suspended coroutine.
>>
>>> Now, in the classical model I should make a copy of the string passed
>>> into the first C function, because that function returns before the
>>> callback has a chance to write the string to the socket. However, for
>>> performance reasons I want to avoid the data copy (memcpy) if I can
>>> get away with it. My rationale is that, even though the original C
>>> function has returned, the Lua coroutine that called it is suspended
>>> as soon as the call returns. So the original string that was passed
>>> to the C call is still referenceable from the Lua coroutine and hence
>>> should not be garbage collected.
>
>> This is only true if the string is stored in, or reachable through, a
>> local variable.
>> But what if I do something like this:
>
>>         socket:write(myobject:createstring())
>
>> The string isn't stored in any intermediate variable. The stack frame is
>> cleared when you yield (see luaD_poscall in Lua 5.3). The GC can and very
>> well might collect the string because it's not anchored anywhere.
>
>>> So I should be able to just store the pointer to the internal Lua string
>>> returned by lua_tostring() without making a copy of it. When the write
>>> callback is called it first calls a C function - kind of a continuation of
>>> first C call - that writes the string onto the socket and then resumes Lua
>>> coroutine. This way I can guarantee that the original Lua string's
>>> lifetime is more than the two C calls - original call and then the
>>> callback invoked by libuv - involved.
>>>
>>> Is it safe to not make a copy of a lua string in C function in this case?
>
>> In order to make your optimization work, you need to be sure that the string
>> is stored in a local for the duration of the yield. The only way to
>> guarantee that is to add an extra call with a function written by yourself
>> that you know stores the string in a local variable. For example:
>
>>         function lib:write(str)
>>                 return self:uvwrite(str)
>>         end
>
>> The problem is that Lua will turn this into a tail call and optimize away
>> our str local. So we have to do something like:
>
>>         function lib:write(str)
>>                 local retval = self:uvwrite(str)
>>                 return retval
>>         end
>
>> It's possible Lua might optimize away the str local here, too. Certainly
>> LuaJIT is likely to optimize it away. You might need to add some function
>> call or other crazy call after :uvwrite and before the return statement.
>
>> If you're using Lua 5.2 or 5.3 you can use an intermediate C function to
>> preserve the stack frame. C function code and its stack can't be optimized
>> away. Using the 5.3 API you can do roughly something like the following:
>
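>>         /*
>>          * forward declarations. NRET stands for the number of results
>>          * lib_uvwrite leaves on the stack; its value is not given here.
>>          */
>>         static int lib_uvwrite(lua_State *L);
>>         static int lib_poswrite(lua_State *L, int status, lua_KContext ctx);
>
>>         static int lib_write(lua_State *L) {
>>                 int nargs = lua_gettop(L);
>>                 int i;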
>
>>                 /*
>>                  * push our real libuv write binding. this would be quicker
>>                  * if it were cached as an upvalue.
>>                  */
>>                 lua_pushcfunction(L, &lib_uvwrite);
>
>>                 /*
>>                  * copy arguments so originals remain alive on our stack
>>                  * frame for the duration of the call.
>>                  */
>>                 for (i = 1; i <= nargs; i++)
>>                         lua_pushvalue(L, i);
>
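>>                 lua_callk(L, nargs, NRET, 0, &lib_poswrite);
>
>>                 /*
>>                  * lua_callk returns void. we only get here if the call
>>                  * did not yield, so run the continuation by hand.
>>                  */
>>                 return lib_poswrite(L, LUA_OK, 0);
>>         }
>
>>         /* return from our call to lib_uvwrite */
>>         static int lib_poswrite(lua_State *L, int status, lua_KContext ctx) {
>>                 (void)ctx; /* unused */
>>                 assert(status == LUA_OK || status == LUA_YIELD);
>>                 assert(lua_gettop(L) >= NRET);
>>                 return NRET;
>>         }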
>
>>         static int lib_uvwrite(lua_State *L) {
>>                 /*
>>                  * here's your original binding that called into libuv.
>>                  */
>>         }
>
> --
> Best regards,
>  Alexey                          mailto:[hidden email]
>
> ------------------------------
>
> Message: 6
> Date: Sun, 5 Apr 2015 11:11:13 +0400
> From: Alexey Melnichuk <[hidden email]>
> Subject: Re: Question regarding handling lua strings inside C API
> To: Lua mailing list <[hidden email]>
> Message-ID: <[hidden email]>
> Content-Type: text/plain; charset=utf-8
>
> Hello, raksoras.
>
>> I understand that C code generally should not store the pointer to a
>> string returned by lua_tostring beyond the lifetime of the C function
>> call. (PIL says "The lua_tostring function returns a pointer to an
>> internal copy of the string. Lua ensures that this pointer is valid as
>> long as the corresponding value is in the stack. When a C function
>> returns, Lua clears its stack; therefore, as a rule, you should never
>> store pointers to Lua strings outside the function that got them.")
>
>> However, is this restriction strictly true if I can guarantee - from
>> my code flow - that the original string in Lua will continue to be
>> reachable (that is, it is not garbage collected) when my C function
>> call returns?
>
>> Here is my actual scenario: Luaw HTTP server uses Lua coroutines to
>> handle multiple connections simultaneously. While generating response
>> a Lua coroutine calls a C function with Lua string that the C function
>> is eventually supposed to write to socket. But since the socket write
>> may block, the C function actually uses async write API (libuv to be
>> specific) to write to the socket. The function then returns
>> immediately and the Lua coroutine that called the C function yields to
>> suspend itself. When the socket is ready for the actual write, libuv's
>> async API invokes a C callback that writes the string on the socket
>> and then resumes the suspended coroutine.
>
>> Is it safe to not make a copy of a lua string in C function in this case?
>
> I really do not think this is safe. Consider `sok:write("HELLO " ..
> "WORLD")`: the resulting string has no anchor. In my binding I simply
> create a Lua reference to the string and unref it in the write
> callback [1]. The same approach works if you want to write an array of
> strings: you do not copy the data, you just reference the array. This
> works with Lua >= 5.1. Also, on Lua 5.3, if you use a userdata to store
> the write request, you can keep the string in the uservalue associated
> with that request.
>
>
> [1] https://github.com/moteus/lua-lluv/blob/master/src/lluv_stream.c#L395-L399
>
> ------------------------------
>
> _______________________________________________
> lua-l mailing list
> [hidden email]
> http://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/lua-l-lists.lua.org
>
>
> End of lua-l Digest, Vol 57, Issue 5
> ************************************