Making a modifiable copy of a string in C

Marc Balmer
(sorry if I spam the list a bit with my questions)

My goal:  Create a modifiable copy of a string in a C module, so that the copy of the string can be used with the strtok_r() function (which modifies the string passed as first argument).

So far I came up with two different ways:

First variant:

        const char *s;
        char *u;

        s = luaL_checklstring(L, 2, &len);
        u = lua_newuserdata(L, len);
        memcpy(u, s, len);

Here I use 'u' as the first argument to the strtok_r() call.

Second variant:

        char *s;

        s = (char *)lua_pushfstring(L, "%s,", luaL_checkstring(L, 2));

Note that I add a comma in the format string so that the new string differs from the original, which apparently causes Lua to create a new string rather than reuse the original string data.  The comma is part of the delimiter argument to strtok_r(), so it does not affect my code.

Are both variants valid?

FWIW, I am using strtok_r() to tokenize a string passed to a function, and then use luaL_checkoption() to get the value for each token, OR-ing them together into a flag value. The following C code

p = dlopen("whatever", RTLD_LAZY | RTLD_GLOBAL);

becomes the following Lua code

local p = dlopen('whatever', 'lazy, global')
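For illustration, the tokenizing step on its own might look roughly like the sketch below. It is plain C without the Lua API: FLAG_LAZY, FLAG_GLOBAL and parse_flags() are made-up names standing in for the real RTLD_* constants and the module code. In a Lua C function, a userdata buffer as in the first variant would take the place of malloc(), so that the memory is reclaimed even if an error is raised.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag values, stand-ins for RTLD_LAZY and RTLD_GLOBAL. */
enum { FLAG_LAZY = 1 << 0, FLAG_GLOBAL = 1 << 1 };

/*
 * Parse a list such as "lazy, global" into a flag word.
 * Returns -1 on an unknown name or on allocation failure.
 */
int parse_flags(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);   /* + 1 for the terminating NUL */
    char *tok, *save;
    int flags = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, s, len + 1);

    /* strtok_r() writes NUL bytes into 'copy'; 's' stays untouched. */
    for (tok = strtok_r(copy, ", ", &save); tok != NULL;
         tok = strtok_r(NULL, ", ", &save)) {
        if (strcmp(tok, "lazy") == 0)
            flags |= FLAG_LAZY;
        else if (strcmp(tok, "global") == 0)
            flags |= FLAG_GLOBAL;
        else {
            flags = -1;             /* unknown flag name */
            break;
        }
    }
    free(copy);
    return flags;
}
```

With these assumed names, parse_flags("lazy, global") yields FLAG_LAZY | FLAG_GLOBAL, and an empty string yields 0.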



Re: Making a modifiable copy of a string in C

Viacheslav Usov
On Sun, Apr 7, 2019 at 1:05 PM Marc Balmer <[hidden email]> wrote:

> Are both variants valid?

Userdata is the only option here, because Lua strings are immutable, and this should not be violated by clever C code.

Frankly, the whole approach, which would then involve creating more Lua strings and calling luaL_checkoption on them, seems heavy-footed to me.

If you need this to pass optional flags, I would instead consider passing them as multiple arguments, so that the Lua code could do this:

dlopen('whatever') -- no flags
dlopen('whatever', 'lazy')
dlopen('whatever', 'lazy', 'global')

This leaves the entire question of parsing to Lua's parser, the user does not have to learn your syntax, and all you need to do is something like this:

int flags = 0, top = lua_gettop(L);
for (int i = 2; i <= top; ++i)
    flags |= (1 << luaL_checkoption(L, i, 0, list));

Cheers,
V.

Re: Making a modifiable copy of a string in C

Marc Balmer


On 07.04.2019 at 13:48, Viacheslav Usov <[hidden email]> wrote:

On Sun, Apr 7, 2019 at 1:05 PM Marc Balmer <[hidden email]> wrote:

> Are both variants valid?

> Userdata is the only option here, because Lua strings are immutable, and this should not be violated by clever C code.

I somewhat agree.  But the string copy is only used within my C function, so it does not really matter if it is modified, since Lua code never sees or uses this string.  Hence my question: does this have any "technical" side effects or bad effects?


> Frankly, the whole approach, which would then involve creating more Lua strings and calling luaL_checkoption on them, seems heavy-footed to me.

I am not pushing the strings and using luaL_checkoption, but rather a checkoption function that takes the name as const char * argument instead of getting it from the Lua stack.
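A name-based lookup of this kind could be as small as the following sketch. find_option() and dl_options are hypothetical names, not the actual module code, and unlike luaL_checkoption() this version returns -1 for an unknown name instead of raising a Lua error.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option list; the index doubles as the bit position. */
static const char *const dl_options[] = { "lazy", "global", NULL };

/*
 * Return the index of 'name' in the NULL-terminated array 'list',
 * or -1 if the name is not in the list.
 */
int find_option(const char *name, const char *const list[])
{
    int i;

    for (i = 0; list[i] != NULL; i++)
        if (strcmp(list[i], name) == 0)
            return i;
    return -1;
}
```

A caller would check for -1 and otherwise OR in (1 << index), or map the index through a table of real flag values.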


> If you need this to pass optional flags, I would instead consider passing them as multiple arguments, so that the Lua code could do this:
>
> dlopen('whatever') -- no flags
> dlopen('whatever', 'lazy')
> dlopen('whatever', 'lazy', 'global')
>
> This leaves the entire question of parsing to Lua's parser, the user does not have to learn your syntax, and all you need to do is something like this:
>
> int flags = 0, top = lua_gettop(L);
> for (int i = 2; i <= top; ++i)
>     flags |= (1 << luaL_checkoption(L, i, 0, list));


I did this before, but then I came across a situation where additional parameters are specified after the flags:

func('abc', 'def', 'ghi', 'bar')

(def and ghi are OR-able flags)

This would then become

func('abc', 'def, ghi', 'bar')

Of course, a third variant seems possible, using tables:

func('abc', { 'def', 'ghi' }, 'bar')

I am looking for the way that is the most user friendly.

> Cheers,
> V.


Re: Making a modifiable copy of a string in C

Andrew Gierth
In reply to this post by Viacheslav Usov
>>>>> "Viacheslav" == Viacheslav Usov <[hidden email]> writes:

 Viacheslav> If you need this to pass optional flags, I would instead
 Viacheslav> consider passing them as multiple arguments, so that the
 Viacheslav> Lua code could do this:

 Viacheslav> dlopen('whatever') -- no flags
 Viacheslav> dlopen('whatever', 'lazy')
 Viacheslav> dlopen('whatever', 'lazy', 'global')

or use a table:

dlopen('whatever', { lazy = true, global = true })

or,

dlopen{ module = 'whatever', lazy = true }

--
Andrew.


Re: Making a modifiable copy of a string in C

Andrew Gierth
In reply to this post by Marc Balmer
>>>>> "Marc" == Marc Balmer <[hidden email]> writes:

 Marc> I somewhat agree. But the string copy is only used within my C
 Marc> function, so it does not really matter if it is modified, since
 Marc> Lua code never sees or uses this string. Therefore my question if
 Marc> this has any "technical" side/bad effects.

You can't know whether Lua code ever sees the string, because strings are
interned; there is only one copy of any given short string. So trying to
make a "copy" with lua_tolstring will actually just return the original
pointer for strings under a certain length.

--
Andrew.


Re: Making a modifiable copy of a string in C

Marc Balmer
In reply to this post by Marc Balmer


I think it is best to check whether an argument is a single string and then use luaL_checkoption, _or_ whether it is a table and then iterate over the table values.  That only uses Lua idioms and does not introduce a new syntax for options:

dlopen('library', 'lazy')

or

dlopen('library', { 'lazy', 'global' })

Or is this still too much "automagic"?





Re: Making a modifiable copy of a string in C

Viacheslav Usov
In reply to this post by Marc Balmer
On Sun, Apr 7, 2019 at 2:04 PM Marc Balmer <[hidden email]> wrote:

> I am not pushing the strings and using luaL_checkoption, but rather a checkoption function that takes the name as const char * argument instead of getting it from the Lua stack.

That means writing even more code than I originally thought. If you insist on parsing, I think you could make your life easier by not using strtok_r for it.
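As a rough sketch of what parsing without strtok_r could look like: strspn() and strcspn() can walk the original const string in place, so no modifiable copy is needed at all. The flag constants and function names below are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values, for illustration only. */
enum { FLAG_LAZY = 1 << 0, FLAG_GLOBAL = 1 << 1 };

/* Compare a token of length 'n' (not NUL-terminated) with 'name'. */
static int token_is(const char *tok, size_t n, const char *name)
{
    return strlen(name) == n && memcmp(tok, name, n) == 0;
}

/*
 * Parse "lazy, global" without modifying or copying 's'.
 * Returns the OR-ed flags, or -1 on an unknown name.
 */
int parse_flags_const(const char *s)
{
    static const char delims[] = ", ";
    int flags = 0;

    for (;;) {
        size_t n;

        s += strspn(s, delims);     /* skip leading delimiters */
        if (*s == '\0')
            break;
        n = strcspn(s, delims);     /* length of the next token */
        if (token_is(s, n, "lazy"))
            flags |= FLAG_LAZY;
        else if (token_is(s, n, "global"))
            flags |= FLAG_GLOBAL;
        else
            return -1;
        s += n;
    }
    return flags;
}
```

Since nothing is written, the pointer returned by luaL_checkstring() could be passed in directly.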

> I did this before, but then I came across a situation where additional parameters are specified after the flags:

You could move all of the flag parameters after all the other parameters. The really troublesome case would be one where your (underlying) function is variadic and has flag parameters; then you would need to come up with some clever scheme or use tables.

You could create helper functions just for the flags that you use. For example, say you have flag group FOO and flag group BAR. Then you could create variadic functions foo() and bar(), and use them as follows:

whatever(a, b, c, foo('A', 'B'), d, e, bar('X', 'Y', 'Z'), f) -- eq. FOO_A | FOO_B, BAR_X | BAR_Y | BAR_Z

This, I think, is more readable than tables.

Cheers,
V.

Re: Making a modifiable copy of a string in C

Sean Conner
In reply to this post by Marc Balmer
It was thus said that the Great Marc Balmer once stated:

>
> I think it is best to check whether an argument is a single string and then
> use luaL_checkoption, _or_ whether it is a table and then iterate over the
> table values.  That only uses Lua idioms and does not introduce a new syntax
> for options:
>
> dlopen('library', 'lazy')
>
> or
>
> dlopen('library', { 'lazy', 'global' })
>
> Or is this still too much "automagic"?

  I like this option the best.

  -spc


Re: Making a modifiable copy of a string in C

Gé Weijers
In reply to this post by Marc Balmer
On Sun, Apr 7, 2019 at 4:05 AM Marc Balmer <[hidden email]> wrote:
> So far I came up with two different ways:
>
> First variant:
>
>         const char *s;
>         char *u;
>
>         s = luaL_checklstring(L, 2, &len);
>         u = lua_newuserdata(L, len);
>         memcpy(u, s, len);


If you plan to pass a string to C string routines you have to copy the null character.

u = lua_newuserdata(L, len+1);
memcpy(u, s, len+1);

I assume the Lua string does not contain any null characters, or the C routines will stop processing in the middle of a Lua string. If you want to process null characters as part of strings you may have to write your own code.
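To make both points concrete, here is a small plain-C sketch; copy_c_string() is a made-up helper, and malloc() stands in for lua_newuserdata(). It refuses input with an embedded null character and always terminates the copy. With a pointer obtained from Lua, memcpy(u, s, len + 1) works as well, because Lua guarantees a terminating zero after the last character of the string.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy 'len' bytes of 's' plus a terminating NUL into a new buffer.
 * Returns NULL if 's' contains a NUL within its first 'len' bytes
 * (so C string routines would stop early) or if allocation fails.
 * The caller frees the result.
 */
char *copy_c_string(const char *s, size_t len)
{
    char *u;

    if (memchr(s, '\0', len) != NULL)
        return NULL;            /* embedded NUL: not a plain C string */
    u = malloc(len + 1);        /* + 1 for the terminating NUL */
    if (u == NULL)
        return NULL;
    memcpy(u, s, len);
    u[len] = '\0';
    return u;
}
```

The result can be handed to strtok_r() or any other routine that expects a modifiable, NUL-terminated string.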
