Do strings move around?

Francisco Olarte
One seemingly stupid question to which I've been unable to find a
definitive answer in the docs.

In the lua_tostring() docs, the last paragraph reads "Because Lua has
garbage collection, there is no guarantee that the pointer returned by
lua_tolstring will be valid after the corresponding Lua value is
removed from the stack."

Must it be kept on the stack, or is it enough if I keep the original
value referenced? I.e., I'm inside a method of one of my classes (so
I have a this pointer) and I've arrived there from a C function which
has the relevant userdata as first argument, index 1, and I have a
string at the TOS. If I do:

const char * s = lua_tostring(L, -1);

and then take it off the stack but keep it by doing

lua_setuservalue(L, 1);

or alternatively:

lua_rawsetp(L, LUA_REGISTRYINDEX, this);

Can I still keep using s safely?

And also, can I keep using s if I guarantee nobody has messed with
the uservalue/registry?

It seems likely, and it is true for userdata, but the wording on
lua_tostring makes me fear the string can be moved around by the
collector if it is not pinned specifically by a stack entry.

Francisco Olarte.

Re: Do strings move around?

Javier Guerra Giraldez
On Tue, 25 Aug 2020 at 08:42, Francisco Olarte <[hidden email]> wrote:
> Must it be kept on the stack, or is it enough if I keep the original
> value referenced? I.e., I'm inside a method of one of my classes (so
> I have a this pointer) and I've arrived there from a C function which
> has the relevant userdata as first argument, index 1, and I have a
> string at the TOS. If I do:

The GC doesn't follow C pointers, only Lua value references.  The Lua
stack is a good temporary place.  For more "durable" references, some
options are:

- in the registry
- as an upvalue to your C function
- in the userdata's metatable.

Of course, any of these could be a direct reference to your string, or
a Lua table that holds it among other things.


--
Javier

Re: Do strings move around?

Francisco Olarte
Javier:

On Tue, Aug 25, 2020 at 3:58 PM Javier Guerra Giraldez
<[hidden email]> wrote:

> On Tue, 25 Aug 2020 at 08:42, Francisco Olarte <[hidden email]> wrote:
> > Must it be kept on the stack, or is it enough if I keep the original
> > value referenced? I.e., I'm inside a method of one of my classes (so
> > I have a this pointer) and I've arrived there from a C function which
> > has the relevant userdata as first argument, index 1, and I have a
> > string at the TOS. If I do:
> The GC doesn't follow C pointers, only Lua value references.  The Lua
> stack is a good temporary place.  For more "durable" references, some
> options are:
> - in the registry
> - as an upvalue to your C function
> - in the userdata's metatable.

This I know, and I do. But my problem is I want to cache the
lua_tostring pointer.

If you read my original message, it's explained there. I do not have a
problem keeping the string alive, but the manual does not say whether the
string contents, the internal representation ("lua_tolstring returns
a pointer to a string inside the Lua state. This string always has a
zero ('\0') after its last character (as in C), but can contain other
zeros in its body."), can move around if I pop it from the stack.
I.e., if I set it as uservalue I know I can get it onto the stack
again with getuservalue and call lua_tostring again to get the pointer, and
this is what I currently do, but a part of my design would be
easier/faster if I could just keep the original lua_tostring result.

I.e., currently I do this in a function where index 1 is one of my userdata:

lua_pushstring(L, "hola");
lua_setuservalue(L, 1);

and then in another similar function I do

lua_getuservalue(L,1);
const char * s = lua_tostring(L,-1);

AND what I want to know is if this could be changed to:

lua_pushstring(L, "hola");
that = (myclass *)lua_touserdata(L, 1);
that->s = lua_tostring(L, -1);
lua_setuservalue(L, 1);

And in the second function

that = (myclass *)lua_touserdata(L,1);
const char * s = that->s;

I know the uservalue is alive because I reached that through its
userdata, and I can ensure nobody touches the uservalues.

Francisco Olarte.

Re: Do strings move around?

v
In reply to this post by Francisco Olarte
On Tue, 2020-08-25 at 15:41 +0200, Francisco Olarte wrote:
> It seems likely, and it is true for userdata, but the wording on
> lua_tostring makes me fear the string can be moved around by the
> collector if it is not pinned specifically by a stack entry.

I'm pretty sure it is safe. AFAIK it does not matter where data is
stored when it comes to Lua managing it. As such, storing anything
on the stack should not differ from storing it anywhere else, as strings
are allocated internally and every Lua "string", on the stack or not, is
just a reference to that allocation.
--
v <[hidden email]>

Re: Do strings move around?

Roberto Ierusalimschy
In reply to this post by Francisco Olarte
> One seemingly stupid question to which I've been unable to find a
> definitive answer in the docs.
>
> In the lua_tostring() docs, the last paragraph reads "Because Lua has
> garbage collection, there is no guarantee that the pointer returned by
> lua_tolstring will be valid after the corresponding Lua value is
> removed from the stack."
>
> Must it be kept on the stack, or is it enough if I keep the original
> value referenced? I.e., I'm inside a method of one of my classes (so
> I have a this pointer) and I've arrived there from a C function which
> has the relevant userdata as first argument, index 1, and I have a
> string at the TOS. If I do:
>
> const char * s = lua_tostring(L, -1);
>
> and then take it off the stack but keep it by doing
>
> lua_setuservalue(L, 1);
>
> or alternatively:
>
> lua_rawsetp(L, LUA_REGISTRYINDEX, this);
>
> Can I still keep using s safely?
>
> And also,  can I keep using s if I guarantee nobody has messed with
> the uservalue/registry?
>
> It seems likely, and it is true for userdata, but the wording on
> lua_tostring makes me fear the string can be moved around by the
> collector if it is not pinned specifically by a stack entry.

I think the docs are quite clear; your fear is intentional.

Currently the collector does not move strings around, but this
is an implementation detail that can change in the future and
then create really hard-to-find bugs in your code.

- Concrete example: recently it was discussed in this list the
idea of using an unboxed representation for small strings. If we
decide to adopt that proposal, we will ensure that elements active
in the stack do not move (because the manual assures that), but
elements in a table (e.g., the registry) are free to be moved if
the table resizes.
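
The hazard described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyTable`, `Slot` and the `toy_*` functions are hypothetical names invented for illustration, and nothing here reflects how Lua actually stores strings. It only shows why a pointer into inline ("unboxed") storage owned by a resizable array stops being valid once the array is reallocated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot that stores a short string inline ("unboxed"). */
typedef struct {
    char inline_buf[16];
} Slot;

/* Hypothetical table: a heap array of slots that is replaced on growth. */
typedef struct {
    Slot  *slots;
    size_t capacity;
} ToyTable;

static void toy_init(ToyTable *t, size_t capacity) {
    t->slots = calloc(capacity, sizeof(Slot));
    assert(t->slots != NULL);
    t->capacity = capacity;
}

/* Store a short string in slot i and return a pointer to its bytes. */
static const char *toy_set(ToyTable *t, size_t i, const char *s) {
    assert(i < t->capacity);
    assert(strlen(s) < sizeof t->slots[i].inline_buf);
    strcpy(t->slots[i].inline_buf, s);
    return t->slots[i].inline_buf;
}

/* Fetch the current pointer to the bytes stored in slot i. */
static const char *toy_get(const ToyTable *t, size_t i) {
    assert(i < t->capacity);
    return t->slots[i].inline_buf;
}

/* Current address of slot i's bytes, as an integer that is safe to
   compare even after the underlying storage has been freed. */
static uintptr_t toy_addr(const ToyTable *t, size_t i) {
    return (uintptr_t)(const void *)toy_get(t, i);
}

/* Grow by allocating a new array, copying, then freeing the old one.
   The new array exists while the old one is still live, so every slot
   ends up at a different address. */
static void toy_grow(ToyTable *t) {
    size_t new_cap = t->capacity * 2;
    Slot *fresh = calloc(new_cap, sizeof(Slot));
    assert(fresh != NULL);
    memcpy(fresh, t->slots, t->capacity * sizeof(Slot));
    free(t->slots);
    t->slots = fresh;
    t->capacity = new_cap;
}

static void toy_free(ToyTable *t) {
    free(t->slots);
    t->slots = NULL;
    t->capacity = 0;
}
```

In this model, a pointer returned by `toy_set` must not be dereferenced after `toy_grow`; the contents survive the resize, but only a fresh `toy_get` yields a usable pointer. That mirrors the "fetch it again" pattern discussed in this thread.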

-- Roberto

Re: Do strings move around?

Francisco Olarte
Roberto:

On Tue, Aug 25, 2020 at 5:02 PM Roberto Ierusalimschy
<[hidden email]> wrote:
.......
> > It seems likely, and it is true for userdata, but the wording on
> > lua_tostring makes me fear the string can be moved around by the
> > collector if it is not pinned specifically by a stack entry.

> I think the docs are quite clear; your fear is intentional.

Rest assured the doc is clear, and I coded according to it,
getuservalue plus optional index adjustments in some paths. But I have
been proven to miss some parts of the manual, so I thought better to
ask it, in case there were some relevant paragraphs extending that
which I had skipped.

By "intentional" do you mean "justified" ? I'm not quite sure I
understand that part.

> Currently the collector does not move strings around, but this
> is an implementation detail that can change in the future and
> then create really hard-to-find bugs in your code.

That is what I thought, but no problem. I will keep reloading from the
registry/uservalue and cache it in std::strings or via plain strdups
if I need to get at them without touching Lua. I really wanted to
avoid some convoluted paths to get at a Lua stack in some situations,
but it is not a problem.
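
A minimal sketch of the copy-based caching mentioned above, assuming the bytes and length come from a call such as `lua_tolstring(L, idx, &len)` made while the value is still on the stack. `OwnedString`, `owned_string_copy` and `owned_string_free` are hypothetical names, not part of the Lua API. Unlike plain strdup, this keeps the explicit length, so strings with embedded zeros survive the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical owned copy of a Lua string's bytes. */
typedef struct {
    char  *data;  /* heap copy, always followed by a terminating '\0' */
    size_t len;   /* number of bytes, excluding the terminating '\0' */
} OwnedString;

/* Copy len bytes from src (which may contain embedded zeros) into
   freshly allocated memory. Returns 0 on success, -1 if malloc fails.
   In a binding, src and len would typically be obtained with:
       size_t len;
       const char *src = lua_tolstring(L, idx, &len);
   and the copy made before the value leaves the stack. */
static int owned_string_copy(OwnedString *out, const char *src, size_t len) {
    assert(out != NULL && src != NULL);
    char *buf = malloc(len + 1);
    if (buf == NULL) {
        return -1;
    }
    memcpy(buf, src, len);
    buf[len] = '\0';
    out->data = buf;
    out->len = len;
    return 0;
}

/* Release the copy; safe to call more than once on the same struct. */
static void owned_string_free(OwnedString *s) {
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

The copy is independent of the Lua state, so it stays valid regardless of what the collector does, at the cost of one allocation per cached string.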

> - Concrete example: recently it was discussed in this list the
> idea of using an unboxed representation for small strings. If we
> decide to adopt that proposal, we will ensure that elements active
> in the stack do not move (because the manual assures that), but
> elements in a table (e.g., the registry) are free to be moved if
> the table resizes.

Understood. I remember seeing something like that. I will continue
pinning them with the stack.

Francisco Olarte.

Re: Do strings move around?

Roberto Ierusalimschy
> > > It seems likely, and it is true for userdata, but the wording on
> > > lua_tostring makes me fear the string can be moved around by the
> > > collector if it is not pinned specifically by a stack entry.
>
> > I think the docs are quite clear; your fear is intentional.
>
> [...]
>
> By "intentional" do you mean "justified" ? I'm not quite sure I
> understand that part.

By intentional I meant intentional. The intention of the docs is exactly
to instil this fear that "the string can [be] moved around by the
collector". (Well, the intention is to instil that concept; if it causes
fear, so be it.)

-- Roberto

Re: Do strings move around?

Andrew Gierth
In reply to this post by Roberto Ierusalimschy
>>>>> "Roberto" == Roberto Ierusalimschy <[hidden email]> writes:

 Roberto> - Concrete example: recently it was discussed in this list the
 Roberto> idea of using an unboxed representation for small strings. If
 Roberto> we decide to adopt that proposal, we will ensure that elements
 Roberto> active in the stack do not move (because the manual assures
 Roberto> that), but elements in a table (e.g., the registry) are free
 Roberto> to be moved if the table resizes.

What about strings accessed via upvalue pseudo-indexes?

--
Andrew.

Re: Do strings move around?

Francisco Olarte
In reply to this post by Roberto Ierusalimschy
Roberto:

On Tue, Aug 25, 2020 at 9:07 PM Roberto Ierusalimschy
<[hidden email]> wrote:
> > By "intentional" do you mean "justified" ? I'm not quite sure I
> > understand that part.
> By intentional I meant intentional. The intention of the docs is exactly
> to instil this fear that "the string can [be] moved around by the
> collector". (Well, the intention is to instil that concept; if it causes
> fear, so be it.)

Got it, I just had never seen it written that way.

I've noted it, checked that I do not assume unmovable strings without a
stack entry pinning them, and prepared a custom userdata in case I
ever need an unmovable "C-string".

Thanks.
   Francisco Olarte.

Re: Do strings move around?

Roberto Ierusalimschy
In reply to this post by Andrew Gierth
> What about strings accessed via upvalue pseudo-indexes?

They should be valid while the corresponding call is active. Documenting
it is in my "todo" list.

-- Roberto

Re: Do strings move around?

Gé Weijers
On Wed, Aug 26, 2020 at 7:01 AM Roberto Ierusalimschy
<[hidden email]> wrote:
>
> > What about strings accessed via upvalue pseudo-indexes?
>
> They should be valid while the corresponding call is active. Documenting
> it is in my "todo" list.
>
> -- Roberto

I hope uservalue objects will never get moved, because it would be
impossible to store most C++ objects in uservalues (only those types T
which have the property "trivially copyable" could be used, which
excludes most interesting C++ types like maps, vectors, lists etc.).



--


Re: Do strings move around?

Roberto Ierusalimschy
> I hope uservalue objects will never get moved, [...]

They won't.

-- Roberto

Re: Do strings move around?

Viacheslav Usov
On Wed, Aug 26, 2020 at 7:55 PM Roberto Ierusalimschy
<[hidden email]> wrote:

> They won't.

If Lua's GC becomes so sophisticated that object moves become a
reality, it will make sense to add a third type of user data: movable
data.

And speaking of sophisticated GC, I'd like to mention some recent work
in this area, the RC Immix memory allocator, which is based on
reference counting, and which its authors believed outperformed the
contemporary state-of-the-art tracing/generational collectors [1].

Being RC-based, it might be a better answer to deterministic
finalization than to-be-closed variables.

Cheers,
V.

[1] http://rifatshahriyar.github.io/files/others/rcix-oopsla-2013.pdf

Re: Do strings move around?

Andrew Gierth
In reply to this post by Roberto Ierusalimschy
>>>>> "Roberto" == Roberto Ierusalimschy <[hidden email]> writes:

 >> What about strings accessed via upvalue pseudo-indexes?

 Roberto> They should be valid while the corresponding call is active.
 Roberto> Documenting it is in my "todo" list.

So here's a more tricky case: what if you do lua_tostring on some stack
index, and then move that stack item while keeping it on the stack (i.e.
with lua_rotate or one of its many derivatives such as lua_remove)?

--
Andrew.

Re: Do strings move around?

Roberto Ierusalimschy
> >>>>> "Roberto" == Roberto Ierusalimschy <[hidden email]> writes:
>
>  >> What about strings accessed via upvalue pseudo-indexes?
>
>  Roberto> They should be valid while the corresponding call is active.
>  Roberto> Documenting it is in my "todo" list.
>
> So here's a more tricky case: what if you do lua_tostring on some stack
> index, and then move that stack item while keeping it on the stack (i.e.
> with lua_rotate or one of its many derivatives such as lua_remove)?

The new wording in the manual already covers this case:

    When the index is a stack index,
    Lua guarantees that the pointer returned by @id{lua_tolstring}
    is valid while the value at that index is neither modified nor popped.
    When the index is a pseudo-index (an upvalue),
    the pointer returned by @id{lua_tolstring}
    is valid while the corresponding call is active and
    the corresponding upvalue is not modified.

-- Roberto