userdata and the gc

Adrian Sietsma
Will a userdata's address change when the gc runs?

i.e.:

lua_pushstring(L, "foo");
p = lua_newuserdata(L, sizeof(foo));
lua_rawset(L, LUA_REGISTRYINDEX);

/* ... do something to force a gc ... */

lua_pushstring(L, "foo");
lua_rawget(L, LUA_REGISTRYINDEX);
p1 = lua_touserdata(L, -1);
assert(p == p1);   /* <-- is this guaranteed to hold? */

Adrian

Re: userdata and the gc

Greg Falcon
On 4/18/06, Adrian Sietsma <[hidden email]> wrote:
> Will a userdata's address change when the gc runs ?

No.  A userdata's address never changes, and is valid until the
userdata is collected.

Greg F

Re: userdata and the gc

Adrian Sietsma
In reply to this post by Adrian Sietsma
Answering myself:
> Will a userdata's address change when the gc runs ?
>
> ie:
>
> lua_pushstring(L,"foo");
> p = lua_newuserdata(L,sizeof(foo));
> lua_rawset(L,LUA_REGISTRYINDEX);
-- note the userdata is no longer on the stack here

>
> ... do something to force a gc
>
> lua_pushstring(L,"foo");
> lua_rawget(L,LUA_REGISTRYINDEX);
> p1 = lua_touserdata(L,-1);
> assert (p==p1); <----------------
>
> Adrian
>

I would guess that a (full) userdata is safe as long as it is on the stack,
but that the gc may move the ud (and thus invalidate the pointer) once it is
off the stack.

This would imply that having full userdata on the stack may cause the
garbage collector to be unable to run to completion.

Adrian



Re: userdata and the gc

Ben Sunshine-Hill
In reply to this post by Adrian Sietsma
On 4/17/06, Adrian Sietsma <[hidden email]> wrote:
> Will a userdata's address change when the gc runs ?

The current garbage collector is non-compacting, so no GCed objects
ever move around. IIRC, the authors have cautioned that this may not
always be the case, though a lot of current code I've seen relies on
unmoving userdata. In any case, being on the stack wouldn't affect
this; the stack is just another element of the root set for
determining reachability for GC.

Ben
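
To make the reachability point concrete, here is a minimal plain-C sketch of a non-moving mark-and-sweep collector. It is a toy model, not Lua's implementation: the names Obj, Heap, heap_new, heap_collect and demo_registry_root are hypothetical, the objects hold no references to one another, and the two root arrays merely stand in for the Lua stack and the registry. It illustrates that an object reachable from either root survives a collection in the same malloc block, while unreachable objects are freed in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy heap object: each object is its own malloc block on one linked list. */
typedef struct Obj {
    struct Obj *next;   /* all-objects list walked by the sweep phase */
    int marked;
    int payload;
} Obj;

typedef struct {
    Obj *all;           /* every allocation that has not been freed */
    Obj *stack[4];      /* root 1: stand-in for the Lua stack */
    Obj *registry[4];   /* root 2: stand-in for the registry */
} Heap;

static Obj *heap_new(Heap *h, int payload) {
    Obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->marked = 0;
    o->payload = payload;
    o->next = h->all;
    h->all = o;
    return o;
}

/* Mark what either root refers to, then free the rest in place.
   Returns the number of objects freed. Survivors are never copied. */
static int heap_collect(Heap *h) {
    for (int i = 0; i < 4; i++) {
        if (h->stack[i]) h->stack[i]->marked = 1;
        if (h->registry[i]) h->registry[i]->marked = 1;
    }
    int freed = 0;
    Obj **link = &h->all;
    while (*link != NULL) {
        Obj *o = *link;
        if (o->marked) {
            o->marked = 0;          /* reset for the next cycle */
            link = &o->next;
        } else {
            *link = o->next;        /* unlink and release the garbage */
            free(o);
            freed++;
        }
    }
    return freed;
}

/* Mirrors the question: store in the registry, pop from the stack, collect.
   Returns 1 if only the unreferenced object was freed, -1 on out-of-memory. */
static int demo_registry_root(void) {
    Heap h = {0};
    Obj *kept = heap_new(&h, 1);
    Obj *garbage = heap_new(&h, 2);
    if (kept == NULL || garbage == NULL) return -1;
    h.stack[0] = kept;
    h.stack[1] = garbage;
    h.registry[0] = kept;           /* like lua_rawset into the registry */
    h.stack[0] = NULL;              /* like popping both values */
    h.stack[1] = NULL;
    int freed = heap_collect(&h);
    int ok = (freed == 1) && (kept->payload == 1);   /* kept is still readable */
    h.registry[0] = NULL;
    heap_collect(&h);               /* now kept is unreachable and is freed */
    return ok;
}
```

In this model it makes no difference which root keeps the object alive; leaving the stack only matters if nothing else references the object.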

Re: userdata and the gc

Adrian Sietsma
Ben Sunshine-Hill wrote:
>
> The current garbage collector is non-compacting, so no GCed objects
> ever move around. IIRC, the authors have cautioned that this may not
> always be the case, though a lot of current code I've seen relies on
> unmoving userdata. In any case, being on the stack wouldn't affect
> this; the stack is just another element of the root set for
> determining reachability for GC.
>
That may be so, but consider :

int func(lua_State *L) {
   void *p = lua_newuserdata(L, sizeof(foo));
   /* ... */
   /* we would all hope that p remains valid,
      even if we trigger a gc here */
   /* ... */
   /* save the ud */
   lua_pushstring(L, "foo");
   lua_pushvalue(L, -2);
   lua_rawset(L, LUA_REGISTRYINDEX);

   /* pop the ud */
   lua_pop(L, 1);
   /* it would be reasonable if p were invalid here: it's off the stack,
      so one could argue that invalidates the pointer. */
   return 0;
}


My specific problem is thread-related - can I pass a (referenced) full
userdata pointer to another thread for background i/o, trusting that Lua
will _never_ delete (or move) the memory pointed to, as long as the ud stays
referenced ?


If that is the case, I'm confused about how the gc can shrink the pool
without invalidating userdata. Using sockets for userdata :

require"socket"
s1=socket.tcp() -- create userdata before table
t={}; for i =1,1e6 do t[i]=i end
s2=socket.tcp() -- create userdata after table
collectgarbage("collect");print (collectgarbage("count"))
=>16447.016601563
t={}
collectgarbage("collect");print (collectgarbage("count"))
=> 63.0166015625

so the memory comes back, as confirmed by task manager.
???

Adrian

Re: userdata and the gc

David Jones-2
In reply to this post by Adrian Sietsma

On Apr 18, 2006, at 06:45, Adrian Sietsma wrote:

> Will a userdata's address change when the gc runs ?
>
> ie:
>
> lua_pushstring(L,"foo");
> p = lua_newuserdata(L,sizeof(foo));
> lua_rawset(L,LUA_REGISTRYINDEX);
>
> ... do something to force a gc
>
> lua_pushstring(L,"foo");
> lua_rawget(L,LUA_REGISTRYINDEX);
> p1 = lua_touserdata(L,-1);
> assert (p==p1); <----------------

The documentation doesn't say explicitly either way.

I think the specification should be similar to the lua_tolstring case:
the returned pointer is no longer guaranteed to be valid once the
corresponding value is removed from the stack.

drj


Re: userdata and the gc

Roberto Ierusalimschy
In reply to this post by Adrian Sietsma
> >The current garbage collector is non-compacting, so no GCed objects
> >ever move around. IIRC, the authors have cautioned that this may not
> >always be the case, though a lot of current code I've seen relies on
> >unmoving userdata.

The caution is about strings, not about userdata (although we did not
actually say that explicitly in the manual). We have no intention of
allowing userdata addresses to change during GC. Unlike strings, which
are internal data in Lua, the only purpose of userdata is to be used
by C code, which prefers that things stay where they are :)


> If that is the case, I'm confused about how the gc can shrink the pool
> without invalidating userdata. Using sockets for userdata :

The gc can shrink the pool as much as malloc/free can. In fact, Lua
has no notion of "pool". It only manipulates memory through
malloc/free/realloc.

-- Roberto
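
The interface Roberto describes has the same shape as the sample allocator in the Lua 5.1 manual: a single function that frees when the new size is zero and otherwise calls realloc. Below is a rough sketch with that signature plus a byte counter. AllocStats, counting_alloc and live_after_freeing_big_block are hypothetical names, and the demo calls the function directly rather than through a Lua state, so it only models the behaviour discussed here: freeing one large block returns its memory without touching the neighbouring small blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BIG ((size_t)1 << 20)

/* Hypothetical bookkeeping: bytes currently handed out by the allocator. */
typedef struct {
    size_t live_bytes;
} AllocStats;

/* Same shape as a lua_Alloc function: nsize == 0 frees, anything else
   allocates or resizes. osize is only trusted when ptr is not NULL. */
static void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize) {
    AllocStats *stats = ud;
    if (nsize == 0) {
        if (ptr != NULL) stats->live_bytes -= osize;
        free(ptr);
        return NULL;
    }
    void *np = realloc(ptr, nsize);
    if (np != NULL) {
        stats->live_bytes += nsize;
        if (ptr != NULL) stats->live_bytes -= osize;
    }
    return np;
}

/* Allocates small, big, small; frees the big block; returns the byte count
   that is still live at that point. */
static size_t live_after_freeing_big_block(void) {
    AllocStats stats = {0};
    char *s1 = counting_alloc(&stats, NULL, 0, 16);
    char *table = counting_alloc(&stats, NULL, 0, BIG);
    char *s2 = counting_alloc(&stats, NULL, 0, 16);
    assert(s1 != NULL && table != NULL && s2 != NULL);
    strcpy(s1, "socket 1");
    strcpy(s2, "socket 2");
    assert(stats.live_bytes == 32 + BIG);

    counting_alloc(&stats, table, BIG, 0);   /* the big table is collected */

    /* The small blocks were never passed to realloc or free, so they are
       still where they were and still hold their contents. */
    assert(strcmp(s1, "socket 1") == 0 && strcmp(s2, "socket 2") == 0);
    size_t live = stats.live_bytes;

    counting_alloc(&stats, s1, 16, 0);
    counting_alloc(&stats, s2, 16, 0);
    return live;
}
```

With this interface the only block that can change address is the one handed to realloc, which is why memory can shrink without any surviving block moving.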

Re: userdata and the gc

Chris Marrin
In reply to this post by Adrian Sietsma
Adrian Sietsma wrote:
> ...
> My specific problem is thread-related - can I pass a (referenced) full
> userdata pointer to another thread for background i/o, trusting that Lua
> will _never_ delete (or move) the memory pointed to, as long as the ud
> stays referenced ?

Why not just point the userdata at a memory block that you allocate and
then pass that to the other thread? Since you allocate it, you can be
sure its address will not be changed by Lua, even if the userdata
pointing at it moves around.

--
chris marrin              ,""$, "As a general rule,don't solve puzzles
[hidden email]        b`    $  that open portals to Hell" ,,.
         ,.`           ,b`    ,`                            , 1$'
      ,|`             mP    ,`                              :$$'     ,mm
    ,b"              b"   ,`            ,mm      m$$    ,m         ,`P$$
   m$`             ,b`  .` ,mm        ,'|$P   ,|"1$`  ,b$P       ,`  :$1
  b$`             ,$: :,`` |$$      ,`   $$` ,|` ,$$,,`"$$     .`    :$|
b$|            _m$`,:`    :$1   ,`     ,$Pm|`    `    :$$,..;"'     |$:
P$b,      _;b$$b$1"       |$$ ,`      ,$$"             ``'          $$
  ```"```'"    `"`         `""`        ""`                          ,P`

Re: userdata and the gc

Adrian Sietsma
Chris Marrin wrote:
> Why not just point the userdata at a memory block that you allocate and
> then pass that to the other thread? Since you allocate it, you can be
> sure its address will not be changed by Lua, even if the userdata
> pointing at it moves around.

You mean use a lightuserdata.

That is my current approach, but I was contemplating using Lua-allocated
memory if it is safe to assume constancy of userdata.

given this from Roberto :

"We have no intention of
allowing userdata addresses to change during GC. Unlike strings, which
are an internal data in Lua, the only purpose of userdata is to be used
by C code, which prefer that things stay where they are :)"

I will now happily pass full userdata to a background thread (IOCP on
Windows, fyi).

Adrian

Re: userdata and the gc

Chris Marrin
Adrian Sietsma wrote:

> Chris Marrin wrote:
>> Why not just point the userdata at a memory block that you allocate and
>> then pass that to the other thread? Since you allocate it, you can be
>> sure its address will not be changed by Lua, even if the userdata
>> pointing at it moves around.
>
> You mean use a lightuserdata.
>
> That is my current approach, but I was contemplating using Lua-allocated
> memory if it is safe to assume constancy of userdata.

No, I mean a full userdata with a pointer to a void*. Then you allocate
the memory to pass to your thread and point at it from the userdata. I
believe Lua calls this a "boxed pointer" and it has the advantage that
you can use Lua to collect it by adding an __gc metamethod to the userdata.
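
The boxed-pointer argument can be sketched in plain C without the Lua API. In the sketch below, Box stands in for the contents of a full userdata, relocate_box simulates a hypothetical compacting collector copying the box elsewhere, and box_finalize plays the role an __gc metamethod would play. All of these names are illustrative assumptions, not Lua functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a full userdata whose only content is a pointer. */
typedef struct {
    void *payload;   /* memory we allocated ourselves with malloc */
} Box;

/* Simulates a compacting collector: copy the box to a new location and
   release the old one. On failure the old box is left untouched. */
static Box *relocate_box(Box *old) {
    Box *moved = malloc(sizeof *moved);
    if (moved == NULL) return NULL;
    memcpy(moved, old, sizeof *moved);
    free(old);
    return moved;
}

/* What a finalizer for the box would do: release the payload. */
static void box_finalize(Box *box) {
    free(box->payload);
    box->payload = NULL;
}

/* Returns 1 if the payload address handed to a worker is still the one
   stored in the box after the box itself has moved, 0 otherwise. */
static int payload_survives_move(void) {
    Box *box = malloc(sizeof *box);
    if (box == NULL) return 0;
    box->payload = malloc(64);
    if (box->payload == NULL) { free(box); return 0; }

    void *given_to_worker = box->payload;   /* what the other thread holds */

    Box *moved = relocate_box(box);
    if (moved == NULL) { box_finalize(box); free(box); return 0; }

    int same = (moved->payload == given_to_worker);
    box_finalize(moved);
    free(moved);
    return same;
}
```

The cost of this design is a second allocation per object and a finalizer that must run before the payload is released.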

>
> given this from Roberto :
>
> "We have no intention of
> allowing userdata addresses to change during GC. Unlike strings, which
> are an internal data in Lua, the only purpose of userdata is to be used
> by C code, which prefer that things stay where they are :)"
>
> I will now happily pass full userdata to a background thread (IOCP on
> Windows, fyi).

I'm not sure if Roberto intended what you think you read. If and when
Lua gets a compacting garbage collector, a userdata's address would move
around just like any other gc object. Maybe that will never happen and
maybe they have some clever way to guarantee that userdatas never move.
But it seems like a better idea to not rely on this and use your own data.


Re: userdata and the gc

Adrian Sietsma
Chris Marrin wrote:
>
> No, I mean a full userdata with a pointer to a void*. Then you allocate
> the memory to pass to your thread and point at it from the userdata. I
> believe Lua calls this a "boxed pointer" and it has the advantage that
> you can use Lua to collect it by adding an __gc metamethod to the userdata.

Sorry, my mistake. In my case, that is not worth it, because the buffers are
private to the C side, and it is just as easy to delete them manually when
their owner is gc'ed.

A:

>> Roberto :
>> "We have no intention of
>> allowing userdata addresses to change during GC. Unlike strings, which
>> are an internal data in Lua, the only purpose of userdata is to be used
>> by C code, which prefer that things stay where they are :)"
>>
>> Adrian :
>> I will now happily pass full userdata to a background thread (IOCP on
>> Windows, fyi).
>>
B:
> Chris :
> I'm not sure if Roberto intended what you think you read. If and when
> Lua gets a compacting garbage collector, a userdata's address would move
> around just like any other gc object. Maybe that will never happen and
> maybe they have some clever way to guarantee that userdatas never move.
> But it seems like a better idea to not rely on this and use your own data.

I read A to mean that userdata never moves, and will code accordingly.
It would be nice to have this point clarified in the manual.

PS:
Consider: if a userdata can move due to garbage collection, it would have
_major_ implications for C routines. There would need to be clear rules on
the lifetime of pointers returned from lua_newuserdata() and
lua_touserdata(), and which operations may invalidate them. Or, some
contract with the gc to say "leave this block alone".

Adrian


Re: userdata and the gc

David Given
Adrian Sietsma wrote:
[...]
> consider - if a userdata can move due to garbage collection, it would
> have _major_ implications for c routines. There would need to be clear
> rules on the lifetime of pointers returned from lua_newuserdata() and
> lua_touserdata(), and which operations may invalidate them. Or, some
> contract with the gc to say "leave this block alone".

I'd just assumed that the same rules as for strings applied --- i.e.,
that the pointer was valid only while it was on the Lua stack. As soon
as you remove it from the stack, the pointer becomes invalid.

--
+- David Given --McQ-+
|  [hidden email]    | "Those that repeat truisms, are also forced to
| ([hidden email]) | repeat them." --- Anonymous from Slashdot
+- www.cowlark.com --+

Re: userdata and the gc

Chris Marrin
In reply to this post by Adrian Sietsma
Adrian Sietsma wrote:

> ...
>> Chris :
>> I'm not sure if Roberto intended what you think you read. If and when
>> Lua gets a compacting garbage collector, a userdata's address would
>> move around just like any other gc object. Maybe that will never
>> happen and maybe they have some clever way to guarantee that userdatas
>> never move. But it seems like a better idea to not rely on this and
>> use your own data.
>
> I read A to mean that userdata never moves, and will code accordingly.
> It would be nice to have this point clarified in the manual.
>
> ps
> consider - if a userdata can move due to garbage collection, it would
> have _major_ implications for c routines. There would need to be clear
> rules on the lifetime of pointers returned from lua_newuserdata() and
> lua_touserdata(), and which operations may invalidate them. Or, some
> contract with the gc to say "leave this block alone".

I think it would be a mistake not to write code protecting your C
functions from disappearing userdata, which by implication can protect
you from moving userdata as well. I have had places in my code where I
maintained pointers to userdata and it inevitably caused problems. Now I
let Lua maintain the pointers and I simply have references into the Lua
VM. I either use the registry or similar tables. The __gc metamethod can
keep you fully informed about the lifetime of each userdata. And Lua has
a powerful feature for maintaining weak pointers to avoid frozen userdata
pointers. When attempting to get a reference to a weak pointer you will
get back a nil so you can cleanly handle this case. All these techniques
keep your C state consistent with your Lua state.
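
As a plain-C analogy to looking a value up through a weak reference and getting nil back, the sketch below has C code hold small handles instead of raw pointers. This is a generation-counted handle table, a different mechanism from Lua's weak tables, and every name in it (Slot, Handle, handle_create, handle_invalidate, handle_get) is hypothetical. The assumption is that the owner's finalizer calls handle_invalidate, which is the job the thread assigns to __gc.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8

typedef struct { void *obj; unsigned gen; } Slot;        /* table entry */
typedef struct { unsigned index; unsigned gen; } Handle; /* what C code keeps */

static Slot slots[MAX_SLOTS];

/* Registers obj in the first empty slot; returns 0 on success, -1 if full. */
static int handle_create(void *obj, Handle *out) {
    for (unsigned i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].obj == NULL) {
            slots[i].obj = obj;
            out->index = i;
            out->gen = slots[i].gen;
            return 0;
        }
    }
    return -1;
}

/* Called when the object is finalized: empties the slot and bumps its
   generation so that old handles to the slot stop matching. */
static void handle_invalidate(Handle h) {
    if (h.index < MAX_SLOTS && slots[h.index].gen == h.gen
            && slots[h.index].obj != NULL) {
        slots[h.index].obj = NULL;
        slots[h.index].gen++;
    }
}

/* Returns the object, or NULL if it has been finalized (the "nil" case). */
static void *handle_get(Handle h) {
    if (h.index >= MAX_SLOTS || slots[h.index].gen != h.gen) return NULL;
    return slots[h.index].obj;
}
```

A caller checks the result of handle_get before every use, so a collected object shows up as a clean NULL instead of a dangling pointer, even if the slot has since been reused.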

I think one of Lua's most powerful and useful features is its simple yet
flexible C interface.


Re: userdata and the gc

Adrian Sietsma
Chris Marrin wrote:
> I think it would be a mistake not to write code protecting your C
> functions from disappearing userdata, which by implication can protect
> you from moving userdata as well. I have had places in my code where I
> maintained pointers to userdata and it inevitably caused problems.

I agree with pointer paranoia, but it would be nice to alloc the memory via
Lua, so usage is reported correctly. If I understand correctly, a full
userdata is simply a block of memory that is returned by the Lua allocator
(malloc/realloc by default). It may well start below the pointer we get, but
the pointer should stay valid until the ud is gc'ed.
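
The remark that the block "may well start below the pointer we get" describes a common layout: one allocation holding a header followed by the payload, with the caller receiving the payload address. Here is a minimal sketch of that layout in plain C. The Header fields and the function names are assumptions for illustration; Lua's real userdata header contains different fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header. The union pads it so that the payload that follows
   is aligned for the common scalar types. */
typedef union {
    struct { size_t len; } info;
    double d;
    void *p;
    long l;
} Header;

/* One malloc block: [Header][payload bytes]. Returns the payload address. */
static void *new_block(size_t len) {
    Header *h = malloc(sizeof(Header) + len);
    if (h == NULL) return NULL;
    h->info.len = len;
    return h + 1;
}

/* Steps back from the payload pointer to read the header. */
static size_t block_len(const void *payload) {
    return ((const Header *)payload - 1)->info.len;
}

/* Frees the whole block, starting from its real base address. */
static void free_block(void *payload) {
    if (payload != NULL) free((Header *)payload - 1);
}
```

The payload pointer stays usable for exactly as long as the underlying block is not freed, which matches the expectation stated above for a userdata that has not been collected.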

> Now I
> let Lua maintain the pointers and I simply have references into the Lua
> VM. I either use the registry or similar tables. The __gc metamethod can
> keep you fully informed about the lifetime of each userdata. And Lua has
> a powerful feature for maintaining weak pointers to avoid frozen userdata
> pointers. When attempting to get a reference to a weak pointer you will
> get back a nil so you can cleanly handle this case. All these techniques
> keep your C state consistent with your Lua state.

That would actually be worse than lightuserdata in my current scenario. I
currently remove blocks from my Lua "free list", and assign them to a field
in another userdata object. If they are full userdata, I need to keep a
reference to them in Lua (a "busy list"), so they are not collected. This is
worth doing if I can use a full userdata as the buffer directly, but if I'm
allocing and freeing them anyway, boxing the pointer doesn't gain me much.
Although I may be able to replace the free list with a weak table....

I'll probably re-think it later, when I've got the rest of this lib working.
It is the sort of thing that can easily be reworked later.
>
> I think one of Lua's most powerful and useful features is its simple yet
> flexible C interface.
>
Hear Hear.
We have the luxury of choosing between light and full userdata, boxed
pointers, etc. I'm drifting toward having an embedded Lua vm in
_everything_, for general string handling and hashmap/array storage.

Adrian