A proposal for faster userdata type checking

classic Classic list List threaded Threaded
41 messages Options
123
Reply | Threaded
Open this post in threaded view
|

A proposal for faster userdata type checking

Chris-41
The current luaL_newmetatable/luaL_checkudata auxiliary functions are
fine but luaL_checkudata adds a significant amount of overhead when
making heavy use of calls that need to check userdata types (which is
very important when creating "safe" APIs).  I have attached a small
extension patch that adds lua_tomuserdata() and luaL_checkmudata() to
the core which gives the application quicker access to the metatable
pointer.   From the developer perspective nothing changes unless they
want to use the new mechanism.

Example usage (simplified):

void *myusertype_meta;

int create_myusertype(lua_State *L) {
  ...
  luaL_newmetatable(L, "myusertype");
  /* cache the metatable pointer locally */
  myusertype_meta = lua_topointer(L, -1);
  ...
}

int some_lua_function(lua_State *L) {
  myusertype *ud = luaL_checkmutype(L, 1, "myusertype", myusertype_meta);
  ...
}

I see about a 30+% increase in general application speed plus the
normal API still works as usual of course.  What do you think?

CR
diff -ru lua-5.1.3.orig/src/lapi.c lua-5.1.3/src/lapi.c
--- lua-5.1.3.orig/src/lapi.c	2008-01-03 10:20:39.000000000 -0500
+++ lua-5.1.3/src/lapi.c	2008-02-28 10:44:13.429884590 -0500
@@ -393,6 +393,18 @@
 }
 
 
+LUA_API void *lua_tomuserdata (lua_State *L, int idx, const void *meta) {
+  StkId o = index2adr(L, idx);
+  switch (ttype(o)) {
+    case LUA_TUSERDATA:
+      if (uvalue(o)->metatable == meta) return (rawuvalue(o) + 1);
+      return NULL;
+    case LUA_TLIGHTUSERDATA: return pvalue(o);
+    default: return NULL;
+  }
+}
+
+
 LUA_API lua_State *lua_tothread (lua_State *L, int idx) {
   StkId o = index2adr(L, idx);
   return (!ttisthread(o)) ? NULL : thvalue(o);
diff -ru lua-5.1.3.orig/src/lauxlib.c lua-5.1.3/src/lauxlib.c
--- lua-5.1.3.orig/src/lauxlib.c	2008-01-21 08:20:51.000000000 -0500
+++ lua-5.1.3/src/lauxlib.c	2008-02-28 10:44:30.923498327 -0500
@@ -137,6 +137,14 @@
 }
 
 
+LUALIB_API void *luaL_checkmudata (lua_State *L, int ud, const char *tname, const void* meta) {
+  void *p = lua_tomuserdata(L, ud, meta);
+  if (p != NULL) return p;
+  luaL_typerror(L, ud, tname);  /* else error */
+  return NULL;  /* to avoid warnings */
+}
+
+
 LUALIB_API void luaL_checkstack (lua_State *L, int space, const char *mes) {
   if (!lua_checkstack(L, space))
     luaL_error(L, "stack overflow (%s)", mes);
Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Taj Khattra
On Thu, Feb 28, 2008 at 7:57 AM, Chris <[hidden email]> wrote:
>  fine but luaL_checkudata adds a significant amount of overhead when
>  making heavy use of calls that need to check userdata types (which is
>  very important when creating "safe" APIs).

did you compare the performance of storing the metatable in the C function
environment instead of the registry?

i seem to recall mike pall posting some performance numbers comparing
the two approaches, but google is failing me at the moment.

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Alex Davies
Does anyone else think it would be nice to simply have a byte tag stored with each userdata? Couldn't be quicker to check, and it'd have the added benefit of allowing custom metatables per userdata object. (Only useful in a few cases, but still).

- Alex

----- Original Message ----- From: "Taj Khattra"

On Thu, Feb 28, 2008 at 7:57 AM, Chris <[hidden email]> wrote:
 fine but luaL_checkudata adds a significant amount of overhead when
 making heavy use of calls that need to check userdata types (which is
 very important when creating "safe" APIs).

did you compare the performance of storing the metatable in the C function
environment instead of the registry?

i seem to recall mike pall posting some performance numbers comparing
the two approaches, but google is failing me at the moment.

Reply | Threaded
Open this post in threaded view
|

RE: A proposal for faster userdata type checking

Jerome Vuarand-2
Alex Davies wrote:
> Does anyone else think it would be nice to simply have a byte tag
> stored with each userdata? Couldn't be quicker to check, and it'd
> have the added benefit of allowing custom metatables per userdata
> object. (Only useful in a few cases, but still).   

You can do it, add a byte at the beginning or end of the userdata. I
don't see why it would need to be outside of the userdata.


Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Diego Nehab-3
In reply to this post by Taj Khattra
Hi,

i seem to recall mike pall posting some performance numbers comparing
the two approaches, but google is failing me at the moment.

I remember that post too. And in the process of writing
LuaSocket 3.0, I revamped the auxiliar.c module. I should be
much faster now. The API is a little different, though, and
definitely not the same as luaL_checkudata. It would be nice
to have a benchmark of these alternatives. Anybody
interested in running these tests? I am of course willing to
help and provide my current implementation.

Regards,
Diego

Reply | Threaded
Open this post in threaded view
|

LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Petite Abeille

On Feb 29, 2008, at 12:25 AM, Diego Nehab wrote:

And in the process of writing LuaSocket 3.0

What's cooking for LuaSocket 3.0?

Cheers,

PA.


Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Diego Nehab-3
Hi,

And in the process of writing LuaSocket 3.0

What's cooking for LuaSocket 3.0?

Completely new, completely assynchronous, level triggered
API.

We are shooting for an API that behaves like the IOCP API
from Windows, but simpler to use.  Using kqueue on BSD, IOCP
on Windows, epoll on Linux.

It will be easy to implement higher level APIs (like the
dispatch.lua idea) on top of it. There will even be a
synchronous API on top of it (no select(), though).
Hopefully will scale very well w.r.t. number of sockets.

Kind regards,
Diego

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Wesley Smith
In reply to this post by Diego Nehab-3
Hi Diego,
Would you mind describing a bit your approach?  I would be very
interested to hear some about it.
thanks,
wes

On Thu, Feb 28, 2008 at 3:25 PM, Diego Nehab <[hidden email]> wrote:
> Hi,
>
>
>  > i seem to recall mike pall posting some performance numbers comparing
>  > the two approaches, but google is failing me at the moment.
>
>  I remember that post too. And in the process of writing
>  LuaSocket 3.0, I revamped the auxiliar.c module. I should be
>  much faster now. The API is a little different, though, and
>  definitely not the same as luaL_checkudata. It would be nice
>  to have a benchmark of these alternatives. Anybody
>  interested in running these tests? I am of course willing to
>  help and provide my current implementation.
>
>  Regards,
>  Diego
>

Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Petite Abeille-2-2
In reply to this post by Diego Nehab-3

On Feb 29, 2008, at 12:35 AM, Diego Nehab wrote:

And in the process of writing LuaSocket 3.0

What's cooking for LuaSocket 3.0?

Completely new, completely assynchronous, level triggered
API.

Sounds promissing :)

We are shooting for an API that behaves like the IOCP API
from Windows, but simpler to use.  Using kqueue on BSD, IOCP
on Windows, epoll on Linux.

It will be easy to implement higher level APIs (like the
dispatch.lua idea) on top of it. There will even be a
synchronous API on top of it (no select(), though).

Will the new implementation be API compatible with the old one as defined by the LuaSocket reference manual?

http://www.tecgraf.puc-rio.br/~diego/professional/luasocket/reference.html

Hopefully will scale very well w.r.t. number of sockets.

Cheers,

PA.





Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Diego Nehab-3
Hi,

We are shooting for an API that behaves like the IOCP API
from Windows, but simpler to use.  Using kqueue on BSD,
IOCP on Windows, epoll on Linux.

It will be easy to implement higher level APIs (like the
dispatch.lua idea) on top of it. There will even be a
synchronous API on top of it (no select(), though).

Will the new implementation be API compatible with the old
one as defined by the LuaSocket reference manual?

http://www.tecgraf.puc-rio.br/~diego/professional/luasocket/reference.html

No. If you really need compatibility, there is no reason to
use 3.0. We will continue to support 2.0.x for bugs etc.

The assynchronous APIs present a different programming
paradigm, so it doesn't make sense to be compatible there.

On the other hand, 3.0 will have a dispatch.lua module that
will be very, very similar to 2.0.x. We will also have a
compatibility layer that will allow for applications using
an API that is very similar to 2.0.x.

At the moment, it is hard to evaluate how close we will get
to the exact old API, though. It should be possible to implement
select() on the new API, but I bet it will be slower than
the current library.

Kind regards,
Diego

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Mark Hamburg-4
In reply to this post by Chris-41
The problem with caching pointers like this is that it doesn't work in the
multiple universe case.

Mark



Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Mark Hamburg-4
In reply to this post by Jerome Vuarand-2
on 2/28/08 2:40 PM, Jerome Vuarand at [hidden email] wrote:

> Alex Davies wrote:
>> Does anyone else think it would be nice to simply have a byte tag
>> stored with each userdata? Couldn't be quicker to check, and it'd
>> have the added benefit of allowing custom metatables per userdata
>> object. (Only useful in a few cases, but still).
> 
> You can do it, add a byte at the beginning or end of the userdata. I
> don't see why it would need to be outside of the userdata.

Storing it in the userdata itself only works if you know that everyone uses
that byte properly which is a slightly harder problem than doing the ID
space management for the fast tag checking.

I added a byte of fast tag checking to Lightroom, but I don't know whether
it's actually worth it in practice. I need to build an alternative based on
checking something in the metatable or based on a fully-weak-table of
proxies and see how the timings work out.

Mark



Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Petite Abeille
In reply to this post by Diego Nehab-3

On Feb 29, 2008, at 1:08 AM, Diego Nehab wrote:

No. If you really need compatibility, there is no reason to
use 3.0. We will continue to support 2.0.x for bugs etc.

Ah... ok...

The assynchronous APIs present a different programming
paradigm, so it doesn't make sense to be compatible there.

Considering that Lua's threading model is cooperative and that most library operations are blocking, I don't see much benefit from an asynchronous socket API in practice. Or at least in the narrow confines of my current endeavors :)

On the other hand, 3.0 will have a dispatch.lua module that
will be very, very similar to 2.0.x. We will also have a
compatibility layer that will allow for applications using
an API that is very similar to 2.0.x.

At the moment, it is hard to evaluate how close we will get
to the exact old API, though. It should be possible to implement
select() on the new API, but I bet it will be slower than
the current library.

I see... will stick with 2.x for the moment then.

THanks for the insights.

Cheers,

PA.

Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Diego Nehab-3
Hi,

The assynchronous APIs present a different programming
paradigm, so it doesn't make sense to be compatible
there.

Considering that Lua's threading model is cooperative and
that most library operations are blocking, I don't see
much benefit from an asynchronous socket API in practice.
Or at least in the narrow confines of my current endeavors
:)

The idea is to choke all notifications through a queue...

Regards,
Diego

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Taj Khattra
In reply to this post by Wesley Smith
On Thu, Feb 28, 2008 at 3:36 PM, Wesley Smith <[hidden email]> wrote:
> Hi Diego,
>  Would you mind describing a bit your approach?  I would be very
>  interested to hear some about it.

not sure about diego's approach, but here is an example from
mike pall on storing the metatable in the C function environment:
http://lua-users.org/lists/lua-l/2005-04/msg00213.html

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Diego Nehab-3
In reply to this post by Wesley Smith
Hi,

Sure. I don't have the code on me right now. The main
ideas is this:

    Avoid the strings as class names. Use lightuserdata
    instead. The key contains the addresses of the strings, not
    the values of the strings. The strings are extern const
    strings defined in whatever module created the class, and
    exported in its .h file.

So to set a userdata to a class, you only need one rawget,
using a lightuserdata as key, from the class table (which
can be the registry or a private table in an upvalue), plus
one setmetatable.

To check a userdata you need one touserdata, that one rawget, one getmetatable, and one isequal.

The savings come from the fact you don't traverse strings,
ever, and you don't create strings, ever.

Kind regards,
Diego

Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

Mark Hamburg-4
on 2/28/08 4:43 PM, Diego Nehab at [hidden email] wrote:

> Hi,
> 
> Sure. I don't have the code on me right now. The main
> ideas is this:
> 
>      Avoid the strings as class names. Use lightuserdata
>      instead. The key contains the addresses of the strings, not
>      the values of the strings. The strings are extern const
>      strings defined in whatever module created the class, and
>      exported in its .h file.
> 
> So to set a userdata to a class, you only need one rawget,
> using a lightuserdata as key, from the class table (which
> can be the registry or a private table in an upvalue), plus
> one setmetatable.
> 
> To check a userdata you need one touserdata, that one rawget,
> one getmetatable, and one isequal.
> 
> The savings come from the fact you don't traverse strings,
> ever, and you don't create strings, ever.

Which leads to a Lua API request for 5.2:

    lua_rawgetud (lua_State* L, const void* ud);
    lua_rawsetud (lua_State* L, const void* ud);

Get and set a table indexed via a userdata.

Mark



Reply | Threaded
Open this post in threaded view
|

Re: LuaSocket 3.0? (was Re: A proposal for faster userdata type checking)

Doug Currie
In reply to this post by Diego Nehab-3
On Thursday, February 28, 2008 Diego Nehab wrote: 

> Hi,

>>> The assynchronous APIs present a different programming
>>> paradigm, so it doesn't make sense to be compatible
>>> there.
>>
>> Considering that Lua's threading model is cooperative and
>> that most library operations are blocking, I don't see
>> much benefit from an asynchronous socket API in practice.
>> Or at least in the narrow confines of my current endeavors
>> :)

> The idea is to choke all notifications through a queue...

Which could be used with a scheduler written in Lua so that threads
waiting on socket I/O (even using APIs like accept) can relinquish the
processor to ready threads.

e

-- 
Doug Currie
Londonderry, NH, USA



Reply | Threaded
Open this post in threaded view
|

Re: A proposal for faster userdata type checking

John Labenski
In reply to this post by Diego Nehab-3
On Thu, Feb 28, 2008 at 7:43 PM, Diego Nehab <[hidden email]> wrote:
>
>  Sure. I don't have the code on me right now. The main
>  ideas is this:
>
>      Avoid the strings as class names. Use lightuserdata
>      instead. The key contains the addresses of the strings, not
>      the values of the strings. The strings are extern const
>      strings defined in whatever module created the class, and
>      exported in its .h file.
>
>  So to set a userdata to a class, you only need one rawget,
>  using a lightuserdata as key, from the class table (which
>  can be the registry or a private table in an upvalue), plus
>  one setmetatable.
>
>  To check a userdata you need one touserdata, that one rawget,
>  one getmetatable, and one isequal.
>
>  The savings come from the fact you don't traverse strings,
>  ever, and you don't create strings, ever.
>

I agree with the above.

I don't have the numbers anymore, but wxLua has switched to using
lightuserdata table keys as well. For example, we push lightuserdata
of the address of const char* strings as keys to our tables in the
LUA_REGISTRYINDEX.

However, we do use integer values for our userdata types and store a
table in the LUA_REGISTRYINDEX that maps the integers keys to a table
of other info about the userdata. This allows us to simply extend the
Lua types (integers) with our wxLua types so we only have to pass a
single integer between functions and not the integer Lua type plus the
void* lightuserdata corresponding to the wxLua type since we don't
want every function to have keep looking up the wxLua type.

However, the simplicity and clarity of using a string as the
userdata's type is very appealing and would follow what Lua itself
does ("FILE*" for example), but I opted for speed.

I found that the Lua functions to hash string keys took much of a
simple program's time as seen with cachegrind.

local p = wx.wxPoint(0,0)
for i = 1, 1E4
    p:SetX(5) -- forces a lookup of the wxLua userdata type of 'p'
end


wxLua/modules/wxlua/include/wxlstate.h
// wxLua userdata metatable structure:
// {
//    lightuserdata(&wxlua_metatable_type_key) = wxLua type number in
wxlua_lreg_types_key table
//    lightuserdata(&wxlua_metatable_wxluabindclass_key) =
lightuserdata(&wxLuaBindClass)
//    __gc       = function(wxlua_wxLuaBindClass__gc)
//    __index    = function(wxlua_wxLuaBindClass__index)
//    __newindex = function(wxlua_wxLuaBindClass__newindex)
//    __tostring = function(wxlua_wxLuaBindClass__tostring)
// }


Regards,
    John

Reply | Threaded
Open this post in threaded view
|

RE: A proposal for faster userdata type checking

Richter, Jörg
>>
>>  So to set a userdata to a class, you only need one rawget,
>>  using a lightuserdata as key, from the class table (which
>>  can be the registry or a private table in an upvalue), plus
>>  one setmetatable.
>>
>>  To check a userdata you need one touserdata, that one rawget,
>>  one getmetatable, and one isequal.
>>
>>  The savings come from the fact you don't traverse strings,
>>  ever, and you don't create strings, ever.
>>
>
>I agree with the above.
>
>I don't have the numbers anymore, but wxLua has switched to using
>lightuserdata table keys as well. For example, we push lightuserdata
>of the address of const char* strings as keys to our tables in the
>LUA_REGISTRYINDEX.

I do the same. But storing the key in LUA_REGISTRYINDEX seems like a potential problem to me.

Imagine two libraries storing the same constant string in the registry. It can happen (static linking?) that the two strings have the same address. This is not possible with the address of 'static char' or 'static char[]'


   Jörg


123