Metatable check helper API proposal

Metatable check helper API proposal

Sergey Zakharchenko
Hello,

This is a continuation of my luaL_checkudata/luaL_testudata
optimization ideas. How about a new API, lua_metatabletopointer (or
lua_getmetatablepointer?), acting as if lua_getmetatable followed by
lua_topointer, but without touching the Lua stack? This would enable
faster "are we really called with 'our' userdata?" checks and reduce
userdata metamethod call overhead (checks would trivially extend to,
say, a list of metatables known to a function).

There's a small implementation of it inside lua_getmetatable already:)

Best regards,

--
DoubleF

Re: Metatable check helper API proposal

Roberto Ierusalimschy
> This is a continuation of my luaL_checkudata/luaL_testudata
> optimization ideas. How about a new API, lua_metatabletopointer (or
> lua_getmetatablepointer?), acting as if lua_getmetatable followed by
> lua_topointer, but without touching the Lua stack? This would enable
> faster "are we really called with 'our' userdata?" checks and reduce
> userdata metamethod call overhead (checks would trivially extend to,
> say, a list of metatables known to a function).

I think the main issue here is what to compare the metatable with,
that is, how to specify which metatable we want. Your suggestion of
comparing the metatable against a given upvalue is interesting, but
it clashes with light C functions. (Most C functions using userdata
would need to be heavy.)

Another idea would be to write 'luaL_testudata' inside the kernel,
to avoid all stack manipulation and other API overheads. We would
need to test first to know how much we could gain with that.

-- Roberto

Re: Metatable check helper API proposal

Andrew Gierth
>>>>> "Roberto" == Roberto Ierusalimschy <[hidden email]> writes:

 Roberto> Another idea would be to write 'luaL_testudata' inside the
 Roberto> kernel, to avoid all stack manipulation and other API
 Roberto> overheads. We would need to test first to know how much we
 Roberto> could gain with that.

If you do this, consider also providing an equivalent API that uses
lightuserdata keys rather than strings. I personally never use strings
as registry identifiers because of the risk of name collisions.

--
Andrew.

Re: Metatable check helper API proposal

Luiz Henrique de Figueiredo
In reply to this post by Sergey Zakharchenko

Re: Metatable check helper API proposal

Sergey Zakharchenko
In reply to this post by Roberto Ierusalimschy
Roberto,

Roberto Ierusalimschy <[hidden email]>:

> I think the main issue here is what to compare the metatable with,
> that is, how to specify which metatable we want. Your suggestion of
> comparing the metatable against a given upvalue is interesting, but
> it clashes with light C functions. (Most C functions using userdata
> would need to be heavy.)

Right; that's why I abandoned that in favor of C pointer comparisons.
Library init code could create a metatable, ref it so that it doesn't
go away, call lua_topointer on it and store the pointer in a C
variable, then use it for comparison inside the C metamethod.

> We would
> need to test first to know how much we could gain with that.

That depends on how often the metamethod is called and how much other
work it does. In case of tuples, the gain seems significant; granted,
that's not the common use case, but multiple uservalues have only been
introduced recently.

A realistic benchmark could be the file stream in liolib. For an
example of a function that essentially does nothing except for the
metatable check, see io.type().

In general, let me know what measurements exactly you'd like to see:)

Best regards,

--
DoubleF

Re: Metatable check helper API proposal

Gé Weijers
On Wed, Aug 19, 2020 at 7:20 AM Sergey Zakharchenko
<[hidden email]> wrote:
>
>
> Right; that's why I abandoned that in favor of C pointer comparisons.
> Library init code could create a metatable, ref it so that it doesn't
> go away, call lua_topointer on it and store the pointer in a C
> variable, then use it for comparison inside the C metamethod.

If this C variable is a static or global variable your code is no
longer reentrant, so you could not use the library with two Lua states
in your code. This may not be an issue for you, but the technique is
not universally applicable.


Re: Metatable check helper API proposal

Sergey Zakharchenko
Gé,

Gé Weijers <[hidden email]>:
> If this C variable is a static or global variable your code is no
> longer reentrant, so you could not use the library with two Lua states
> in your code.

Correct. If this is important, the metatable could be registered under
the address of a dummy global variable (lua_rawsetp), and its pointer
could be obtained by yet another API, say lua_rawgetpp, acting as
lua_rawgetp followed by lua_topointer but without touching the stack.
No writes to the global 'variable' are needed, so it can be shared
between states. This does start looking a bit like API creep though
(e.g. shall we also add lua_rawgetip?), but it restores reentrancy and
still avoids name clashes...

Best regards,

--
DoubleF

Re: Metatable check helper API proposal

云风 Cloud Wu
In reply to this post by Roberto Ierusalimschy


Roberto Ierusalimschy <[hidden email]> wrote on Wed, Aug 19, 2020 at 21:29:

> I think the main issue here is what to compare the metatable with,
> that is, how to specify which metatable we want. Your suggestion of

I have an idea.

Maybe we can use an indirect structure to reference the metatable rather than a direct pointer to it.

For example,

struct metatable_cache {
  struct Table *metatable;
  GCObject *metamethod[TM_N];
  struct metatable_cache *next;
};

When we set a metatable, we can copy all of its metamethods into this cache structure.

When we need to access the metamethods, we can first check the flags of the metatable to decide whether the cache needs to be synced again.

Every table used as a metatable should have a unique cache object. We can implement this like the short string table.

We could then also put the name, or anything else, into this structure and check against it.
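A toy model of this cache idea, independent of the Lua sources. It assumes a simple "dirty" flag in place of the real metatable flags, a linear list in place of a hash like the short string table, and plain pointers in place of GCObject; toy_table, toy_set, cache_sync, cache_for and cache_get are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { TM_INDEX, TM_NEWINDEX, TM_CALL, TM_N }; /* toy subset of events */

/* Toy stand-in for a table used as a metatable; not Lua's struct Table. */
struct toy_table {
  const void *slot[TM_N]; /* metamethod values, NULL when absent */
  int dirty;              /* assumed flag: set on every write */
};

struct metatable_cache {
  struct toy_table *metatable;
  const void *metamethod[TM_N];
  struct metatable_cache *next;
};

static struct metatable_cache *cache_list = NULL;

/* Write a metamethod slot and mark the table as changed. */
static void toy_set(struct toy_table *t, int ev, const void *v) {
  t->slot[ev] = v;
  t->dirty = 1;
}

/* Copy every metamethod into the cache and clear the flag. */
static void cache_sync(struct metatable_cache *c) {
  int i;
  for (i = 0; i < TM_N; i++)
    c->metamethod[i] = c->metatable->slot[i];
  c->metatable->dirty = 0;
}

/* One cache object per metatable; a real version would hash, not scan. */
static struct metatable_cache *cache_for(struct toy_table *t) {
  struct metatable_cache *c;
  for (c = cache_list; c != NULL; c = c->next)
    if (c->metatable == t) return c;
  c = malloc(sizeof *c);
  if (c == NULL) return NULL;
  c->metatable = t;
  c->next = cache_list;
  cache_list = c;
  cache_sync(c);
  return c;
}

/* Read a metamethod through the cache, resyncing only when needed. */
static const void *cache_get(struct metatable_cache *c, int ev) {
  assert(ev >= 0 && ev < TM_N);
  if (c->metatable->dirty) cache_sync(c);
  return c->metamethod[ev];
}
```

Since every metatable maps to exactly one cache object, comparing two cache pointers would be equivalent to comparing the metatables themselves, which is what makes the structure usable for a type check.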


--

Re: Metatable check helper API proposal

Dibyendu Majumdar
In reply to this post by Sergey Zakharchenko
On Wed, 19 Aug 2020 at 06:50, Sergey Zakharchenko
<[hidden email]> wrote:
>
> This is a continuation of my luaL_checkudata/luaL_testudata
> optimization ideas. How about a new API, lua_metatabletopointer (or
> lua_getmetatablepointer?), acting as if lua_getmetatable followed by
> lua_topointer, but without touching the Lua stack? This would enable
> faster "are we really called with 'our' userdata?" checks and reduce
> userdata metamethod call overhead (checks would trivially extend to,
> say, a list of metatables known to a function).
>

In the early days of Ravi I checked performance of matrix
multiplication with Lua tables versus C arrays with metamethods.
Even without metatable checking the C arrays were slower than Lua
tables - metamethods are costly to invoke!
Adding metatable checking made it even worse of course (I was using
the optimized version suggested by Luiz in the mailing list a long
while ago).

I think it is best to benchmark the perceived advantage first. You may
be surprised.

In Ravi today metatable based type assertion is done by the VM when
you have something like:

function tryme(a: MyType)
end

Having a named type therefore turned out to be important. And for
backward compatibility it is important to continue using the existing
registry of metatables.

Regards

Re: Metatable check helper API proposal

云风 Cloud Wu
Dibyendu Majumdar <[hidden email]> wrote on Thu, Aug 20, 2020 at 1:12 AM:

> In the early days of Ravi I checked performance of matrix
> multiplication with Lua tables versus C arrays with metamethods.
> Even without metatable checking the C arrays were slower than Lua
> tables - metamethods are costly to invoke!
> Adding metatable checking made it even worse of course (I was using
> the optimized version suggested by Luiz in the mailing list a long
> while ago).
>
> I think it is best to benchmark the perceived advantage first. You may
> be surprised.

I tried making a cache for metamethods today and ran a simple
benchmark.

```lua
local N = 100000000

local a = {}
local meta = {} ; meta.__index = meta

function meta:foo()
end

setmetatable(a, meta)

local t = os.clock()

for i = 1, N do
  local foo = a.foo
end

local t1 = os.clock()

for i = 1, N do
  local foo = a:foo()
end

local t2 = os.clock()

print(t1 - t, t2 - t1)
```

Intel i7-7700 @ 3.60GHz, Windows 10, mingw64 gcc 8.2.0

Time (s)                                      a.foo   a:foo()
lua 5.4 (https://github.com/lua/lua master)   1.524   2.887
After patch (see attachment)                  1.271   2.492

This patch is only a prototype for benchmarking.

Attachment: metacache.patch (3K)

Re: Metatable check helper API proposal

云风 Cloud Wu
云风 Cloud Wu <[hidden email]> wrote on Thu, Aug 20, 2020 at 10:57 AM:

> ```lua
> local N = 100000000
>
> local a = {}
> local meta = {} ; meta.__index = meta
>
> function meta:foo()
> end
>
> setmetatable(a, meta)
>
> local t = os.clock()
>
> for i = 1, N do
>   local foo = a.foo
> end
>
> local t1 = os.clock()
>
> for i = 1, N do
>   local foo = a:foo()
> end
>
> local t2 = os.clock()
>
> print(t1 - t, t2 - t1)
> ```
>
> Intel i7-7700 @ 3.60GHz Windows 10 mingw64 gcc 8.2.0
>
> lua 5.4 (https://github.com/lua/lua master) 1.524   2.887
> After patch (See attachment):                     1.271   2.492

I added an empty-loop test:

```lua
local N = 100000000

local a = {}
local meta = {} ; meta.__index = meta

function meta:foo()
end

setmetatable(a, meta)

local t = os.clock()

for i = 1, N do
  local foo = a.foo
end

local t1 = os.clock()

for i = 1, N do
  local foo = a:foo()
end

local t2 = os.clock()

for i = 1, N do
  local foo = a
end

local t3 = os.clock()

print(t1 - t, t2 - t1, t3 - t2)
```

Intel i7-7700 @ 3.60GHz, Windows 10, mingw64 gcc 8.2.0

Time (s)       a.foo   a:foo()   empty loop
lua 5.4        1.551   2.807     0.378
After patch    1.259   2.593     0.378

(1.551-0.378) / (1.259-0.378) = 1.331

After subtracting the empty-loop time, I think the a.foo lookup is about 33% faster.
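
Applying the same empty-loop correction to the method-call column gives a smaller ratio. This is a calculation from the figures above, not a result reported in the post, and it assumes the loop overhead is the same in all three loops:

$$
\frac{2.807 - 0.378}{2.593 - 0.378} = \frac{2.429}{2.215} \approx 1.097
$$

On these numbers the a:foo() loop would come out roughly 10% faster, which is plausible since the call itself accounts for much of that loop's time.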

--
http://blog.codingnow.com