Some ideas

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Some ideas

Sokolov Yura-2
Excuse my English.
Some ideas about improving Lua.

0. Why Table.flags type is lu_byte(unsigned char)? We have 17
metamethods. Should it be unsigned short int instead? Then we cache 16
metamethods. On x86 there is no drawback having it unsigned short int
instead of unsigned char(due to the alignment).

1. Let user data to store TValue and then mark it during GC.
   Such userdata should have the __mark "method" in metatable to react on
  mark_event(as analogy of gc_event)

   Marking is made by calling markvalue macros from lgc.c for every
stored TValue. MarkFunction should have signature int ...(global_State*
g, GCObject* g).

   Using of TValue should be restricted by:
         a.  copying from/to lua stack (and clearing - setting to a nil)
        b.  type recognition (by macroses ttis*(o) from lobject.h)
        c.  accessing to scalar values (nil, boolean, number, but not
                 strings) (don't allow to change type, i think)
          (macroses nvalue(o), bvalue(o), setnvalue(obj, x),
            setbvalue(obj,x), setnilvalue(obj))  (i think it is optional)
        b.  dereferencing userdata and lightuserdata(macrosses rawuvalue
            and pvalue) (it is main usecase)

   All other operations possible only using lua stack.

   Benefits:
     -  http://lua-users.org/lists/lua-l/2006-05/msg00075.html - this
        could be implemented without using of environment table - i think
        there are should be great space gain. (and gc speed)
     -  another variants of speed+space improving
     -  implementing alternative data structures (reusing existent)
        (linked lists, queue(for threading),
          hashtables ala python/ruby, etc)

   Drawbacks:
     -  opening implementation details to programmer (lua is great in
        hiding details, and i really respect it) => more error prone
        using of Lua (but i think it would not widely used, so those
        who need it will be more careful)
     -  + <10 lines of code (I think, no more)
     -  big refactoring of headers
        (new user accessable header "lua_usergc.h"?)
     -  introducing new cfunction type:
          typedef int(*lua_MarkFunction)(global_State *g, GCObject *o);
        and possibly new lua type (markfunction) - but we could store
        reference just in lightuserdata.

    It is a little change in codebase, but great improvement in
   extendability of lua.

2. Let multindex syntax t[a,b,c] and improving index and newindex events
to this version:


function gettable_event (table, key, ...)
   local h
   if type(table) == "table" then
     if select(#, ...)==0 then
      local v = rawget(table, key)
      if v ~= nil then return v end
     end
     h = metatable(table).__index
     if h == nil then return nil end
   else
     h = metatable(table).__index
     if h == nil then
       error("...");
     end
   end
   if type(h) == "function" then
     return h(table, key, ...)      -- call the handler
   else return h[key, ...]          -- or repeat operation on it
   end
end

function settable_event (table, key, ...)
   local h
   if type(table) == "table" then
     if select(#, ...)==1 then
       local value = v
       local v = rawget(table, key)
       if v ~= nil then rawset(table, key, value); return end
       h = metatable(table).__newindex
       if h == nil then rawset(table, key, value); return end
     else
       h = metatable(table).__newindex
       if h == nil then error("..."); return end
     end
   else
     h = metatable(table).__newindex
     if h == nil then
       error("...");
     end
   end
   if type(h) == "function" then
     return h(table, key, ....)    -- call the handler
   else
        local t = {...}; value = t[#t]; table.remove(t);
             -- in c this should be noop
        h[key, unpack(t)] = value       -- or repeat operation on it
   end
end

Benefits:
    - NumLua, i think
    - access to ranges
        (as in Ruby:
        irb:001:0> a = [1,2,3,4,5]
         => [1,2,3,4,5]
         irb:002:0> a[2,4]
         => [3,4,5])
Drawbacks:
    - a bit slower whole execution
    - changes to the compiller
    - big refactor of gettable and settable.

If you think this ideas worth, i will try to implement it.
#1 would not be too hard, but #2 is difficult for me, but i could try.

This changes should not break existing code (for 5.1), i think.


Reply | Threaded
Open this post in threaded view
|

Re: Some ideas

Roberto Ierusalimschy
> 0. Why Table.flags type is lu_byte(unsigned char)? We have 17
> metamethods. Should it be unsigned short int instead? Then we cache 16
> metamethods.

"Fast access" is only faster when there is no metamethod. In the current
implementation, the metamethods without fast access are those that raise
an error whenever they are absent. So, that change would only optimize
the error handling (and would slow down a very little bit the non-error
behavior).


> On x86 there is no drawback having it unsigned short int instead of
> unsigned char(due to the alignment).

In Lua 5.1, the 'flags' field comes between two lu_byte and one lu_byte:

  typedef struct Table {
  CommonHeader;     /* --> GCObject *next; lu_byte tt; lu_byte marked */
  lu_byte flags;  /* 1<<p means tagmethod(p) is not present */
  lu_byte lsizenode;  /* log2 of size of `node' array */

All four fields fit in a word. Changing it to short would force 'lsizenode'
into another word.

-- Roberto
Reply | Threaded
Open this post in threaded view
|

Re: Some ideas

Sokolov Yura-2
In reply to this post by Sokolov Yura-2
| All four fields fit in a word. Changing it to short would force
'lsizenode'
| into another word.

I just test one script(which creates hundred of thousands of tables)
with type(flags) = lu_bute, lu_int32 and lu_usint(unsigned short int)
(and also made printf("%d",sizeof(Table)):
   lu_byte  - max 169MB  sizeof(Table)=36
   lu_int32 - max 195MB  sizeof(Table)=40
   lu_usint - max 169MB  sizeof(Table)=36



Reply | Threaded
Open this post in threaded view
|

Re: Some ideas

Mike Pall-5-2
Hi,

Sokolov Yura wrote:
> I just test one script(which creates hundred of thousands of tables)
> with type(flags) = lu_bute, lu_int32 and lu_usint(unsigned short int)
> (and also made printf("%d",sizeof(Table)):
>   lu_byte  - max 169MB  sizeof(Table)=36
>   lu_int32 - max 195MB  sizeof(Table)=40
>   lu_usint - max 169MB  sizeof(Table)=36

No, this doesn't look correct. I get:

  lu_byte  -> sizeof(Table) = 32
  lu_usint -> sizeof(Table) = 36

(on x86 with GCC)

Maybe you have modified the source elsewhere (CommonHeader?).

Bye,
     Mike
Reply | Threaded
Open this post in threaded view
|

Re: Some ideas

Sokolov Yura-2
In reply to this post by Sokolov Yura-2
Mike Pall wrote:
> No, this doesn't look correct. I get:
>
>   lu_byte  -> sizeof(Table) = 32
>   lu_usint -> sizeof(Table) = 36
>
> (on x86 with GCC)
>
> Maybe you have modified the source elsewhere (CommonHeader?).

No, I didn't. But I use gcc4.0.2 and --with-cpu=pentium4 - that could be an answer ;-(
Sorry.

What about #1 in original message. Yes, my thoughts could be (and are) dilettantish,
 so I would like to learn about your opinion.

Reply | Threaded
Open this post in threaded view
|

Re: Some ideas

Mike Pall-5-2
Hi,

Sokolov Yura wrote:
> No, I didn't. But I use gcc4.0.2 and --with-cpu=pentium4 - that could be an
> answer ;-(

Seems to be a buggy version. Changing the alignment rules for
structure elements would completely break the x86 ABI.

I've checked GCC 3.3.x and GCC 4.1. No matter which flags I try,
I always get the correct result of sizeof(Table) = 32.

> What about #1 in original message.

Umm, I guess the Lua authors would be more qualified to answer.

Anyway ... have a look at Python (CPython to be exact). Their
extension API lets you freely store pointers to internal objects.
Sounds great? Well ...

Some consequences: they are pretty much tied to their reference
counting garbage collector forever. And they can never switch to
a tagged value concept either. So every tiny little integer
number object has to be allocated and reference counted (yes,
they preallocate -5..100, but this helps only a bit). And don't
let me get started about the required heap allocation of an
argument tuple for every C function call ...

And worst of all: the reference counting logic is spread out over
large parts of the source and in every extension. I recently had
the dubious pleasure to find and fix tons of memory leaks (mostly
refcount bugs for uncommon cases) in a Python extension. No fun.

>   Drawbacks:
>       -  opening implementation details to programmer

That about sums it up. I doubt the Lua authors would go for it.

The environment table for userdata objects can be used as a
simple linear array of TValues. You can easily build any data
structure on top of it (using ints as indexes into the array).
IMHO rawgeti and rawseti are pretty fast.

The best way to decide this issue is of course to create a
benchmark. Use a userdata object implementing a complex data
structure. Compare the implementations of it using either the
standard API or a direct access API (fake it by just using the
internal headers directly).

You could use my queue object (search the list archive) or maybe
a simple binary tree. I think the results should be interesting.
Maybe some other optimizations come out of it (e.g. index2adr is
too large to be inlined). Keep us posted if you find anything!

> Yes, my thoughts could be (and are) dilettantish, so I would
> like to learn about your opinion.

Nobody said so. Your thoughts are welcome. Do not interpret
silence as ignorance. The Lua authors are just very hesitant to
answer to this kind of postings.

My take on it: if you suggest several things and you don't get a
reply for one of them, this doesn't mean anything. It usually
means a more detailed response is needed, weighing in on the
advantages and disadvantages. I guess the Lua authors just don't
have the time to do this.

Consider it a brazilian oracle -- some of your ideas may end up
in the next version. Or maybe not. You'll never know until the
next release. Just keep on throwing ideas into the oracle. ;-)

[Yes, this 'no ack, no nack' policy is ummm ... unusual, but you
gotta learn to live with it.]

Bye,
     Mike