LuaJIT2 and lua_dump()

Lawrie Nichols
I notice that in the current LuaJIT2 beta (pulled from git this
morning), the lua_dump() function is effectively a no-op. Is this
something that will be implemented in the future, or is there an
underlying issue that prevents the implementation of this function?

Thanks for any info,

Lawrie


Re: LuaJIT2 and lua_dump()

Mike Pall-13
Lawrie Nichols wrote:
> I notice that in the current LuaJIT2 beta (pulled from git this
> morning), the lua_dump() function is effectively a no-op. Is this
> something that will be implemented in the future, or is there an
> underlying issue that prevents the implementation of this function?

Bytecode loading/saving is not a priority right now:

- The bytecode format itself is still in flux during the betas.

- It won't be compatible with Lua's bytecode in any case.

- The parser is quite fast, so you wouldn't gain much.

- Hiding the source code is not an issue for open source
  developers. So far, no closed source developer has asked for
  this feature and was also willing to cover the development costs.

There are more pressing issues right now, so I'd need to hear some
really good arguments before starting to work on it.

--Mike

Re: LuaJIT2 and lua_dump()

Nicolas-10
On 02 Mar 2010 14:36:42 +0100
Mike Pall <[hidden email]> wrote:

> Bytecode loading/saving is not a priority right now:
>
> - The bytecode format itself is still in flux during the betas.
>
> - It won't be compatible with Lua's bytecode in any case.
>
> - The parser is quite fast, so you wouldn't gain much.
>
> - Hiding the source code is not an issue for open source
>   developers. So far, no closed source developer has asked for
>   this feature and was also willing to cover the development costs.
>
> There are more pressing issues right now, so I'd need to hear some
> really good arguments before starting to work on it.

Not that it makes it urgent at all, but there are some "legit" uses of it
outside closed-source software.
My game currently handles its savefiles as a serialization of the running
objects. Since some objects can have functions dynamically attached to
them, it needs string.dump to serialize them and later reload them.
I am sure there are other such cases where it could be useful.

But yes, anyway, it's better to get LuaJIT2 ready than to fiddle with such
details :)

Go luajit go!
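
A minimal sketch of the dump/reload round trip described above, assuming a Lua implementation where string.dump is actually implemented (stock Lua 5.1 or later; lua_dump() is reported as a no-op in the LuaJIT2 beta discussed in this thread). The names `greet` and `load_chunk` are made up for illustration. Note that string.dump does not save upvalues, so only functions that do not depend on captured locals survive the trip unchanged.

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; Lua 5.2+ uses load instead.
local load_chunk = loadstring or load

-- Illustrative function with no upvalues (string.dump does not save them).
local function greet(name)
  return "hello, " .. name
end

local bytes = string.dump(greet)            -- binary chunk as a Lua string
local restored = assert(load_chunk(bytes))  -- rebuild a function from the bytes

print(restored("world"))
```

The `bytes` string can be written to a save file and read back later, as long as the same Lua build (same bytecode format) does the reading.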

Re: LuaJIT2 and lua_dump()

Peter Cawley
On Tue, Mar 2, 2010 at 1:43 PM, Nicolas <[hidden email]> wrote:
> My game currently handles its savefiles as a serialization of the running
> objects. Since some objects can have functions dynamically attached to
> them, it needs string.dump to serialize them and later reload them.

The alternative approach to serialising a function is to use the debug
API to get the filename and line number where it was defined, then
load said file, pull out the actual code, and thus save the source
code rather than bytecode.

Re: LuaJIT2 and lua_dump()

Nicolas-10
On 02 Mar 2010 14:48:10 +0100
Peter Cawley <[hidden email]> wrote:

> On Tue, Mar 2, 2010 at 1:43 PM, Nicolas <[hidden email]> wrote:
> > My game currently handles its savefiles as a serialization of the running
> > objects. Since some objects can have functions dynamically attached to
> > them, it needs string.dump to serialize them and later reload them.
>
> The alternative approach to serialising a function is to use the debug
> API to get the filename and line number where it was defined, then
> load said file, pull out the actual code, and thus save the source
> code rather than bytecode.

But that does not tell me which range of lines to take.
It does not even work if the function was defined inline somewhere, like this:

{
        foo = function()
                ....
        end,
        pop = "pup",
}

There is no way to extract the correct code.
(Also, this relies on debug.* instead of the standard libs.)

Or did I misunderstand you?

PS: Sorry, this is a tad off topic for the thread :/

Re: LuaJIT2 and lua_dump()

Lawrie Nichols
In reply to this post by Mike Pall-13
On 02/03/10 13:36, Mike Pall wrote:

> Lawrie Nichols wrote:
>    
>> I notice that in the current LuaJIT2 beta (pulled from git this
>> morning), the lua_dump() function is effectively a no-op. Is this
>> something that will be implemented in the future, or is there an
>> underlying issue that prevents the implementation of this function?
>>      
> Bytecode loading/saving is not a priority right now:
>
> - The bytecode format itself is still in flux during the betas.
>
> - It won't be compatible with Lua's bytecode in any case.
>    
Thanks for the swift response. Basically, I'm using lua_dump() and
lua_load() to allow serialization of data between multiple Lua states
via memory buffers, so currently the bytecode format (and its
compatibility with standard Lua) is not really an issue for me. Maybe
further down the line it might be, if Lua states are running in separate
processes and I need to serialize via sockets.
> - The parser is quite fast, so you wouldn't gain much.
>
> - Hiding the source code is not an issue for open source
>    developers. So far, no closed source developer has asked for
>    this feature and was also willing to cover the development costs.
>    
In this instance, my need is purely to dump a Lua function in one state
and then load it in another state. This will be a mechanism for
temporary memory persistence, not obfuscation.
> There are more pressing issues right now, so I'd need to hear some
> really good arguments before starting to work on it.
>
> --Mike
>    
I did have a look around the source for LuaJIT2, thinking that maybe
this was something I might be able to contribute back to the project,
but to be honest it was way over my level of understanding :-(

I appreciate that loading/saving is not necessarily high on the priority
list at the moment, and there are only two (maybe somewhat dubious)
arguments I can make for its inclusion:

* Lua currently supports it (I know, a bit lame);
* there is currently no other API available for passing Lua functions
around between different Lua states

Lawrie


Re: LuaJIT2 and lua_dump()

Joshua Jensen
In reply to this post by Mike Pall-13
----- Original Message -----
From: Mike Pall
Date: 3/2/2010 6:36 AM
> Lawrie Nichols wrote:
>    
>> I notice that in the current LuaJIT2 beta (pulled from git this
>> morning), the lua_dump() function is effectively a no-op. Is this
>> something that will be implemented in the future, or is there an
>> underlying issue that prevents the implementation of this function?
>>      
> Bytecode loading/saving is not a priority right now:
>    
I don't use LuaJIT at this time, so this isn't immediately important for
me right now, but...
> - The parser is quite fast, so you wouldn't gain much.
>    
Loading a raw chunk of data and running some "fix-ups" will always be
far faster than parsing text data and compiling it into its appropriate
form.  Additionally, fragmentation of memory will be kept at a minimum
as parsers almost always need scratch space.  The Lua parser does, anyway.
> - Hiding the source code is not an issue for open source
>    developers. So far, no closed source developer has asked for
>    this feature and was also willing to cover the development costs.
>    
Among other things, I use Lua as a data format.  It is not uncommon to
have a 100 megabyte text .lua file.  I'm looking at one right now where
the text format is 134 megabytes, and the binary format is 26 megabytes.

It works for code, too.  I just tested running luac on a 15k .lua ASCII
script.  It ended up being 7k in binary form.

Josh

Re: LuaJIT2 and lua_dump()

Mike Pall-13
Joshua Jensen wrote:
> Additionally, fragmentation of memory will be kept at a minimum  
> as parsers almost always need scratch space.  The Lua parser does,
> anyway.

I've recently made some major changes to the parser that mostly
eliminate fragmentation. It uses temporary, shared, growable
structures which are later copied into compact prototype objects.
All data of a prototype is colocated in memory, too. It's a single
blob, not the 8 blobs that Lua needs per prototype.

If I ever get around to it, I could additionally compress the
debug info to 1/5th of its current size.

> It works for code, too.  I just tested running luac on a 15k .lua ASCII  
> script.  It ended up being 7k in binary form.

Someone on the list did a comparison some time ago. AFAIR the gist
of it was that you'd certainly want to compress either one, if you
really want to save disk space. And source code compresses much
better, eating up any savings from the use of a binary format.

--Mike

Re: LuaJIT2 and lua_dump()

Joshua Jensen
----- Original Message -----
From: Mike Pall
Date: 3/2/2010 9:10 AM

> Joshua Jensen wrote:
>    
>> Additionally, fragmentation of memory will be kept at a minimum
>> as parsers almost always need scratch space.  The Lua parser does,
>> anyway.
>>      
> I've recently made some major changes to the parser that mostly
> eliminate fragmentation. It uses temporary, shared, growable
> structures which are later copied into compact prototype objects.
> All data of a prototype is colocated in memory, too. It's a single
> blob, not the 8 blobs that Lua needs per prototype.
>
> If I ever get around to it, I could additionally compress the
> debug info to 1/5th of its current size.
>    
Sounds like a nice optimization.
>> It works for code, too.  I just tested running luac on a 15k .lua ASCII
>> script.  It ended up being 7k in binary form.
>>      
> Someone on the list did a comparison some time ago. AFAIR the gist
> of it was that you'd certainly want to compress either one, if you
> really want to save disk space. And source code compresses much
> better, eating up any savings from the use of a binary format.
>    
I just tested this:

134 megabyte ASCII .lua file compressed with zip: 18 megabytes
26 megabyte binary .lua file generated from 134 megabyte ASCII
compressed with zip: 10 megabytes

If I'm loading that data off of DVD for a console game, I just lost a
minimum of a full second bringing in and decompressing the larger ASCII
file.  When console manufacturers have certification requirements where
loads must happen within a certain time period, every second is at a
premium.

In any case, I'm just providing use cases.

Josh

Re: LuaJIT2 and lua_dump()

Luiz Henrique de Figueiredo
In reply to this post by Nicolas-10
> It does not even work if the function was defined inline somewhere, like this:
>
> {
> foo = function()
> ....
> end,
> pop = "pup",
> }
>
> There is no way to extract the correct code.

struct lua_Debug contains these fields:
        int linedefined;      /* (S) */
        int lastlinedefined;  /* (S) */
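
From Lua, the same two fields are available through debug.getinfo with the "S" option. A small sketch; the function `sample` is only an illustration:

```lua
-- Illustrative function spanning three source lines.
local function sample()
  return 42
end

-- "S" fills in source, short_src, what, linedefined and lastlinedefined.
local info = debug.getinfo(sample, "S")
print(info.short_src, info.linedefined, info.lastlinedefined)
```

The two numbers bracket the lines from the `function` keyword to the matching `end`, which is the range a source-based serializer would have to cut out.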


Re: LuaJIT2 and lua_dump()

Shmuel Zeigerman
In reply to this post by Joshua Jensen
In my experience, ASCII Lua-files after deletion of redundant whitespace
characters are usually smaller than their binaries.

Re: LuaJIT2 and lua_dump()

Luiz Henrique de Figueiredo
> In my experience, ASCII Lua-files after deletion of redundant whitespace
> characters are usually smaller than their binaries.

For that, see my lstrip:
        http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lstrip
       
There is also LuaSrcDiet:
        http://luasrcdiet.luaforge.net/

Re: LuaJIT2 and lua_dump()

Rob Kendrick-2
In reply to this post by Shmuel Zeigerman
On Tue, 02 Mar 2010 18:47:55 +0200
Shmuel Zeigerman <[hidden email]> wrote:

> In my experience, ASCII Lua-files after deletion of redundant
> whitespace characters are usually smaller than their binaries.

And passing them through a compression algorithm (DEFLATE, or LZF if
you're tight on code space) makes them even dinkier.

B.

Re: LuaJIT2 and lua_dump()

Nicolas-10
In reply to this post by Luiz Henrique de Figueiredo
On 02 Mar 2010 17:44:19 +0100
Luiz Henrique de Figueiredo <[hidden email]> wrote:

> > It does not even work if the function was defined inline somewhere, like this:
> >
> > {
> > foo = function()
> > ....
> > end,
> > pop = "pup",
> > }
> >
> > There is no way to extract the correct code.
>
> struct lua_Debug contains these fields:
> int linedefined;      /* (S) */
> int lastlinedefined;  /* (S) */

My bad :)
Yet extracting from line to line would not get me the code of the function;
it would give me:
  foo = function()
  ....
  end,

which I can't extract and pass to loadstring (and it could be much worse
and be embedded in a line with many other things).

Anyway dump/load are indeed real nice for serializing :)

Re: LuaJIT2 and lua_dump()

Peter Cawley
On Tue, Mar 2, 2010 at 5:01 PM, Nicolas <[hidden email]> wrote:
> My bad :)
> Yet extracting from line to line would not get me the code of the function;
> it would give me:
>        foo = function()
>                ....
>        end,
>
> which I can't extract and pass to loadstring (and it could be much worse
> and be embedded in a line with many other things).

You can scan the starting line for where "function" occurs, then trim
off everything prior, and also trim off the function name if it occurs
between "function" and "(". If "function" occurs more than once, then
iteratively try each position where it occurs. For the final line,
scan for "end" and trim off everything afterwards. If "end" occurs
more than once, then iteratively try each position. As long as you
don't define two functions which start and end on the same line (and
no sane programmer would), then loadstring("return function (etc.)
etc. end") will succeed for precisely one occurrence of "function" and
one occurrence of "end" (except for perhaps some strange and obscure
edge-cases which don't occur in the real world anyway).

Yes, it gives you a dependence on the debug library, and yes it is a
lot more work than dumping bytecode, but it does give you two
important advantages:
1) Works with Lua implementations with no bytecode format (e.g. LuaJIT2)
2) Serialised functions can be loaded by a Lua implementation which
has a different bytecode format to the Lua implementation which
serialised them (so you could move between 5.1 and 5.2, between
little-endian and big-endian, or between x86 and x64, etc.)
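
The scanning procedure described above could be sketched roughly as follows. This is one possible reading of it, not a tested serializer: `extract_function`, `occurrences`, and `load_chunk` are hypothetical names, the source text is passed in as a string rather than read from the file named by debug.getinfo, and stripping the function name drops the implicit `self` of methods declared with a colon.

```lua
local load_chunk = loadstring or load  -- Lua 5.1/LuaJIT vs Lua 5.2+

-- Return the start/end indices of every plain-text match of `word` in `s`.
local function occurrences(s, word)
  local hits, init = {}, 1
  while true do
    local first, last = s:find(word, init, true)
    if not first then return hits end
    hits[#hits + 1] = { first = first, last = last }
    init = first + 1
  end
end

-- Hypothetical helper: recover a function's source text from `source`, given
-- the linedefined/lastlinedefined values reported by debug.getinfo(f, "S").
local function extract_function(source, linedefined, lastlinedefined)
  local lines = {}
  for line in (source .. "\n"):gmatch("(.-)\n") do
    lines[#lines + 1] = line
  end
  local text = table.concat(lines, "\n", linedefined, lastlinedefined)
  local first_len = #lines[linedefined]                     -- end of first line
  local last_start = #text - #lines[lastlinedefined] + 1    -- start of last line

  for _, f in ipairs(occurrences(text, "function")) do
    if f.first > first_len then break end  -- "function" must sit on the first line
    local paren = text:find("(", f.last + 1, true)  -- skips any function name
    for _, e in ipairs(occurrences(text, "end")) do
      -- "end" must sit on the last line and come after the parameter list.
      if paren and e.first >= last_start and e.first > paren then
        local candidate = "return function " .. text:sub(paren, e.last)
        if load_chunk(candidate) then return candidate end  -- first one that compiles
      end
    end
  end
  return nil
end

-- Usage: the inline-table case raised earlier in the thread.
local source = [[
return {
  foo = function(x)
    return x + 1
  end,
  pop = "pup",
}
]]

local t = assert(load_chunk(source))()
local info = debug.getinfo(t.foo, "S")
local code = extract_function(source, info.linedefined, info.lastlinedefined)
print(code)
```

The returned string can be stored in place of bytecode and turned back into a function with `load_chunk(code)()`. As with string.dump, upvalues are not carried along.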

Re: LuaJIT2 and lua_dump()

Joshua Jensen
In reply to this post by Shmuel Zeigerman
----- Original Message -----
From: Shmuel Zeigerman
Date: 3/2/2010 9:47 AM
> In my experience, ASCII Lua-files after deletion of redundant
> whitespace characters are usually smaller than their binaries.
>
So, here we go.  I'll use exact bytes this time:

asciidata.lua - 134,280,891 bytes
asciidata-luasrcdiet.lua - 133,759,270 bytes
binarydata.lua after luac - 26,654,637 bytes

What follows are load times on a very fast machine with lots of RAM
(much faster than the game consoles I've referred to earlier) after
running the test once to cache the file in memory and then averaging
several runs afterward.  Bear in mind that the ASCII data is 5 times
larger than the binary data, and the test below doesn't take into
account disk load time.

Load asciidata.lua - 1.99 seconds
Load asciidata-luasrcdiet.lua - 1.98 seconds
Load binarydata.lua - 0.07 seconds

Is binary data important to me?  Based on the numbers above, you bet it is.
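
For anyone wanting to repeat this kind of measurement on their own data, a rough sketch follows. It assumes string.dump works on the implementation under test, builds a synthetic table constructor in memory instead of reading a file (so disk time is excluded, as above), and times only the compile or undump step. The entry count and field names are invented, and the numbers it prints will vary by machine and Lua version.

```lua
local load_chunk = loadstring or load  -- Lua 5.1/LuaJIT vs Lua 5.2+

-- Build a synthetic "data file" as source text: one big table constructor.
local parts = { "return {" }
for i = 1, 5000 do
  parts[#parts + 1] =
    string.format('  { id = %d, name = "item%d", weight = %d.5 },', i, i, i)
end
parts[#parts + 1] = "}"
local text = table.concat(parts, "\n")

-- Precompile once to get the binary form of the same chunk.
local binary = string.dump(assert(load_chunk(text)))

-- Average CPU time for loading (not running) a chunk `repeats` times.
local function time_loads(chunk, repeats)
  local start = os.clock()
  for _ = 1, repeats do
    assert(load_chunk(chunk))
  end
  return (os.clock() - start) / repeats
end

print(string.format("text:   %d bytes, %.6f s per load", #text, time_loads(text, 20)))
print(string.format("binary: %d bytes, %.6f s per load", #binary, time_loads(binary, 20)))
```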

Josh

Re: LuaJIT2 and lua_dump()

Rob Kendrick-2
On Tue, 02 Mar 2010 10:41:12 -0700
Joshua Jensen <[hidden email]> wrote:

> asciidata.lua - 134,280,891 bytes
> asciidata-luasrcdiet.lua - 133,759,270 bytes
> binarydata.lua after luac - 26,654,637 bytes
>
> What follows are load times on a very fast machine with lots of RAM
> (much faster than the game consoles I've referred to earlier) after
> running the test once to cache the file in memory and then averaging
> several runs afterward.  Bear in mind that the ASCII data is 5 times
> larger than the binary data, and the test below doesn't take into
> account disk load time.
>
> Load asciidata.lua - 1.99 seconds
> Load asciidata-luasrcdiet.lua - 1.98 seconds
> Load binarydata.lua - 0.07 seconds
>
> Is binary data important to me?  Based on the numbers above, you bet
> it is.

I would suggest you need a different file format for such large-scale
data :)

B.

Re: LuaJIT2 and lua_dump()

Joshua Jensen
----- Original Message -----
From: Rob Kendrick
Date: 3/2/2010 10:47 AM
>> Is binary data important to me?  Based on the numbers above, you bet
>> it is.
>>      
> I would suggest you need a different file format for such large-scale
> data :)
>    
Why would you suggest that?  Lua loads all of that data from the binary
Lua in 0.07 seconds.  Any other key/value pair setup will likely have
even slower results.  Lua is fantastic at describing data, even large
amounts.  The data capabilities, coupled with the scripting language
itself, open a wide world of format description, particularly for
intermediate file formats that are translated into optimized game formats.

Is Lua appropriate everywhere?  Certainly not.  A game should read
memory overlays from the disk whenever possible.  Some data is best
loaded directly into video memory and copying it from system memory
would be a waste of time.  When there is a special case to be handled,
Lua should be bypassed.  When there isn't, why shouldn't it be used?  
What would you suggest I use?

It seems to me the argument is that .lua files should only be
distributed as text; the binary reader is not useful because the text
parser is so fast.  I've shown a very common use case where the standard
Lua binary reader is much faster than the text reader.  Based on this
benchmark and many others I've run for smaller code and data sets, I am
certain the binary reader will always be faster than the text reader.

Josh

Re: LuaJIT2 and lua_dump()

Rob Kendrick-2
On Tue, 02 Mar 2010 11:41:18 -0700
Joshua Jensen <[hidden email]> wrote:

> >> Is binary data important to me?  Based on the numbers above, you
> >> bet it is.
> >>        
> > I would suggest you need a different file format for such
> > large-scale data :)
> >      
> Why would you suggest that?  Lua loads all of that data from the
> binary Lua in 0.07 seconds.  Any other key/value pair setup will
> likely have even slower results.

I would have thought a well-designed format would load much quicker, if
only because of the simpler data structure and less memory churn.
Only a really bad format would be slower.  It'd also have the advantage
that it'd be trivial to make the data portable between different
systems, if your application runs on different architectures.

B.

Re: LuaJIT2 and lua_dump()

Luiz Henrique de Figueiredo
> I would have thought a well-designed format would load much quicker, if
> only because of the simpler data structure and less memory churn.
> Only a really bad format would be slower.  It'd also have the advantage
> that it'd be trivial to make the data portable between different
> systems, if your application runs on different architectures.

If he cares about portability of the binary data then an easy way is to
modify ldump.c and lundump.c.