Custom Lua Binary Sizes

Custom Lua Binary Sizes

Paige DePol
Lua v5.3.4 compiles to appx 230k when built with optimisation (-Os) on my
system... which is quite compact, and is often touted as a primary benefit
for embedding the language.

As I've been hacking on Lua and adding more features this size has been
increasing; with all my patches applied, a release build has just hit a
megabyte. However, it should be noted that the bulk of this size is in
the parser/lexer portion of the code. I have strived to keep the runtime
portion of the code as lean as possible. I do not have numbers yet, but
I will be adding the ability to calculate the growth of the Lua binary,
compiled both with and without the parser/lexer, for all my patches.

I was wondering at which size would the Lua binary be considered "too large"
for embedding and the like. I am going to guess that this really would just
depend on the target environment... however, as I know there are people on
the list using Lua in embedded environs I thought I would get some feedback.

For embedded systems the parser/lexer could just be left out, obviously this
would require all necessary Lua code be pre-compiled (another good reason
for adding cross-platform bytecode generation to 'luac' in my opinion). I
was also wondering if people who embed Lua precompile the source or not?

The target for my Lua variant is game engines, so while embedding is not
necessarily my target audience I am just curious about embedded use cases.
I am also curious just at which point people consider the binary to become
"bloatware" vs the vanilla binary size.

Thanks for your thoughts!

~Paige



Re: Custom Lua Binary Sizes

Thomas Fletcher


On Fri, Jan 5, 2018 at 2:07 AM, Paige DePol <[hidden email]> wrote:
Lua v5.3.4 compiles to appx 230k when built with optimisation (-Os) on my
system... which is quite compact, and is often touted as a primary benefit
for embedding the language.

As I've been hacking on Lua and adding more features this size has been
increasing; with all my patches applied, a release build has just hit a
megabyte. However, it should be noted that the bulk of this size is in
the parser/lexer portion of the code. I have strived to keep the runtime
portion of the code as lean as possible. I do not have numbers yet, but
I will be adding the ability to calculate the growth of the Lua binary,
compiled both with and without the parser/lexer, for all my patches.

I was wondering at which size would the Lua binary be considered "too large"
for embedding and the like. I am going to guess that this really would just
depend on the target environment... however, as I know there are people on
the list using Lua in embedded environs I thought I would get some feedback.

I'm happy to say that we've been using Lua as part of Storyboard, an embedded
UI development framework (www.cranksoftware.com if you are interested), for
nearly 10 years, specifically because it is perfectly suited for the
resource-constrained environments we work with.  While it varies from platform
to platform, and we do adjust the configurations a bit, we generally run about
150K text and 60K data on the binary itself.

For embedded systems the parser/lexer could just be left out, obviously this
would require all necessary Lua code be pre-compiled (another good reason
for adding cross-platform bytecode generation to 'luac' in my opinion). I
was also wondering if people who embed Lua precompile the source or not?

We do offer pre-compilation as an option for our scripts, but more to save on
processing time (done ahead of time to avoid runtime costs) than memory.  It is
certainly on the table that, if we needed to, we could strip these elements out
and just force pre-compiled scripts for much smaller environments.
 
The target for my Lua variant is game engines, so while embedding is not
necessarily my target audience I am just curious about embedded use cases.
I am also curious just at which point people consider the binary to become
"bloatware" vs the vanilla binary size.

For what it is worth, I would consider what you have packaged as being far too
large for us, as we're running on systems with <1M RAM and then varying amounts
of flash.  The code size is fine if you have lots of flash, but it frequently
translates into a higher dynamic memory cost for the features you are now
including, which is usually what gets us (as a UI framework) into the tight
resource spots.
 

Thanks for your thoughts!

Happy to give them .. I'm interested in hearing what others have to say about it.
Thomas
 
--
Thomas Fletcher
VP Research & Development
t. +1 (613) 595 1999 x511
c. +1 (613) 878 4659
e. [hidden email]
w. www.cranksoftware.com

Re: Custom Lua Binary Sizes

Sean Conner
In reply to this post by Paige DePol
It was thus said that the Great Paige DePol once stated:
>
> For embedded systems the parser/lexer could just be left out, obviously this
> would require all necessary Lua code be pre-compiled (another good reason
> for adding cross-platform bytecode generation to 'luac' in my opinion). I
> was also wondering if people who embed Lua precompile the source or not?

  For work, an application I wrote [1] embedded Lua plus a large number of
modules required for it to work.  The modules written in Lua are
pre-compiled; in some cases they are smaller than the source files, other
times not so.  In any case, these are further compressed using zlib before
being embedded in the executable.  I have written about this before on the
list [2] but briefly, the compressed pre-compiled Lua modules are larger
than the compressed source code [3].  I haven't modified my build because
the larger size isn't that much of a concern for us (comparatively
speaking, my program is smaller than most projects around here).
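As a sketch of that embed step: the following Python script zlib-compresses a
byte blob (a stand-in for real luac output; the array name and sample bytes
here are made up) and renders it as a C array one could link into an
executable. A minimal illustration, not the actual build described above:

```python
import zlib

def to_c_array(name, blob):
    """Compress a (pre-compiled) Lua module and render it as a C array."""
    z = zlib.compress(blob, 9)  # level 9 = maximum compression, as above
    hexbytes = ", ".join("0x%02x" % b for b in z)
    return (
        "static const unsigned char %s[] = { %s };\n"
        "static const unsigned int %s_len = %du; /* %d bytes uncompressed */\n"
        % (name, hexbytes, name, len(z), len(blob))
    )

if __name__ == "__main__":
    # Stand-in for real luac output; any byte string works the same way.
    bytecode = b"\x1bLua" + b"\x00" * 64
    print(to_c_array("mod_example", bytecode))
```

At runtime the executable would inflate the array and hand the result to the
Lua loader; the decompressor is the only extra cost.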

  -spc

[1] Network server.

[2] No time right now to find references, sadly.

[3] I want to say this is more so for 64-bit builds than 32-bit builds,
        but again, I don't have time today to do an actual test.



Re: Custom Lua Binary Sizes

steve donovan
On Fri, Jan 5, 2018 at 3:54 PM, Sean Conner <[hidden email]> wrote:
> the larger size isn't that much of a concern for us (compariatively
> speaking, my program is smaller that most projects around here).

Which is the usual operative definition of 'small' :)

Resource-constrained targets are becoming more prevalent in this world of
IoT (jokes about setting the ssh password on light bulbs aside).


Re: Custom Lua Binary Sizes

Frank Kastenholz-2
In reply to this post by Paige DePol
On 1/5/18 2:07 AM, Paige DePol wrote:

> I was wondering at which size would the Lua binary be considered "too large"
> for embedding and the like. I am going to guess that this really would just

In the past, when working on embedded systems which included Lua, O(1M)
would not have been considered a significant problem; we had tens of
megabytes of RAM, and the stuff we had to add around base Lua put it in
the 750K-1M range. OTOH, other systems would have had a problem adding
that to the executable.

For the environments I was dealing with, execution performance was the
critical metric. This was a bigger problem in Lua because base Lua's
compiler does not do any form of optimization ... and too many
programmers assume that the compiler will do it all and that the
programmer does not need to think about code structure, etc., w.r.t.
performance.

The general rules of thumb are 1) each target will have a different
definition of "too much" and 2) the bigger you make it, the fewer
the number of targets that can adopt it.

I've not looked at your code, nor do I know the features you've added,
but in general, is it possible to partition your additions so that they
can be selected (via #define INCLUDE_FEATURE_X sorts of statements)? I
could easily imagine some specific environments wanting some features
but not all. In the system I was working on we were going to have to
remove some library functions (IO and OS stuff primarily) for security
and application partitioning reasons ... being able to simply
#define DONT_DO_FILE_FILE_IO would have been nice.

> For embedded systems the parser/lexer could just be left out, obviously this
> would require all necessary Lua code be pre-compiled (another good reason
for adding cross-platform bytecode generation to 'luac' in my opinion). I
> was also wondering if people who embed Lua precompile the source or not?

Like others, we looked at precompilation and decided it was not a win.
The savings in space were not significant enough to make it worthwhile,
and the compiler runs fast enough that the time spent compiling was not
a notable problem.  OTOH, ensuring that Lua bytecode was portable
across all our potential target systems (different word sizes,
host operating systems, processor architectures & endianness, etc.)
would have been a major hassle.

Frank








Re: Custom Lua Binary Sizes

Sean Conner
In reply to this post by Paige DePol
It was thus said that the Great Paige DePol once stated:
>
> For embedded systems the parser/lexer could just be left out, obviously this
> would require all necessary Lua code be pre-compiled (another good reason
> for adding cross-platform bytecode generation to 'luac' in my opinion). I
> was also wondering if people who embed Lua precompile the source or not?

  I finally have some time, so I thought I might check the sizes of Lua
source code, compressed Lua source code [1], compiled Lua code and
compressed compiled Lua code, for both 32-bit and 64-bit systems (and yes,
there is a difference in compiled Lua sizes).  I used LuaRocks because it
was
	a) handy
	b) a good mix of module sizes
	c) all in Lua.

  Attached is the full output from a 32-bit run and a 64-bit run, but let's
look at a small sample of the output.

  Here we have sizes from a 32-bit system.

 text ztext   bin  zbin filename
--------------------------------
 4167  1479  6107  2647 add.lua
 3267  1207  4830  2137 admin_remove.lua
15750  4229 17277  6352 build.lua
11312  2888 15600  5552 builtin.lua
 2178   886  2637  1196 cmake.lua
  932   389  1023   487 command.lua

  You can see that the compiled versions are larger than the text.  And the
same holds true for the compressed versions.  Next up, results from a 64-bit
test:

 text ztext   bin  zbin filename
--------------------------------
 4167  1479  6723  2669 add.lua
 3267  1207  5322  2163 admin_remove.lua
15750  4229 18889  6442 build.lua
11312  2888 17416  5638 builtin.lua
 2178   886  2917  1211 cmake.lua
  932   389  1123   495 command.lua

  Again, similar results, only the 64-bit compiled versions are larger than
the 32-bit compiled versions.  The trick here is to determine if the size
savings of compression are worth the extra cost of zlib support.  There's
also the question of whether the reduction in Lua size of removing the
parser is worth the larger size of pre-compiled Lua code.

  -spc (Trade-offs, trade-offs)

[1]	Using zlib (the lzlib module for Lua) with maximum compression.  I
	used maximum compression because I figure you take the time hit for
	compression on the more powerful machines to save space on the less
	powerful machines one might embed Lua in.
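For anyone wanting to reproduce the text/ztext columns above, a rough sketch
in Python (the bin/zbin columns would additionally require running luac on
each file first, which is omitted here):

```python
import sys
import zlib

def sizes(path):
    """Return (text, ztext) sizes for one file, mirroring the table above."""
    data = open(path, "rb").read()
    return len(data), len(zlib.compress(data, 9))  # level 9 = max compression

if __name__ == "__main__":
    print("%5s %5s  filename" % ("text", "ztext"))
    for name in sys.argv[1:]:
        t, z = sizes(name)
        print("%5d %5d  %s" % (t, z, name))
```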

Attachments: r32.txt (2K), r64.txt (2K)

Re: Custom Lua Binary Sizes

Daurnimator
On 6 January 2018 at 13:48, Sean Conner <[hidden email]> wrote:
>   I finally have some time, so I thought I might check the sizes of Lua
> source code, compressed Lua source code [1], compiled Lua code and
> compressed compiled Lua code, for both 32 bit and 64 bit systems (and yes,
> there is a difference in compiled Lua sizes).

Please compare stripped vs unstripped bytecode.


Re: Custom Lua Binary Sizes

KHMan
In reply to this post by Sean Conner
On 1/6/2018 10:48 AM, Sean Conner wrote:
> It was thus said that the Great Paige DePol once stated:
>>
>> For embedded systems the parser/lexer could just be left out, obviously this
>> would require all necessary Lua code be pre-compiled (another good reason
>> for adding cross-platform bytecode generation to 'luac' in my opinion). I
>> was also wondering if people who embed Lua precompile the source or not?
>
[snip snip snip]
>
>    Again, similar results, only the 64-bit compiled versions are larger than
> the 32-bit compiled versions.  The trick here is to determine if the size
> savings of compression are worth the extra cost of zlib support.  There's
> also the question of whether the reduction in Lua size of removing the
> parser is worth the larger size of pre-compiled Lua code.
>
>    -spc (Trade-offs, trade-offs)

More of a speed/size trade-off if we look at the available
compression tools.

For a bit better size, one can use the last word in deflate
compression, Zopfli [1].

[1] https://en.wikipedia.org/wiki/Zopfli

Also, one can merge data into a single file. This allows deflate
to work better, since the sliding dictionary size is 32KB, and
some savings will be due to fewer Huffman tables due to having
fewer data blocks.
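That effect is easy to demonstrate with zlib itself; a small Python example
with synthetic "modules" (real sources will vary, but the direction of the
result holds whenever the inputs share vocabulary):

```python
import zlib

# Two "modules" that share a lot of vocabulary, as real Lua sources do.
mod_a = b"local function handler(req)\n  return req.body\nend\n" * 40
mod_b = b"local function handler(res)\n  return res.body\nend\n" * 40

separate = len(zlib.compress(mod_a, 9)) + len(zlib.compress(mod_b, 9))
merged = len(zlib.compress(mod_a + mod_b, 9))

# The merged stream shares one sliding window and fewer Huffman tables,
# so it comes out smaller than compressing each module on its own.
print(separate, merged)
```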

Or one can go for better size at faster speed and use the new-and-hot
finite state entropy methods, Zstandard [2] or LZFSE [3]. See the
benchmark in [4].

[2] https://en.wikipedia.org/wiki/Zstandard
[3] https://en.wikipedia.org/wiki/LZFSE
[4] http://facebook.github.io/zstd/

Zlib is still good enough for a lot of use cases. Zstandard seems
to match or is better than LZMA. One issue is some patent
legalities that might run afoul of corporate lawyers. See this
excellent post [5].

[5]
https://gregoryszorc.com/blog/2017/03/07/better-compression-with-zstandard/


> [1] Using zlib (lzlib module for Lua) with maximum compression.  I used
> maximum compression because I figure take the hit on time for
> compression on more powerful machines to save space on less powerful
> machines one might embed Lua in.


--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: Custom Lua Binary Sizes

Sean Conner
In reply to this post by Daurnimator
It was thus said that the Great Daurnimator once stated:
> On 6 January 2018 at 13:48, Sean Conner <[hidden email]> wrote:
> >   I finally have some time, so I thought I might check the sizes of Lua
> > source code, compressed Lua source code [1], compiled Lua code and
> > compressed compiled Lua code, for both 32 bit and 64 bit systems (and yes,
> > there is a difference in compiled Lua sizes).
>
> Please compare stripped vs unstripped bytecode.

  Very interesting results (attached).  Sometimes, the text is smaller than
the stripped compiled version, sometimes not.  Sometimes the compressed text
is smaller than the stripped compiled version, sometimes not.  You would
really have to measure when embedding Lua into your executable.

  -spc


Attachments: r32.txt (3K), r64.txt (3K)

Re: Custom Lua Binary Sizes

Jonathan Goble
In reply to this post by KHMan
On Fri, Jan 5, 2018 at 11:27 PM KHMan <[hidden email]> wrote:
Zstandard seems
to match or is better than LZMA. One issue is some patent
legalities that might run afoul of corporate lawyers. See this
excellent post [5].

[5]
https://gregoryszorc.com/blog/2017/03/07/better-compression-with-zstandard/

No longer an issue. This was discussed at length at [1], with the end result being an eventual relicensing of the project to dual BSD/GPLv2 and the complete deletion of the offending PATENTS file.

[1] https://github.com/facebook/zstd/issues/335


Re: Custom Lua Binary Sizes

KHMan
On 1/6/2018 12:54 PM, Jonathan Goble wrote:

> On Fri, Jan 5, 2018 at 11:27 PM KHMan wrote:
>
>     Zstandard seems
>     to match or is better than LZMA. One issue is some patent
>     legalities that might run afoul of corporate lawyers. See this
>     excellent post [5].
>
>     [5]
>     https://gregoryszorc.com/blog/2017/03/07/better-compression-with-zstandard/
>
>
> No longer an issue. This was discussed at length at [1], with the
> end result being an eventual relicensing of the project to dual
> BSD/GPLv2 and the complete deletion of the offending PATENTS file.
>
> [1] https://github.com/facebook/zstd/issues/335

Awesome, missed that.

A quick look at:
https://github.com/facebook/zstd/issues/77

says a pure decompressor might be ~45KB in size. Probably smaller if
the code is tuned for something other than a modern superscalar CPU.
Memory usage can be tuned. Nice. Looking forward to modified
implementations... how long before someone here writes a pure Lua
implementation? :-)

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: Custom Lua Binary Sizes

Pierre Chapuis
Another alternative, if you have very few resources, is using a compression
library designed specifically for embedded use, such as heatshrink [1].

The algorithm is LZSS, and on my x64 system it compiles to about 12K with -Os.

[1] https://github.com/silentbicycle/heatshrink


Re: Custom Lua Binary Sizes

Paige DePol
In reply to this post by Paige DePol
Paige DePol <[hidden email]> wrote:

> As I've been hacking on Lua and adding more features this size has been
> increasing, with all my patches I have just hit a megabyte in size in a
> release build.

Apparently, my Xcode was lying to me... repeatedly. It seems an old issue
I had with Xcode badly caching items and generally just being strange has
returned. After a judicious clearing of all Xcode caches and a recompile
of all object files in Release (-Os) configuration I get the following:

======== (15.5%)  (74.5%)  (34.8%) =======
 Parser:  35.4k +  82.4k = 117.8k (233.2%)
Runtime: 192.8k +  28.3k = 221.1k ( 14.7%)
Overall: 228.2k + 110.7k = 338.9k ( 48.5%)
======= Vanilla = Growth = Lunia =========

The top %'s are the size of the Parser in relation to the Overall size.
The side %'s are the growth of the binary code for that specific line.

My hard fork of Lua is actually only 48.5% larger than vanilla Lua at this
point, and about 3x smaller than previously indicated. As can be seen above,
the bulk of my changes have occurred in code related to the parser, with an
increase of 233%. The runtime code has only grown by 14.7%, which is much
more in line with what I was expecting. If you remove the parser from my
fork, the resulting binary actually winds up smaller than the vanilla Lua
binary (with parser, of course).

The size of the parser has significantly grown as it includes a number of my
larger patches: Preprocessor, Token Storage, CEMI (Consts, Enum, Macros, and
Inlines), Class Object Model, variable Type Locking, and more.

The runtime portion supports my new Index data object, which also required
a bump up to 64-bit for the virtual machine instructions, as well as support
for Type Locking of stack slots.

I have done my best to do as much processing as possible in the parsing phase
to allow the runtime to do as little as possible... so far I think that has
worked out quite well given the fairly small increase in the runtime code.


> Thomas Fletcher <[hidden email]> wrote:
>> For what it is worth, I would consider what you have packaged as being far too
>> large for us as we're running on systems with <1M RAM and then varying amounts
>> of flash.

Yes, with only 1M RAM I'd imagine a Lua engine that consumed your entire
memory space would be deemed impractical. Thankfully, I discovered my numbers
were off by quite a lot; how would the new numbers fit into your memory
requirements?


> Sean Conner <[hidden email]> wrote:
>>  For work, an application I wrote [1] embedded Lua plus a large number of
>> modules required for it to work.  The modules written in Lua are
>> pre-compiled; in some cases they are smaller than the source files; other
>> times not so.  In any case, these are further compressed using zlib before
>> being embedded in the executable.

Wouldn't overall memory available be a concern as well with a compression
library? I am guessing for embedded systems the decompression happens as
a stream, so the only memory allocations are for the decompression engine
itself and the final decompressed file?


> Frank Kastenholz <[hidden email]> wrote:
>> I've not looked at your code, nor do I know the features you've added,
>> but in general, is it possible to partition your additions so that they
>> can be selected (via #define INCLUDE_FEATURE_X sorts of statement)?

Yes, this can be done, and it was a method I did use originally. However, I
found the extreme number of #if statements unwieldy after a while, especially
where various patches might change the same code or overlapping code areas.
Instead, I will be managing patches individually, with pre-reqs for some, and
will maintain a build system to test each patch against all my other
patches... eventually this will lead to a website where you can have a custom
version of Lua created for you with just the patches you want.


> Sean Conner <[hidden email]> wrote:
>>  I finally have some time, so I thought I might check the sizes of Lua
>> source code, compressed Lua source code [1], compiled Lua code and
>> compressed compiled Lua code, for both 32 bit and 64 bit systems...

Thanks for doing those comparisons, Sean, it was enlightening. Continuing
the later discussion about compression... I could see adding an option to
compress the debugging information in a compiled Lua file. That way the
debug info could take up less space and only be decompressed if actually
needed. Though if the error is a memory error that may prove problematic!


Thanks all for your feedback, it is appreciated. I am not yet playing with
embedded systems, however, I have a couple ideas so might be looking into
what hardware to get. Assuming I want to use Lua does anyone here have any
suggestions? Raspberry Pi, Arduino... I am a bit of a hardware newbie! ;)

~Paige





Re: Custom Lua Binary Sizes

Sean Conner
It was thus said that the Great Paige DePol once stated:
> > Sean Conner <[hidden email]> wrote:
> >>  For work, an application I wrote [1] embedded Lua plus a large number of
> >> modules required for it to work.  The modules written in Lua are
> >> pre-compiled; in some cases they are smaller than the source files; other
> >> times not so.  In any case, these are further compressed using zlib before
> >> being embedded in the executable.
>
> Wouldn't overall memory available be a concern as well with a compression
> library?

  In my case, not really.  I only started doing the compression when an
internal tool (constructed along similar lines as the application I
mentioned) became ludicrously large because of one module (containing
thousands of names [2]).  Compression brought that down to a merely
largish size [3].

> I am guessing for embedded systems the decompression happens as
> a stream, so the only memory allocations are for the decompression engine
> itself and the final decompressed file?

  Yes, but the "embedded system" I'm programming for is a server with gigs
of RAM and an insane number of (slowish) cores [4].
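That is how zlib's streaming interface behaves: you feed compressed chunks
in and drain output as you go, so the transient memory is bounded by the
chunk size, the 32KB window, and whatever you keep of the output. A sketch
using Python's zlib binding (the chunk size is arbitrary):

```python
import zlib

def stream_decompress(compressed, chunk_size=4096):
    """Decompress incrementally; only chunk-sized pieces are in flight."""
    d = zlib.decompressobj()
    out = bytearray()
    for i in range(0, len(compressed), chunk_size):
        out += d.decompress(compressed[i:i + chunk_size])
    out += d.flush()  # drain anything still buffered in the decompressor
    return bytes(out)

if __name__ == "__main__":
    original = b"print('hello')\n" * 1000
    packed = zlib.compress(original, 9)
    assert stream_decompress(packed) == original
```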

> > Sean Conner <[hidden email]> wrote:
> >>  I finally have some time, so I thought I might check the sizes of Lua
> >> source code, compressed Lua source code [1], compiled Lua code and
> >> compressed compiled Lua code, for both 32 bit and 64 bit systems...
>
> Thanks for doing those comparisons, Sean, it was enlightening.

  You're welcome.

  -spc

[1] Footnote not included here

[2] Yes, we have need to generate names of people when testing.

[3] I didn't want to have to install additional data files to make
        testing easier.  Also, I didn't want to have to install anything, to
        make testing easier on systems we had limited access to.

[4]	64-core 64-bit SPARC architecture running at about a gigahertz or so.
	A bit sluggish compared to more modern hardware but it handles heavy
	loads like you wouldn't believe.

        At least, I *think* it has 64 cores.  I know it has more than what
        you can normally get on a desktop.


Re: Custom Lua Binary Sizes

dyngeccetor8
On 01/22/2018 12:19 PM, Sean Conner wrote:
> [4] 64-core 64-bit SPARC architecture running about a gigahertz or so. A
> bit sluggish compared to more modern hardware but it handles heavy
> loads like you wouldn't believe.
>
> At least, I *think* it has 64 cores.  I know it has more than what
> you can normally get on a desktop.

Taking this as a quiz...

UltraSPARC T2 probably? Eight cores by eight threads per core.

https://en.wikipedia.org/wiki/SPARC#Implementations

-- Martin


Re: Custom Lua Binary Sizes

Sean Conner
It was thus said that the Great dyngeccetor8 once stated:

> On 01/22/2018 12:19 PM, Sean Conner wrote:
> > [4] 64-core 64-bit SPARC architecture running about a gigahertz or so. A
> > bit sluggish compared to more modern hardware but it handles heavy
> > loads like you wouldn't believe.
> >
> > At least, I *think* it has 64 cores.  I know it has more than what
> > you can normally get on a desktop.
>
> Taking this as a quiz..
>
> UltraSPARC T2 probably? Eight cores by eight threads per core.
>
> https://en.wikipedia.org/wiki/SPARC#Implementations

  Yeah, that sounds about right.  Although I haven't seen the actual
physical hardware.

  -spc