Crunching Lua

Crunching Lua

David Given
I have a situation where I want to distribute a Lua script as part of a shell 
script package. The Lua script needs to be as small as humanly possible, but 
doesn't have to be editable.

Any suggestions?

The traditional thing to do with C in this case is to run the source through a 
cruncher; this will strip out comments and whitespace, and the really good 
ones will rename all your identifiers to be as short as possible. Identifier 
renaming in a language as dynamic as Lua is probably going to lead to a 
World-O-Pain(TM), but the rest would still likely be useful, and not 
particularly difficult.

To my surprise, running the script through luac makes things *bigger* --- 
presumably due to inefficient opcode encoding. Can anyone suggest any other 
strategies I could adopt? (I'm already using gzip to compress the final 
result.)

-- 
+- David Given --McQ-+ "There is no expedient to which a man will not
|  [hidden email]    | resort to avoid the real labour of thinking." ---
| ([hidden email]) | Thomas Edison
+- www.cowlark.com --+ 

Re: Crunching Lua

Luiz Henrique de Figueiredo
> The traditional thing to do with C in this case is to run the source
> through a cruncher; this will strip out comments and whitespace, and
> the really good ones will rename all your identifiers to be as short
> as possible.

My ltokens library has a lstrip tool which strips comments and whitespace,
but does not do renaming.

See also
	http://lua-users.org/lists/lua-l/2005-02/msg00357.html
	http://lua-users.org/lists/lua-l/2005-02/msg00395.html
	http://lua-users.org/lists/lua-l/2005-02/msg00396.html

> To my surprise, running the script through luac makes things *bigger* --- 
> presumably due to inefficient opcode encoding.

Try luac -s.

--lhf

Re: Crunching Lua

Stephen Kellett
In reply to this post by David Given
In message <200511031122.28641.dg@...>, David Given <[hidden email]> writes
> To my surprise, running the script through luac makes things *bigger* ---
> presumably due to inefficient opcode encoding. Can anyone suggest any other
> strategies I could adopt?

Would it be possible for you to identify all the unique words in the file,
encode them using your own scheme, and then reconstruct the script from that
at run time? That doesn't sound much different from compressing the script
with gzip and then decompressing it prior to execution.

Stephen
--
Stephen Kellett
Object Media Limited    http://www.objmedia.demon.co.uk/software.html
Computer Consultancy, Software Development
Windows C++, Java, Assembler, Performance Analysis, Troubleshooting

Re: Crunching Lua

Rici Lake-2
In reply to this post by David Given

On 3-Nov-05, at 6:22 AM, David Given wrote:

> To my surprise, running the script through luac makes things *bigger* ---
> presumably due to inefficient opcode encoding. Can anyone suggest any other
> strategies I could adopt? (I'm already using gzip to compress the final
> result.)

Did you try the -s option to luac?



Re: Crunching Lua

David Given
In reply to this post by Luiz Henrique de Figueiredo
On Thursday 03 November 2005 11:59, Luiz Henrique de Figueiredo wrote:
> > The traditional thing to do with C in this case is to run the source
> > through a cruncher; this will strip out comments and whitespace, and
> > the really good ones will rename all your identifiers to be as short
> > as possible.
>
> My ltokens library has a lstrip tool which strips comments and whitespace,
> but does not do renaming.

Ah, thanks. I'll try it.

[...]
> Try luac -s.

I'd forgotten about that; the compiled output is now smaller than the source,
but not by much --- the source (after gzipping) is 5322 bytes, and the
compressed binary is 4805 bytes. I suspect that lstrip would do better.

-- 
+- David Given --McQ-+ "Opportunity is missed by most people because it's
|  [hidden email]    | dressed in overalls and looks like work." ---
| ([hidden email]) | Thomas Edison
+- www.cowlark.com --+ 

Re: Crunching Lua

Adrian Sietsma
David Given wrote:
...

> I'd forgotten about that; the compiled output is now smaller than the
> source, but not by much --- the source (after gzipping) is 5322 bytes, and
> the compressed binary is 4805 bytes. I suspect that lstrip would do better.

Can you pre-load gzip with the Lua keywords as common tokens? I think it can
be done, but I don't know how. That plus lstrip should be pretty good.

Adrian
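
For what it's worth, the deflate layer underneath gzip does support this kind of priming through a "preset dictionary" (deflateSetDictionary in the zlib C API). The sketch below uses Python's zlib module only to show the mechanics. The dictionary contents and the names LUA_ZDICT, pack and unpack are made up for illustration. The output is a zlib stream rather than a .gz file, and as far as I know the gzip command-line tool has no option for this. The unpacker needs the identical dictionary, so it has to ship with the package, which may eat the savings for a single small script.

```python
import zlib

# Assumption: a hand-picked dictionary of Lua keywords (hypothetical contents).
# zlib's documentation suggests putting the most likely strings near the end.
LUA_ZDICT = (
    b"repeat until while break elseif else false true nil not and or in do "
    b"for if then return end local function "
)

def pack(source: bytes, zdict: bytes = LUA_ZDICT) -> bytes:
    """Deflate `source` with the preset dictionary primed into the window."""
    comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(source) + comp.flush()

def unpack(blob: bytes, zdict: bytes = LUA_ZDICT) -> bytes:
    """Inflate a blob produced by pack(); needs the identical dictionary."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()

if __name__ == "__main__":
    sample = b"local function add(a, b) return a + b end\n"
    # Prints: original size, plain zlib size, size with the preset dictionary.
    print(len(sample), len(zlib.compress(sample, 9)), len(pack(sample)))
```

Whether this actually beats plain gzip on a given script is something to measure rather than assume. The gain should shrink as the script grows, because deflate builds up its own history of the keywords after the first few lines.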

Reply | Threaded
Open this post in threaded view
|

Re: Crunching Lua

Aaron Brown-2
In reply to this post by David Given
David Given wrote:

> I suspect that lstrip would do better.

I wrote a whole thing about how this wouldn't work, because
Lua bytecode already doesn't include whitespace and
comments, then I realized that of course you meant lstrip
instead of, not in addition to, bytecode compilation.

A bit of quick-and-dirty experimentation (very small sample
size) suggests that lstrip crunches very small files much
better than luac -s, but does only slightly better with
medium to large files.

~> wc -c somefile.lua
     68 somefile.lua
~> luac -s somefile.lua && wc -c luac.out
    174 luac.out
~> lstrip < somefile.lua | wc -c
     56

~> wc -c somefile2.lua
  95136 somefile2.lua
~> luac -s somefile2.lua && wc -c luac.out
  38574 luac.out
~> lstrip < somefile2.lua | wc -c
  37362

~> wc -c somefile3.lua
 215027 somefile3.lua
~> luac -s somefile3.lua && wc -c luac.out
 132910 luac.out
~> lstrip < somefile3.lua | wc -c
 132010

-- 
Aaron
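
To put numbers on "slightly better", the arithmetic on the byte counts above can be spelled out as below. This is only a sketch: the table layout and the summarize helper are made up for illustration, the figures are copied from the transcript, and they are sizes before gzip, so they are not directly comparable with David's 5322 and 4805.

```python
# Byte counts copied from the transcript: (original, luac -s, lstrip).
MEASUREMENTS = {
    "somefile.lua": (68, 174, 56),
    "somefile2.lua": (95136, 38574, 37362),
    "somefile3.lua": (215027, 132910, 132010),
}

def summarize(original: int, luac_s: int, lstrip: int) -> dict:
    """Return size ratios relative to the source and lstrip's margin over luac -s."""
    return {
        "luac_ratio": luac_s / original,      # above 1.0 means luac -s grew the file
        "lstrip_ratio": lstrip / original,
        "lstrip_margin_bytes": luac_s - lstrip,
        "lstrip_margin_pct": 100.0 * (luac_s - lstrip) / luac_s,
    }

if __name__ == "__main__":
    for name, sizes in MEASUREMENTS.items():
        s = summarize(*sizes)
        print(f"{name}: luac -s x{s['luac_ratio']:.2f}, "
              f"lstrip x{s['lstrip_ratio']:.2f}, "
              f"lstrip ahead by {s['lstrip_margin_bytes']} bytes "
              f"({s['lstrip_margin_pct']:.1f}% of luac.out)")
```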

Re: Crunching Lua

Luiz Henrique de Figueiredo
> A bit of quick-and-dirty experimentation (very small sample
> size) suggests that lstrip crunches very small files much
> better than luac -s, but does only slightly better with
> medium to large files.

lstrip makes a point of preserving line breaks. It's easy to remove this
feature. It'll make a difference for large files with long comments.
--lhf
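
As a rough illustration of stripping without preserving line breaks, here is a minimal sketch in Python. It is not lstrip and does not use ltokens; crunch_lua is a hypothetical name. It assumes level-style long brackets ([[...]], [==[...]==]) and no "#!" first line. Each run of whitespace and comments is collapsed to a single space rather than deleted outright, so tokens can't fuse: "a - -b" must not become "a--b", which would start a comment.

```python
import re

# A "junk" run is any mix of whitespace, long comments and line comments.
# Strings are matched as well, but only so their contents pass through untouched.
_LUA_TOKEN = re.compile(
    r"""
    (?P<junk> (?: \s+
                | --\[(?P<ceq>=*)\[ .*? \](?P=ceq)\]
                | --[^\n]*
              )+ )
  | (?P<string> \[(?P<seq>=*)\[ .*? \](?P=seq)\]
              | "(?:\\.|[^"\\\n])*"
              | '(?:\\.|[^'\\\n])*'
    )
    """,
    re.VERBOSE | re.DOTALL,
)

def crunch_lua(source: str) -> str:
    """Collapse every run of comments/whitespace outside strings to one space."""
    def repl(match):
        if match.group("junk") is not None:
            return " "
        return match.group(0)  # a string literal: keep it byte for byte
    return _LUA_TOKEN.sub(repl, source).strip()

if __name__ == "__main__":
    src = 'local x = 1  -- set x\n--[[ long\ncomment ]]\nprint("a -- b",   x)\n'
    print(crunch_lua(src))
```

A real tool would go further and drop the space wherever the neighbouring tokens can't merge, which needs a proper tokenizer rather than a regular expression.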