lua scripts compression


Serge Semashko
Hello All,

I am looking for a simple scripting language that can be used from a C++
program. I am going to incorporate it into an antivirus engine. Each
virus record will have its own cure script; some will be simple, others
more complicated. There will be lots of scripts, more than 10000, and
some of them will be autogenerated. The question is how to store all
these scripts so that they take as little space as possible.

I tried converting the scripts into bytecode, but they sometimes take
more space than in source form. I know that I can use any compression
library like zlib or ucl, but I do not completely like this idea.
Comments and extra whitespace could be removed to save space, and
many keywords could be replaced with numeric IDs. I think the best
solution would be to store the output of the lexer in a compressed form
(1 byte for each token ID, plus strings or numeric values where needed).
I would also appreciate being able to feed this data directly to the
parser.
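
To make the idea concrete, here is a minimal sketch in Python of the
"1 byte per token id" format. It is an illustration under stated
assumptions, not a real Lua lexer: the tag bytes, the 0x80 base for ids
and the names encode/decode are all made up for this example, and only
single-line comments, plain decimal numbers and double-quoted strings
without escapes are handled.

```python
import re

# Lua 5.0 reserved words plus multi-character operators. Each one is
# stored as a single byte starting at 0x80 (an arbitrary choice).
FIXED = ["and", "break", "do", "else", "elseif", "end", "false", "for",
         "function", "if", "in", "local", "nil", "not", "or", "repeat",
         "return", "then", "true", "until", "while",
         "==", "~=", "<=", ">=", "..", "..."]
FIXED_ID = {text: 0x80 + i for i, text in enumerate(FIXED)}

# Hypothetical tag bytes for tokens that carry a payload.
TAGS = {"name": 0x01, "number": 0x02, "string": 0x03}

TOKEN_RE = re.compile("|".join([
    r"(?P<comment>--[^\n]*)",                 # single-line comment, dropped
    r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)",      # identifier or keyword
    r"(?P<number>\d+(?:\.\d+)?)",             # plain decimal number
    r'(?P<string>"[^"\\\n]*")',               # "..." string without escapes
    r"(?P<op>==|~=|<=|>=|\.\.\.|\.\.|[-+*/^<>=(){}\[\];:,.])",
    r"(?P<space>\s+)",                        # whitespace, dropped
]))


def encode(source: str) -> bytes:
    """Turn source text into a compact token stream."""
    out = bytearray()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unsupported input at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind in ("comment", "space"):
            continue
        if text in FIXED_ID:                  # keyword or multi-char operator
            out.append(FIXED_ID[text])
        elif kind == "op":                    # single ASCII punctuation char
            out.append(ord(text))
        else:                                 # name, number or string payload
            body = text[1:-1] if kind == "string" else text
            payload = body.encode("utf-8")
            if len(payload) > 255:
                raise ValueError("payload longer than 255 bytes")
            out += bytes([TAGS[kind], len(payload)]) + payload
    return bytes(out)


def decode(blob: bytes) -> str:
    """Rebuild loadable source text, one space between tokens."""
    tokens = []
    i = 0
    while i < len(blob):
        b = blob[i]
        if b >= 0x80:                         # fixed token id
            tokens.append(FIXED[b - 0x80])
            i += 1
        elif b in (0x01, 0x02, 0x03):         # tag, length, payload
            n = blob[i + 1]
            text = blob[i + 2:i + 2 + n].decode("utf-8")
            tokens.append('"' + text + '"' if b == TAGS["string"] else text)
            i += 2 + n
        else:                                 # single punctuation char
            tokens.append(chr(b))
            i += 1
    return " ".join(tokens)
```

In this sketch names and numbers are still stored as text, so the
saving comes only from keywords, whitespace and comments. The decoder
rebuilds ordinary source text; feeding the token stream straight to the
parser would need changes inside Lua itself, which are not shown here.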

What do you think?

-- 
Best regards,
 Serge



Re: lua scripts compression

Luiz Henrique de Figueiredo
>I tried to convert scripts into bytecode but they sometimes take more
>space than in source form.

This is unfortunate, but the main goal of precompiling is speed of loading,
not compression. Have you tried stripping debug information (luac -s)?

>Comments and extra whitespace could be removed to save space, also
>many keywords can be replaced with numeric IDs. I think the best
>solution would be to store output of lexer in a compressed form (1
>byte for each token id and also strings or numeric values if needed).

Back in the very old days, before luac, we had something like this. But it
did not help much because most of the time was spent converting real numbers
(we wanted to reduce the load time of large graphics files, which had many
real numbers).
--lhf


Re: lua scripts compression

Gunnar Zötl
In reply to this post by Serge Semashko
SS> is how to store all these scripts so that they take as little space as
SS> possible?

If you want to avoid compression libraries (which would be my first
choice for such a problem...), then maybe something like what the basic
compressors did in the old days would be appropriate: reduce every
symbol to 1 character (or as few as you can afford). You can also do
this with the built-in functions and tables. Then have a library ready
that defines these short symbols for the built-in functions, and load
it before the script. You obviously need some program to compress
your scripts into this format. Together with whitespace compression and
comment removal, this should result in a pretty good compression ratio.
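
A rough sketch of such a compressing program, written in Python purely
for illustration. The alias table, the PRELUDE string and the minify
function are hypothetical names invented for this example. It only
understands single-line comments and quoted strings (no --[[ ]] block
comments or [[ ]] long strings), and it assumes the scripts never
define their own P, S or F, or shadow the built-in names with locals.

```python
import re

# Hypothetical alias table: built-in name -> short symbol.
ALIASES = {"print": "P", "tostring": "S", "string.format": "F"}

# Lua prelude that defines the short symbols; it has to be loaded
# before any minified script runs.
PRELUDE = ";".join(f"{short}={full}" for full, short in ALIASES.items())

_NAME = r"[A-Za-z_][A-Za-z0-9_]*"
PATTERN = re.compile("|".join([
    # quoted string literals are matched first so they are never rewritten
    r'(?P<string>"(?:[^"\\\n]|\\.)*"' + r"|'(?:[^'\\\n]|\\.)*')",
    # single-line comment, together with the whitespace in front of it
    r"(?P<comment>\s*--[^\n]*)",
    # plain or dotted name such as print or string.format
    r"(?P<name>" + _NAME + r"(?:\." + _NAME + r")*)",
    # any run of whitespace, including newlines
    r"(?P<space>\s+)",
]))


def minify(source: str) -> str:
    """Drop comments, collapse whitespace, shorten known built-ins."""
    def replace(m):
        kind, text = m.lastgroup, m.group()
        if kind == "string":
            return text                       # keep literals untouched
        if kind == "comment":
            return ""                         # remove the comment
        if kind == "name":
            return ALIASES.get(text, text)    # shorten if an alias exists
        return " "                            # collapse whitespace
    return PATTERN.sub(replace, source).strip()
```

The stored form is then the minified text, and at run time the prelude
is loaded first, followed by the script. Joining lines with single
spaces relies on Lua not normally needing newlines between statements,
so the output should still be checked by loading it.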

Gunnar



compilation under cygwin

John Passaniti-4
In reply to this post by Luiz Henrique de Figueiredo
A *tiny* note about compilation of the current 5.0 beta under Cygwin.

I was unable to compile Lua without first creating the 'lib' and 'bin'
directories.  Maybe this is a known thing, but it caused me a few seconds of
confusion.






Re: compilation under cygwin

Luiz Henrique de Figueiredo
>A *tiny* note about compilation of the current 5.0 beta under Cygwin.
>
>I was unable to compile Lua without first creating the 'lib' and 'bin'
>directories.  Maybe this is a known thing, but it caused me a few seconds of
>confusion.

Someone mentioned this in a private mail to us. It seems to me that this is
a problem with the unpacker not creating empty directories. Anyway, I think
we'll add code to the top-level Makefile to make sure bin and lib exist. Or
we could make bin and lib not empty :-)
--lhf