Pentium 4 and misaligned doubles

classic Classic list List threaded Threaded
52 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Pentium 4 and misaligned doubles

Rici Lake-2
I was having some trouble understanding a microbenchmark result on Lua 5.1work6
(also noted by Mike Pall) [note 1]

for i = 1, 1e8 do end
        4.09 real         4.07 user         0.00 sys

local t; for i = 1, 1e8 do end
        1.16 real         1.14 user         0.01 sys

The difference is the vm op which is executed 100 million times (full disassembly is below):
  FORLOOP 1, -1  (fast)    FORLOOP 0, -1   (slow)

This didn't appear to happen on 5.0.2, but I eventually managed to get it to; it just happens at a different point on the stack. By varying the number of locals created (and thus the stack index of the FORLOOP operation) from 0 to 64, I observed that the timing has a periodicity of 16. I constructed the following table by taking timing 10^9 iterations and multiplying by the processor speed in GHz to get the approximate number of cycles; it might be out by one or two but the timings are quite repeatable. [Note 2]


dstreg  5.1w6   5.0.2
------  -----   -----
     0    113     38
     1     32     38
     2     32     38
     3     32     38
     4     32     38
     5     32     64
     6     32     56
     7     32    111
     8     32     38
     9     32     38
    10     32     38
    11     32     38
    12     32     38
    13    113     38
    14     48     38
    15     39     38
<repeats, mod 16>

Now, the L1 cache of a Pentium 4 is effectively organized in units of 64 bytes. Consequently, every 16th stack slot contains a (potential) double which cross a cache-line boundary. The for loop in 5.0.2 reads three consecutive stack slots, and writes to the first one; the for loop in 5.1w6 reads three consecutive stack slots, writes to the first one, and then writes to the next consecutive slot; the above timings are completely consistent with that pattern of accesses if it were the case that a Pentium 4 has a penalty for reading misaligned doubles which cross a cache line, and a huge penalty for writing.

Armed with that thought and google, I found the following interesting reference:

<http://www.cygnus-software.com/papers/x86andinfinity.html>

in which you will find (in the reference to "test results") the following interesting chart, which I've abbreviated quite a bit: [Note 3]

Identifier               count      time    Mops/sec  Cycles/op  Slowdown
       Loads and stores - alignment tests
dword aligned src    ,  50000000,  0.04021, 1243.500,    2.252,    1.018 cache unaligned src  ,   2000000,  0.01441,  138.763,   20.178,    9.121 dword aligned dst    ,  50000000,  0.03964, 1261.332,    2.220,    1.003 cache unaligned dst  ,   2000000,  0.05760,   34.725,   80.634,   36.447

And, interestingly, the 20 cycle penalty for cache-unaligned loads and 80 cycle penalty for cache-unaligned stores corresponds amazingly closely to the observed data.

Even more interesting was the fact that Athlon AMD chips don't suffer from this problem. (I just mention that in passing, but it may influence my future buying decisions :)


[Note 1]
The output comes from the following script:

echo $1
/usr/bin/time ~/src/lua-5.1work5/src/lua -e "$1"
echo

[Note 2]
The timings were done on a Pentium 4 2.8 GHz, with both versions of Lua compiled with gcc version 3.3.3 with the options:

  -O3 -fno-crossjumping -fomit-frame-pointer -march=i686

which is so far what has given me the best results on that CPU.

[Note 3]

The referenced paper is mostly about infinities, NaNs and denormalized numbers, and if you use those which much frequency, you'll find the results even more, ummm, interesting. 861 cycles to add two infinities seems just a tad excessive.

I tested both comparison with infinity and adding infinity in Lua; it appears that there is no similar penalty to comparing with infinity (phew! I use that idiom) but the penalty for adding infinity seems to be pretty much as described. (See the third test)

local t; local inf=1/0; for j = 1, 1e8 do local i = inf + j end
       41.49 real        41.30 user         0.07 sys
local t; local inf=1e8; for j = 1, 1e8 do local i = inf + j end
        2.14 real         2.12 user         0.01 sys

In other words, based on the above result that the for loop itself take 31 cycles, it would appear that an addition takes about 28 cycles, unless it happens to be cache unaligned in which case it takes 48 cycles, or unless it happens to be an addition of infinity in which case it takes somewhat over 1000 cycles.

[Note 4]
I also ran these tests with 5.1work6 on Mac OS X with a 1 GHz G4. It exhibited neither of the variations; the for-loop test was consistent regardless of stack index, and the addition test did not appear to be affected by the sum. An empty numeric for loop appears to execute in about 39 cycles; the addition, whether by infinity or by 1e8, takes about 29 additional cycles. Of course, on the PowerPC the stack double-word aligned, so the for-loop test is not a fair comparison.

[Luac disassembly of initial test programs]
luac output: for i = 1, 1e8 do end
        1       [1]     LOADK           0 -1    ; 1
        2       [1]     LOADK           1 -2    ; 100000000
        3       [1]     LOADK           2 -1    ; 1
        4       [1]     FORPREP         0 0     ; to 5
        5       [1]     FORLOOP         0 -1    ; to 5
        6       [1]     RETURN          0 1

luac output: local t; for i = 1, 1e8 do end
        1       [1]     LOADNIL         0 0
        2       [1]     LOADK           1 -1    ; 1
        3       [1]     LOADK           2 -2    ; 100000000
        4       [1]     LOADK           3 -1    ; 1
        5       [1]     FORPREP         1 0     ; to 6
        6       [1]     FORLOOP         1 -1    ; to 6
        7       [1]     RETURN          0 1



Reply | Threaded
Open this post in threaded view
|

Re: Pentium 4 and misaligned doubles

Rici Lake-2
Just to confirm the previous, it occurred to me that in an array, every sixth value will cross a cache boundary. Indeed, empirical results demonstrate that there is an effect:

local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] + 1 end
       13.51 real        13.44 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] + 1 end
       13.54 real        13.49 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] + 1 end
       13.51 real        13.44 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] + 1 end
       13.52 real        13.47 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] + 1 end
       13.51 real        13.45 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] + 1 end
       16.05 real        15.97 user         0.03 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] + 1 end
       13.61 real        13.55 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] + 1 end
       13.63 real        13.58 user         0.00 sys

local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[1] = a[1] end
       10.91 real        10.85 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[2] = a[2] end
       10.88 real        10.83 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[3] = a[3] end
       10.88 real        10.83 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[4] = a[4] end
       10.91 real        10.85 user         0.02 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[5] = a[5] end
       10.88 real        10.83 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[6] = a[6] end
       13.41 real        13.35 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[7] = a[7] end
       10.91 real        10.86 user         0.01 sys
local a = {1,2,3,4,5,6,7,8}; for i = 1, 1e8 do a[8] = a[8] end
       10.86 real        10.82 user         0.01 sys

Note that the constant 6 is probably also misaligned. Also, there is a difference between the layout of Lua 5.0.2 and Lua 5.1w6, so although the periodicity should be the same, it will probably have a different offset.

I should also add that I'm doing these tests with FreeBSD, whose allocator will not artifically cross a 64-byte boundary. (That wasn't a design goal, as far as I know -- the documentation only talks about page boundaries -- but it is a consequence of rounding all small allocations up to a power of two, and then assigning a whole page to objects of the same size.) I believe the Linux allocator is more memory conservative, and may allocate a small object across a 64-byte boundary.


Reply | Threaded
Open this post in threaded view
|

Re: Pentium 4 and misaligned doubles

Rici Lake-2
Note that the constant 6 is probably also misaligned. Also, there is a difference between the layout of Lua 5.0.2 and Lua 5.1w6, so although the periodicity should be the same, it will probably have a different offset.

And actually, that was the sole cause of the difference: (Now I'm puzzled again.)

local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[1] = a[16] + 1 end
        1.35 real         1.33 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[2] = a[1] + 1 end
        1.35 real         1.33 user         0.01 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[3] = a[2] + 1 end
        1.35 real         1.34 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[4] = a[3] + 1 end
        1.35 real         1.34 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[5] = a[4] + 1 end
        1.35 real         1.34 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[6] = a[5] + 1 end
        1.34 real         1.31 user         0.01 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[7] = a[6] + 1 end
        1.36 real         1.34 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[8] = a[7] + 1 end
        1.36 real         1.35 user         0.00 sys
local a = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; for i = 1, 1e7 do a[9] = a[8] + 1 end
        1.35 real         1.33 user         0.01 sys


Reply | Threaded
Open this post in threaded view
|

Re: Pentium 4 and misaligned doubles

Mike Pall-57
In reply to this post by Rici Lake-2
Hi,

Rici Lake wrote:
> [...] a Pentium 4 has a penalty for reading misaligned doubles 
> which cross a cache line, and a huge penalty for writing.

The results on a PIII are comparable, but with a periodicity
of 8 and two bumps ... something like: nnnnnXnXnnnnnXnX

I noticed that the results are a lot more pronounced in LuaJIT
because the dispatch overhead doesn't hide the unaligned penalties
anymore. So I investigated a bit into a proper solution:

Of course compiling with -malign-double is the easiest thing to
solve this with GCC. Alas, this breaks the x86 ABI. I think this
doesn't matter since the whole Lua core never passes structures
or unions that contain doubles to C library functions or back.

But still ... the recommended way to solve this is to add
  __attribute__ ((aligned(8)))
to either lua_Number or the Value union. Strangely enough, this
doesn't work with GCC 3.3.5. I think this is a bug, but maybe I'm
just misreading the docs.

The only way I could make it work is with:

  typedef struct lua_TValue {
    TValuefields;
  } __attribute__ ((aligned(16))) TValue;

A bit awkward, but solves both the stack alignment and the array
alignment problem.

The net effect is that stack slots and array slots of tables grow
from 12 to 16 bytes and hash slots of tables grow from 28 to 32
bytes. Since multiplying/dividing by 16 or 32 is just a plain
shift, other parts of the code are faster, too.

Maybe it would be a good idea to make this a user definable
setting in luaconf.h (near LUA_NUMBER, to make it clear this
is only relevant for doubles, not floats):

#if defined(__GNUC__) && defined(__i386)
#define LUA_TVALUE_ALIGN	__attribute__ ((aligned(16)))
#else
#define LUA_TVALUE_ALIGN
#endif


On a related note: the lua_number2int() optimization should be
turned off if __SSE2__ is defined (which is the case with
-march=pentium4). SSE2 offers a direct/faster way to truncate
doubles to ints (cvttsd2si reg32, xmm64/mem64). And GCC knows
how to use it (unless you override it with inline assembler).

Bye,
     Mike

Reply | Threaded
Open this post in threaded view
|

Re: Pentium 4 and misaligned doubles

Rici Lake-2

On 16-Aug-05, at 9:35 AM, Mike Pall wrote:

Of course compiling with -malign-double is the easiest thing to
solve this with GCC. Alas, this breaks the x86 ABI. I think this
doesn't matter since the whole Lua core never passes structures
or unions that contain doubles to C library functions or back.

Doesn't that change the stack alignment of double arguments? That would affect programs that called, for example, lua_pushnumber(), no? Or am I misreading something?

The only way I could make it work is with:

  typedef struct lua_TValue {
    TValuefields;
  } __attribute__ ((aligned(16))) TValue;

A bit awkward, but solves both the stack alignment and the array
alignment problem.

Wouldn't aligned(8) be sufficient? (At least, that wouldn't be lying to the compiler about the results of malloc() on x86 :) )

On a related note: the lua_number2int() optimization should be
turned off if __SSE2__ is defined (which is the case with
-march=pentium4).

I had to specify -msse2 for this to work on gcc 3.3.3

PS: Before anyone else notices that I'm fond of making an idiot of myself in public, it is quite clear why array alignment isn't important (but constant alignment is). The GETTABLE/SETTABLE ops (and the equivalent API calls) copy the table data onto the stack with a union copy (see the setobj macro in lobject.h), so the fp unit is presumably not involved. The key object, on the other hand, is not copied (if the constant is in the acceptable range for RK operands) and is then used directly as a number, so it's alignment is important.


Reply | Threaded
Open this post in threaded view
|

Microbenching table lookup (Was Re: Pentium 4 and misaligned doubles)

Rici Lake-2

On 16-Aug-05, at 12:13 PM, Rici Lake wrote:

I had to specify -msse2 for this to work on gcc 3.3.3

And it did make a bit of a difference, but only in some possibly unusual cases.

The pattern here is:

local t; local key=$1; local a = {[key] = 3}; for i = 1, 1e7 do local j = a[key] end

Full results, not converted to cycles (only user time), and without subtracting the cost of the for loop itself (about 0.12)

                                    normalization patch (note 6)
key          no sse   sse    note     no sse    sse
1             1.22    1.17    (1)      1.15     1.04
27            1.22    1.19    (2)      1.16     1.03
2^31-1        1.21    1.20             1.15     1.04
2^31          5.67    1.12    (3)      5.64     1.11
1e200         5.65    1.13             5.63     1.11
1/0           9.64    5.19    (4)      5.63     1.13
2^-1022       1.12    1.13             1.07     1.11
2^-1023       4.67    7.50    (5)      1.07     8.23
true          0.52    0.50             0.52     0.50
'a'           0.45    0.45             0.45     0.44
{}            0.83    0.80             0.83     0.80
function()end 0.82    0.81             0.82     0.80

Conclusion: Lua is optimised for string lookups in hash tables and integer lookups in the array part. That seems appropriate to me. The penalty for use of an object is noticeable but not outrageous. Floating point keys are problematic in certain cases, but probably not common cases. In any event, replacing the floating point normalization with an explicit check for 0 is probably an improvement.

Notes:

0) The test does not account for hash collisions. Floating point keys are normalized by adding 1 to the key, so considerable precision will be lost for very small keys; in particular, all keys in the range (-2^-52, 2^-52) hash to the same place.

1) The pattern with key=1 does not create a as an array, so the lookup is in the hash part of a. Contrast:

local t; local key=1; local a = {3}; for i = 1, 1e7 do local j = a[key] end
        0.55 real         0.53 user         0.01 sys
local t; local key=1; local a = {[key] = 3}; for i = 1, 1e7 do local j = a[key] end
        1.22 real         1.22 user         0.00 sys

2) The SSE2 conversion does not seem to be measurably faster in the case where the conversion does not overflow (but see below).

3) 2^31 (and 1e200) cause integer overflow on double-to-int conversion. My guess is that the SSE2 instructions avoid a penalty in this case.

4) This is the penalty for compuations with infinity. Apparently, SSE2 is used (if available) for the initial cast to integer, but not for the normalizing addition of 1 in the case that the hash is computed from the floating point representation.

5) 2^-1023 is denormalized. This seems to trigger a slow-down with the SSE2 conversion; at least, that's the only explanation I can come up with.

6) A considerable amount of the slowdown in unusual cases comes from the normalization step (n += 1) in hashnum. I replaced this with an explicit test for 0:

  if (n == 0) return hashnode(t, 0);

This will also avoid hash collisions on very small keys. While this simple-minded benchmark cannot be considered more than an indication, it seems that this patch improves performance in all cases except denormalised values with SSE2 conversion.



Reply | Threaded
Open this post in threaded view
|

Re: Pentium 4 and misaligned doubles

Mike Pall-57
In reply to this post by Rici Lake-2
Hi,

Rici Lake wrote:
> >Of course compiling with -malign-double is the easiest thing to
> >solve this with GCC. Alas, this breaks the x86 ABI. I think this
> >doesn't matter since the whole Lua core never passes structures
> >or unions that contain doubles to C library functions or back.
> 
> Doesn't that change the stack alignment of double arguments? That would 
> affect programs that called, for example, lua_pushnumber(), no? Or am I 
> misreading something?

The calling conventions do not change. Yes, this means a function
with (double, int, double) parameters always receives one of the
doubles unaligned. But interestingly freestanding doubles (static
or locals) are always aligned. IMHO -malign-double only affects
the x86 alignment rules for doubles within structures or unions.

> >The only way I could make it work is with:
> >
> >  typedef struct lua_TValue {
> >    TValuefields;
> >  } __attribute__ ((aligned(16))) TValue;
> >
> >A bit awkward, but solves both the stack alignment and the array
> >alignment problem.
> 
> Wouldn't aligned(8) be sufficient? (At least, that wouldn't be lying to 
> the compiler about the results of malloc() on x86 :) )

Umm, yes, this is sufficient. Should've checked twice. It just
doesn't work for a union or a non-structure typedef it seems.

> >On a related note: the lua_number2int() optimization should be
> >turned off if __SSE2__ is defined (which is the case with
> >-march=pentium4).
> 
> I had to specify -msse2 for this to work on gcc 3.3.3

Well, not with gcc 3.3.5:

$ echo '' | gcc -march=pentium3 -E - -dM | grep SSE
#define __SSE__ 1
$ echo '' | gcc -march=pentium4 -E - -dM | grep SSE
#define __SSE2__ 1
#define __SSE__ 1

And yes, it does use the SSE/SSE2 ops in these cases, too:

$ echo 'void foo(double x, int *a) { *a = (int)x; }' >tmp.c
$ gcc -march=pentium4 -O3 -S -o - tmp.c | grep cvtt
        cvttsd2si       8(%ebp), %eax

So it still seems to be safe to say that the inline assembly
replacement should only be used if !defined(__SSE2__) for
both of our compiler versions.

> PS: Before anyone else notices that I'm fond of making an idiot of 
> myself in public, it is quite clear why array alignment isn't important 
> (but constant alignment is). The GETTABLE/SETTABLE ops (and the 
> equivalent API calls) copy the table data onto the stack with a union 
> copy (see the setobj macro in lobject.h), so the fp unit is presumably 
> not involved.

Even if it was involved it would cause load/store or store/load
forwarding stalls in most cases. Copying potentially unaligned
and/or non-numeric data with FP ops is a bad idea (I tried).

> The key object, on the other hand, is not copied (if the 
> constant is in the acceptable range for RK operands) and is then used 
> directly as a number, so it's alignment is important.

Thankfully non-integer numeric keys are rare.

Bye,
     Mike

Reply | Threaded
Open this post in threaded view
|

Simple Lua for scripts

Alain
In reply to this post by Rici Lake-2
Hi,

I want to include Lua scripts in screen objects. My concern is that I want to limit accessibility to too many LUA commands, I want to limit the commands that he can use. Something like that:

1) the user inputs the script text in an object editing program. After editing, this program checks the script and if accepatable, stores it in the system database.

2) At runtime, the script is sent to lua unmodified for normal execution.

Step one can also be usefull for small sintax errors...

I searched the list history and found some references to GENTLE but I don't know if it is appropriate. Besides, it has a non comercial licence and I could not find the price of the comercial licence.

thanks for any help,
Alain


Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Aaron Brown-2
Alain wrote:

> I want to include Lua scripts in screen objects. My
> concern is that I want to limit accessibility to too many
> LUA commands, I want to limit the commands that he can
> use.

If I understand what you're trying to do, it can be done
easily in stock Lua.  First use loadstring() to turn the
user's code into a function (this is where you catch any
syntax errors).  Then use setfenv() to keep the function
from accessing any dangerous globals.  This second step is
explained in section 14.3 of Programming in Lua:

<http://www.lua.org/pil/14.3.html>

-- 
Aaron

Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Alain


Aaron Brown escreveu:
Alain wrote:
I want to include Lua scripts in screen objects. My
concern is that I want to limit accessibility to too many
LUA commands, I want to limit the commands that he can
use.

If I understand what you're trying to do, it can be done
easily in stock Lua.  First use loadstring() to turn the
user's code into a function (this is where you catch any
syntax errors).  Then use setfenv() to keep the function
from accessing any dangerous globals.  This second step is
explained in section 14.3 of Programming in Lua:
<http://www.lua.org/pil/14.3.html>

Ok, that was very usefull information and I will probably use both. But what I intended is something more restrictive. I want a syntax checker that forbids access to many lua functions, just saying ok/notok. I am thinking of a lexical analysis with only minimal lua syntax or something like that...

The problem is that LUA is a powerfull language, and I don't want users with all that power because I am the one who will have to give support to the program. That means that if the user puts some statement in his script he can do more that was intended fot him to do.

Forgive-me if it is difficult to make myself clear, but some concepts here are not very familiar to me.

Alain

Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Ben Sunshine-Hill
That's accomplished with setfenv. 

On 8/22/05, Alain <[hidden email]> wrote:
> 
> 
> Aaron Brown escreveu:
> > Alain wrote:
> >>I want to include Lua scripts in screen objects. My
> >>concern is that I want to limit accessibility to too many
> >>LUA commands, I want to limit the commands that he can
> >>use.
> >
> > If I understand what you're trying to do, it can be done
> > easily in stock Lua.  First use loadstring() to turn the
> > user's code into a function (this is where you catch any
> > syntax errors).  Then use setfenv() to keep the function
> > from accessing any dangerous globals.  This second step is
> > explained in section 14.3 of Programming in Lua:
> > <http://www.lua.org/pil/14.3.html>
> 
> Ok, that was very usefull information and I will probably use both. But
> what I intended is something more restrictive. I want a syntax checker
> that forbids access to many lua functions, just saying ok/notok. I am
> thinking of a lexical analysis with only minimal lua syntax or something
> like that...
> 
> The problem is that LUA is a powerfull language, and I don't want users
> with all that power because I am the one who will have to give support
> to the program. That means that if the user puts some statement in his
> script he can do more that was intended fot him to do.
> 

That's accomplished with setfenv. Just only put the safe functions in
the environment that is set for the function.

Ben


Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Alain
Thanks Ben for the information, but that is not enough. It does protect a function to alter some things, but a skilled (auto-nominated) can use other ways to access what he is not suposed to use.

I what I really am looking for is validating a minimal Lua-subset and avoiding commands unknown by me to get into the scripts.

I am Ssorry to ask the same quastion 3 times, but I feel that I did not make myself clear enough :(

Alain

Ben Sunshine-Hill escreveu:
That's accomplished with setfenv.
On 8/22/05, Alain <[hidden email]> wrote:


Aaron Brown escreveu:

Alain wrote:

I want to include Lua scripts in screen objects. My
concern is that I want to limit accessibility to too many
LUA commands, I want to limit the commands that he can
use.

If I understand what you're trying to do, it can be done
easily in stock Lua.  First use loadstring() to turn the
user's code into a function (this is where you catch any
syntax errors).  Then use setfenv() to keep the function
from accessing any dangerous globals.  This second step is
explained in section 14.3 of Programming in Lua:
<http://www.lua.org/pil/14.3.html>

Ok, that was very usefull information and I will probably use both. But
what I intended is something more restrictive. I want a syntax checker
that forbids access to many lua functions, just saying ok/notok. I am
thinking of a lexical analysis with only minimal lua syntax or something
like that...

The problem is that LUA is a powerfull language, and I don't want users
with all that power because I am the one who will have to give support
to the program. That means that if the user puts some statement in his
script he can do more that was intended fot him to do.



That's accomplished with setfenv. Just only put the safe functions in
the environment that is set for the function.

Ben



Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Romulo Bahiense
Perhaps you are looking for something like:

-- begin of code

-- userCode should be the user script, loaded from somewhere
local userCode = [[
a = 5
b = 3
someFunction()
for i=1,10 do anotherFunction(i) end
math.sin = 'my own evil string'
string.find = 'another one'
]]

local protectedEnv = {
    anotherFunction = print,
    someFunction = function() print 'Yeah' end,
math = {sin = math.sin, log = math.log, min = math.min, max = math.max},
    string = {find = string.find, sub=string.sub, ....},
    ....
}

local f,err = loadstring(userCode)
if not f then -- syntax error
	error(err)
end
setfenv(f, protectedEnv)
f()
> yeah
> 1
> 2
> 3
> 4
> ...
> 10

print(protectedEnv.a,protectedEnv.b, protectedEnv.math.sin, protectedEnv.string.find, math.sin, string.find)
> 5	3	my own evil string	another one	function: ...	function: ...

-- end of code

PS: This code is untested ;)

--rb

Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Ben Sunshine-Hill
In reply to this post by Alain
On 8/22/05, Alain <[hidden email]> wrote:
> Thanks Ben for the information, but that is not enough. It does protect
> a function to alter some things, but a skilled (auto-nominated) can use
> other ways to access what he is not suposed to use.
> 
> I what I really am looking for is validating a minimal Lua-subset and
> avoiding commands unknown by me to get into the scripts.
> 
> I am Ssorry to ask the same quastion 3 times, but I feel that I did not
> make myself clear enough :(

I don't think you quite understand. In Lua, the only way to access or
change things outside the "sandbox" is through functions. If you don't
expose any functions that give users the power to change stuff, they
won't be able to change stuff no matter how "auto-nominated" (?) they
are. There is simply no way for a script to add functions which do
things the script couldn't already do.

Here's an example: WebLua, available at
http://doris.sourceforge.net/lua/weblua.php . This script executes
arbitrary code server-side, and as far as I know doesn't need anything
for full security except a carefully chosen set of global functions
and normal limits on CPU time and memory usage.

You'll find that when it comes to sandboxing, simple solutions are
usually more secure than complicated ones. Trying to perform static
analysis of a script's security is more or less impossible.


Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Alain
He I come to explain myselt again: I don't want sandboxing. I want a program that allow be to test if the user is using lua functions *other*than*the*ones*I*allow*him*to*use* not even what most lua programers consider *normal* to a lua program.

For this I believe I need something called a lexical analyser, so that I can allow only a sunset of normal LUA syntax.

thanks for all the help so far...
Alain

Ben Sunshine-Hill escreveu:
On 8/22/05, Alain <[hidden email]> wrote:

Thanks Ben for the information, but that is not enough. It does protect
a function to alter some things, but a skilled (auto-nominated) can use
other ways to access what he is not suposed to use.

I what I really am looking for is validating a minimal Lua-subset and
avoiding commands unknown by me to get into the scripts.

I am Ssorry to ask the same quastion 3 times, but I feel that I did not
make myself clear enough :(


I don't think you quite understand. In Lua, the only way to access or
change things outside the "sandbox" is through functions. If you don't
expose any functions that give users the power to change stuff, they
won't be able to change stuff no matter how "auto-nominated" (?) they
are. There is simply no way for a script to add functions which do
things the script couldn't already do.

Here's an example: WebLua, available at
http://doris.sourceforge.net/lua/weblua.php . This script executes
arbitrary code server-side, and as far as I know doesn't need anything
for full security except a carefully chosen set of global functions
and normal limits on CPU time and memory usage.

You'll find that when it comes to sandboxing, simple solutions are
usually more secure than complicated ones. Trying to perform static
analysis of a script's security is more or less impossible.



Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Ben Sunshine-Hill
On 8/22/05, Alain <[hidden email]> wrote:
> He I come to explain myselt again: I don't want sandboxing. I want a
> program that allow be to test if the user is using lua functions
> *other*than*the*ones*I*allow*him*to*use* not even what most lua
> programers consider *normal* to a lua program.
> 
> For this I believe I need something called a lexical analyser, so that I
> can allow only a sunset of normal LUA syntax.
> 

This is the "impossible" static analysis I mentioned. As an example of
why this is impossible, here's a script that deletes /etc/passwd:

function dosomethingbad()
    (_G["o".."s"][table.concat { [1] = "re", [3] = "ove", [2] = "m"
}])("/etc/passwd")
end

...and does so without explicitly mentioning os.remove(). And this is
a simple example; what about a function that decrypted a PGPed script
and executed it? There just is no way to do what you want to do.
That's why sandboxing exists.

A "lexical analyzer" is a system which tokenizes input. It is
ineffective at providing security, as in the example I mentioned
above.

Ben


Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Adam D. Moss
In reply to this post by Alain
Alain wrote:
He I come to explain myselt again: I don't want sandboxing. I want a program that allow be to test if the user is using lua functions *other*than*the*ones*I*allow*him*to*use* not even what most lua programers consider *normal* to a lua program.

For this I believe I need something called a lexical analyser, so that I can allow only a sunset of normal LUA syntax.

I think what you're basically being told is that the way
you're asking to do this isn't really the way you want to
do it.  You can't reliably guarantee through simply lexical
analysis that the user is only calling functions that you
intend her to call.

boopy = "tem"
os["sys"..boopy]("rm -rf /")

goodfunction = evilfunction
goodfunction()

evilfunction = goodfunction
evilfunction()

etc.

Yes, if you really want to lex lua you can use one of the
lua lexers/tokenisers, but you'd have to accept that the
results are going to be fairly deeply unreliable, unlike
the runtime sandboxing.

--adam
--
Adam D. Moss   -   [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Rici Lake-2
In reply to this post by Alain

On 22-Aug-05, at 5:19 PM, Alain wrote:

He I come to explain myselt again: I don't want sandboxing. I want a program that allow be to test if the user is using lua functions *other*than*the*ones*I*allow*him*to*use* not even what most lua programers consider *normal* to a lua program.

You've been told how to do this, several times.

For this I believe I need something called a lexical analyser, so that I can allow only a sunset of normal LUA syntax.

Perhaps you should think about the possibility that your belief is incorrect.

A lexical analyzer will not tell you which Lua functions a program is using. However, you can arrange for only the functions you truly trust to be available in the Lua runtime. If the function is not loaded into the Lua runtime (or if it is deleted) then it is no longer available. Period.

It is perhaps slightly surprising that sandboxing Lua is quite simple, compared with other scripting languages. I hope that this feature will continue to receive high priority as the Lua packaging system develops.

Rici


Reply | Threaded
Open this post in threaded view
|

RE: Simple Lua for scripts

Nick Trout
In reply to this post by Alain

> On Behalf Of Rici Lake
>A lexical analyzer will not tell you which Lua functions a program is 
using. However, you can arrange for only the functions you truly trust 
to be available in the Lua runtime. If the function is not loaded into 
the Lua runtime (or if it is deleted) then it is no longer available. 
Period.
> It is perhaps slightly surprising that sandboxing Lua is quite simple,

compared with other scripting languages. I hope that this feature will 
continue to receive high priority as the Lua packaging system develops.

Sandboxing is indeed very easy. The idea is that you prevent rather than
cure. I've got a sandboxed version of Lua running here:

http://doris.sourceforge.net/lua/weblua.php

Alain, I do agree with you somewhat, that it would be nice to be able to
detect more of what a user is doing at compile time. I'm thinking more
in terms of misspelled objects etc calls rather than security issues,
which aren't impossible to detect, but current require parsing of the VM
code (and not all instances are easy to detect, as per the previous
examples).

This is a common complaint for people scripting in Lua (well in dynamic
languages, not just Lua) that easy mistakes aren't picked up until
runtime, whereas a statically compiled language would have picked this
up in the compile or link phase. This isn't such a problem if you have a
little script that executes and exits in 2 seconds, but if you have a
larger script that drives something and takes 5 minutes to hit a
spelling mistake, it can be irritating.

Work on "Lua macros" might fix both these issues. In your case you could
look for calls and box them, and my case I could check global object
lookups, where possible. It would also be possible to restrict the
syntax, where perhaps "extra syntax" is not necessary. 

So another option would be to look at the LHFs ltokens library:

	http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/

Nick






Reply | Threaded
Open this post in threaded view
|

Re: Simple Lua for scripts

Alain
In reply to this post by Adam D. Moss
Ok, those two last answers from Adam and Ben come close to explaining 1) something that I didn't understand, 2) why I couln't explainmyself.

Security is an issue, but that was not what I am addressing here. NOW I am very much concerned!!!

What I really wanted is this: a much simpler scripting language than LUA, but I want LUA to execute it. Building a language is too complicated and LUA does it well. But Lua is too complex to leave in the hands of normal users, this is why: they will write things that don't work, then they will call ME to fix it.

If you think that I am on the wrong track, please say so. I have been following this Lua list for months, but I am not sure of anything anymore

Alain

Adam D. Moss escreveu:
Alain wrote:

He I come to explain myselt again: I don't want sandboxing. I want a program that allow be to test if the user is using lua functions *other*than*the*ones*I*allow*him*to*use* not even what most lua programers consider *normal* to a lua program.

For this I believe I need something called a lexical analyser, so that I can allow only a sunset of normal LUA syntax.


I think what you're basically being told is that the way
you're asking to do this isn't really the way you want to
do it.  You can't reliably guarantee through simply lexical
analysis that the user is only calling functions that you
intend her to call.

boopy = "tem"
os["sys"..boopy]("rm -rf /")

goodfunction = evilfunction
goodfunction()

evilfunction = goodfunction
evilfunction()

etc.

Yes, if you really want to lex lua you can use one of the
lua lexers/tokenisers, but you'd have to accept that the
results are going to be fairly deeply unreliable, unlike
the runtime sandboxing.

--adam

123