Lua 5.4.2 (lua_newstate)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Hi,

More one suggestion to Lua code:

at lua_newstate function, the loop for initializing a fixed array,
can be securely changed by memset.

diff --git a/lstate.c b/lstate.c
index 42274292..153b000b 100644
--- a/lstate.c
+++ b/lstate.c
@@ -392,7 +392,7 @@ LUA_API lua_State *lua_newstate (lua_Alloc f, void *ud) {
   g->gcstepsize = LUAI_GCSTEPSIZE;
   setgcparam(g->genmajormul, LUAI_GENMAJORMUL);
   g->genminormul = LUAI_GENMINORMUL;
-  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
+  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);  
   if (luaD_rawrunprotected(L, f_luaopen, NULL) != LUA_OK) {
     /* memory allocation error: free partial state */
     close_state(L);

regards,
Ranier Vilela

avoid_loop_for_init_fixed_array.patch (758 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Luiz Henrique de Figueiredo
> -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
> +  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);

No. The C standard says that NULL does not need to have all bits zero.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Em seg., 9 de nov. de 2020 às 13:48, Luiz Henrique de Figueiredo <[hidden email]> escreveu:
> -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
> +  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);

No. The C standard says that NULL does not need to have all bits zero.
Change 0 by NULL, after the memset, the memory has the same content.
-  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
+  memset(g->mt, NULL, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);

Internally, memset has its own loop, but this is very highly optimized for the platform used.

regards,
Ranier Vilela
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
In reply to this post by Ranier Vilela-2
Em seg., 9 de nov. de 2020 às 13:28, Ranier Vilela <[hidden email]> escreveu:
Hi,

More one suggestion to Lua code:

at lua_newstate function, the loop for initializing a fixed array,
can be securely changed by memset.

diff --git a/lstate.c b/lstate.c
index 42274292..153b000b 100644
--- a/lstate.c
+++ b/lstate.c
@@ -392,7 +392,7 @@ LUA_API lua_State *lua_newstate (lua_Alloc f, void *ud) {
   g->gcstepsize = LUAI_GCSTEPSIZE;
   setgcparam(g->genmajormul, LUAI_GENMAJORMUL);
   g->genminormul = LUAI_GENMINORMUL;
-  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
+  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);  
   if (luaD_rawrunprotected(L, f_luaopen, NULL) != LUA_OK) {
     /* memory allocation error: free partial state */
     close_state(L);
Another place:

diff --git a/lundump.c b/lundump.c
index 5aa55c44..46ed812e 100644
--- a/lundump.c
+++ b/lundump.c
@@ -191,10 +191,9 @@ static void loadConstants (LoadState *S, Proto *f) {
 static void loadProtos (LoadState *S, Proto *f) {
   int i;
   int n = loadInt(S);
-  f->p = luaM_newvectorchecked(S->L, n, Proto *);
   f->sizep = n;
-  for (i = 0; i < n; i++)
-    f->p[i] = NULL;
+  f->p = luaM_newvectorchecked(S->L, n, Proto *);
+  memset(f->p, NULL, sizeof(Proto *) * (size_t) n);
   for (i = 0; i < n; i++) {
     f->p[i] = luaF_newproto(S->L);
     luaC_objbarrier(S->L, f, f->p[i]);

regards,
Ranier Vilela

avoid_loop_for_init_fixed_array.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Luiz Henrique de Figueiredo
In reply to this post by Ranier Vilela-2
> Change 0 by NULL

NULL may not fit into a byte.
NULL and 0 in a pointer context are special values that are converted
to whatever representation is used for NULL pointers. The C programmer
has no control about it. The only way would be to declare a pointer
variable and set it to NULL and them copy from that, hardly worth the
trouble and hardly any clearer. The mt array is quite small.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Francisco Olarte
In reply to this post by Ranier Vilela-2
Rainier:

On Mon, Nov 9, 2020 at 5:53 PM Ranier Vilela <[hidden email]> wrote:
> Em seg., 9 de nov. de 2020 às 13:48, Luiz Henrique de Figueiredo <[hidden email]> escreveu:
>> > -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
>> > +  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);
>> No. The C standard says that NULL does not need to have all bits zero.
> Change 0 by NULL, after the memset, the memory has the same content.
> -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
> +  memset(g->mt, NULL, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);
> Internally, memset has its own loop, but this is very highly optimized for the platform used.

The second parameter to memset is an int. NULL must cast to 0 as an
int. But using it there is grossly missleanding.

OTOH, what Luis tries to tell you is that NULL can be represented as
something different than a 0 filled byte array. As long as it cast to
0 everything is fine.

Granted, all the platforms I've used use an all-zeroes representation,
but lua does not have the luxury of assuming that.

Francisco Olarte.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Em seg., 9 de nov. de 2020 às 14:09, Francisco Olarte <[hidden email]> escreveu:
Rainier:

On Mon, Nov 9, 2020 at 5:53 PM Ranier Vilela <[hidden email]> wrote:
> Em seg., 9 de nov. de 2020 às 13:48, Luiz Henrique de Figueiredo <[hidden email]> escreveu:
>> > -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
>> > +  memset(g->mt, 0, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);
>> No. The C standard says that NULL does not need to have all bits zero.
> Change 0 by NULL, after the memset, the memory has the same content.
> -  for (i=0; i < LUA_NUMTAGS; i++) g->mt[i] = NULL;
> +  memset(g->mt, NULL, sizeof(struct Table *) * (size_t) LUA_NUMTAGS);
> Internally, memset has its own loop, but this is very highly optimized for the platform used.

The second parameter to memset is an int. NULL must cast to 0 as an
int. But using it there is grossly missleanding.

OTOH, what Luis tries to tell you is that NULL can be represented as
something different than a 0 filled byte array. As long as it cast to
0 everything is fine.

Granted, all the platforms I've used use an all-zeroes representation,
but lua does not have the luxury of assuming that.
I never see any platform that NULL is not all-zeroes representation.

Ranier Vilela
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Francisco Olarte
Rainier:

On Mon, Nov 9, 2020 at 6:22 PM Ranier Vilela <[hidden email]> wrote:
> Em seg., 9 de nov. de 2020 às 14:09, Francisco Olarte <[hidden email]> escreveu:

>> Granted, all the platforms I've used use an all-zeroes representation,
>> but lua does not have the luxury of assuming that.

> I never see any platform that NULL is not all-zeroes representation.

As I told you I do not see nor have seen any. But Lua team strives for
C99 compatibility, IIRC. So they have to assume that they can hit it.

Francisco Olarte.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Andrew Gierth
In reply to this post by Ranier Vilela-2
>>>>> "Ranier" == Ranier Vilela <[hidden email]> writes:

 Ranier> at lua_newstate function, the loop for initializing a fixed
 Ranier> array, can be securely changed by memset.

In addition to the other comments, note that modern compilers will
transform assignment loops into memset calls or inline implementations
of memset if circumstances suggest it will be both allowable and
efficient to do so.

--
Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Antonin Décimo
In reply to this post by Ranier Vilela-2
> I never see any platform that NULL is not all-zeroes representation.

Hi!

The infamous comp.lang.c Frequently Asked Questions at
http://c-faq.com has a subsection on null pointers in C
which I find helpful to clarify some C subtleties.
http://c-faq.com/null/

In particular, there's one entry discussing architectures where the
internal representation of the null pointer isn't zero bytes.
http://c-faq.com/null/machexamp.html

One catchy thing that I discovered with this FAQ is that if you
initialize some data containing pointers by writing zero bytes over
it, well you're not *for these architectures* initializing the
pointers to the null pointer (if you were a hacker in the '80s).

Best regards,

-- Antonin
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ką Mykolas
In reply to this post by Ranier Vilela-2
On Mon, Nov 9, 2020 at 7:22 PM Ranier Vilela <[hidden email]> wrote:
> I never see any platform that NULL is not all-zeroes representation.

Well, Antonin mentioned obscure machines, but check out this LLVM
thread [1] about non-zero NULL on AMD GCN (OpenCL system).
In terms of time, it is not as old and obscure, just... behaves
differently on different address spaces.

[1] https://reviews.llvm.org/D26196
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ką Mykolas
On Tue, Nov 10, 2020 at 2:38 AM Ką Mykolas <[hidden email]> wrote:

>
> On Mon, Nov 9, 2020 at 7:22 PM Ranier Vilela <[hidden email]> wrote:
> > I never see any platform that NULL is not all-zeroes representation.
>
> Well, Antonin mentioned obscure machines, but check out this LLVM
> thread [1] about non-zero NULL on AMD GCN (OpenCL system).
> In terms of time, it is not as old and obscure, just... behaves
> differently on different address spaces.
>
> [1] https://reviews.llvm.org/D26196

Also, there is another argument to that, from the engineering perspective,
address 0 do sound like perfectly valid address, pointing exactly at
the the beginning of available memory slab.
Thus, thinking of 0 as de facto NULL value, does not make much sense,
because NULL should not be dereferencable.

On that analogy, could You imagine nil being 0? :}
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
In reply to this post by Ką Mykolas
Em seg., 9 de nov. de 2020 às 21:39, Ką Mykolas <[hidden email]> escreveu:
On Mon, Nov 9, 2020 at 7:22 PM Ranier Vilela <[hidden email]> wrote:
> I never see any platform that NULL is not all-zeroes representation.

Well, Antonin mentioned obscure machines, but check out this LLVM
thread [1] about non-zero NULL on AMD GCN (OpenCL system).
In terms of time, it is not as old and obscure, just... behaves
differently on different address spaces.

[1] https://reviews.llvm.org/D26196
Ok, Makes sense to not use 0.
In fact, I never use 0 with pointers, but I've seen a lot of code like this.
Including IUP Lua.
Why not memset with (int) NULL, wouldn't work?

regards,
Ranier Vilela
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Francisco Olarte
On Tue, Nov 10, 2020 at 9:53 AM Ranier Vilela <[hidden email]> wrote:
> Ok, Makes sense to not use 0.
> In fact, I never use 0 with pointers, but I've seen a lot of code like this.
> Including IUP Lua.
> Why not memset with (int) NULL, wouldn't work?

Because memset uses whatever you pass it to fill a CHAR array ( you
pass an int because historically many machines did not pass char
arguments, they widened it to int because they used a stack position,
and you could only push something as wide as a register, which
normally was an int as it is supposed to be the fastest ).

So when filling with null using a loop indexing a POINTER array you
write, in x86, four bytes each times. Memset takes the 32 bit 0 int,
extracts the low order 0 byte and filles with it.
If NULL were 0x12345678 a pointer loop will fill mem with 78 56 34 12
78 56 34 12 78 56 34 12 , a memset will just put 78 78 78 78 78 78 78
78 78 .

You should normally not skip the ( pretty weak in C ) type system
using memsets/bzero/wmemset and all their friends unless you have
measured and are pretty sure of what you are doing. And  sometimes you
may even make your code slower. Memset needs to be a char-aligned
function, working with char aligned destinations. If you are using
intrinsics a good optimizer may notice you are filling a
pointer-aligned area, but with the pointer loop it will surely notice
it.

Francisco Olarte.
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Roberto Ierusalimschy
In reply to this post by Ranier Vilela-2
> Ok, Makes sense to not use 0.
> In fact, I never use 0 with pointers, but I've seen a lot of code like this.
> Including IUP Lua.
> Why not memset with (int) NULL, wouldn't work?

The reality according to ISO C is subtle. 0 *is* the null pointer,
when it is a constant properly converted to a pointer:

  6.3.2.3 Pointers
  [...]
  3 An integer constant expression with the value 0, or such an
    expression cast to type void *, is called a null pointer constant.(55)
    If a null pointer constant is converted to a pointer type, the
    resulting pointer, called a null pointer, is guaranteed to compare
    unequal to a pointer to any object or function.
  (55) The macro NULL is defined in <stddef.h> (and other headers) as
       a null pointer constant; see 7.17.

So, the "official" NULL in C is actually 0, NULL is only a macro.

The subtlelty is the cast: A null pointer is *a null pointer constant
converted to a pointer type*. A 0 converted to a float does not have
a zero representation; similarly, a constant 0 converted to a pointer
may not have a zero representation. And only a constant 0 converted
to a pointer is guaranteed to be NULL.

memset "fill[s] memory with a constant byte", not with a constant
pointer. So, the 0 (or NULL) given to it will never be a pointer,
and cannot properly represent the NULL pointer in an arbitrary
arquitecture, only in arquitectures where NULL is represented by
the same byte repeated multiple times. (In an 64-bit arquitecture
where the NULL is represented internally as 0xABABABABABABABAB,
«memset(m, 0xAB, len)» will initialize m with NULLs :-).

BTW, note that NULL is a "null pointer constant", not a "null
pointer" :-)

-- Roberto
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Em ter., 10 de nov. de 2020 às 10:43, Roberto Ierusalimschy <[hidden email]> escreveu:
> Ok, Makes sense to not use 0.
> In fact, I never use 0 with pointers, but I've seen a lot of code like this.
> Including IUP Lua.
> Why not memset with (int) NULL, wouldn't work?

The reality according to ISO C is subtle. 0 *is* the null pointer,
when it is a constant properly converted to a pointer:

  6.3.2.3 Pointers
  [...]
  3 An integer constant expression with the value 0, or such an
    expression cast to type void *, is called a null pointer constant.(55)
    If a null pointer constant is converted to a pointer type, the
    resulting pointer, called a null pointer, is guaranteed to compare
    unequal to a pointer to any object or function.
  (55) The macro NULL is defined in <stddef.h> (and other headers) as
       a null pointer constant; see 7.17.

So, the "official" NULL in C is actually 0, NULL is only a macro.

The subtlelty is the cast: A null pointer is *a null pointer constant
converted to a pointer type*. A 0 converted to a float does not have
a zero representation; similarly, a constant 0 converted to a pointer
may not have a zero representation. And only a constant 0 converted
to a pointer is guaranteed to be NULL.

memset "fill[s] memory with a constant byte", not with a constant
pointer. So, the 0 (or NULL) given to it will never be a pointer,
and cannot properly represent the NULL pointer in an arbitrary
arquitecture, only in arquitectures where NULL is represented by
the same byte repeated multiple times. (In an 64-bit arquitecture
where the NULL is represented internally as 0xABABABABABABABAB,
«memset(m, 0xAB, len)» will initialize m with NULLs :-).

BTW, note that NULL is a "null pointer constant", not a "null
pointer" :-)
Ok, I think that it was very clear.
But C programmer shouldn't worry about filling memory with NULLs,
when C99 allow constructs like this:

/* printf example */
#include <stdio.h>

int main()
{
    char * names[32] = { NULL };
   
   printf ("name[30]=%p\n", names[30]);
   return 0;
}
output:
name[30]=(nil)

Ranier Vilela
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

云风 Cloud Wu
Ranier Vilela <[hidden email]> 于2020年11月10日周二 下午9:54写道:
>
> Ok, I think that it was very clear.
> But C programmer shouldn't worry about filling memory with NULLs,

The point is as Andrew Gierth said , "modern compilers will
transform assignment loops into memset calls or inline implementations
of memset if circumstances suggest it will be both allowable and
efficient to do so" .

In my opinion, better C code should express the programmer's intention directly.
We need init the array with NULLs, memset is only a shallow
optimization that the compilers can do it well.

--
http://blog.codingnow.com
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Em ter., 10 de nov. de 2020 às 23:07, 云风 Cloud Wu <[hidden email]> escreveu:
Ranier Vilela <[hidden email]> 于2020年11月10日周二 下午9:54写道:
>
> Ok, I think that it was very clear.
> But C programmer shouldn't worry about filling memory with NULLs,

The point is as Andrew Gierth said , "modern compilers will
transform assignment loops into memset calls or inline implementations
of memset if circumstances suggest it will be both allowable and
efficient to do so" .

In my opinion, better C code should express the programmer's intention directly.
We need init the array with NULLs, memset is only a shallow
optimization that the compilers can do it well.
The cut took out of context.
I said "when".

And yet, in many cases, it is not necessary to initialize the array with NULLs, especially if it is a large array.
Good C codes are always defensive.

I usually code like this:
char * names[32000];
names[0]=NULL;

void func(names, size) {
unsigned int i;
for(i = 0; i < size; ++i) {
    if (names[i] == NULL) { /* end of array? */
        break;
    }
   // etc etc etc
}
return;

With one simple move, will reset the entire array.
Another example:

struct array {
    sizet_t size;
    size_t alloc;
    struct * mystruct[1];
} ARRAY;

ARRAY * mt;

mt->size = 0; /* reset the entire array */
void func(ARRAY * mt) {
unsigned int i;
unsigned size = mt->size; /* array length */
for(i = 0; i < size; ++i) {
   // etc etc etc
}
Again, with one simple move, will reset the entire array.

No need to reset with NULLs, the entire array.

Ranier Vilela
Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.4.2 (lua_newstate)

Ranier Vilela-2
Em qua., 11 de nov. de 2020 às 08:23, Ranier Vilela <[hidden email]> escreveu:
Em ter., 10 de nov. de 2020 às 23:07, 云风 Cloud Wu <[hidden email]> escreveu:
Ranier Vilela <[hidden email]> 于2020年11月10日周二 下午9:54写道:
>
> Ok, I think that it was very clear.
> But C programmer shouldn't worry about filling memory with NULLs,

The point is as Andrew Gierth said , "modern compilers will
transform assignment loops into memset calls or inline implementations
of memset if circumstances suggest it will be both allowable and
efficient to do so" .

In my opinion, better C code should express the programmer's intention directly.
We need init the array with NULLs, memset is only a shallow
optimization that the compilers can do it well.
The cut took out of context.
I said "when".

And yet, in many cases, it is not necessary to initialize the array with NULLs, especially if it is a large array.
Good C codes are always defensive.

I usually code like this:
char * names[32000];
names[0]=NULL;

void func(names, size) {
unsigned int i;
for(i = 0; i < size; ++i) {
    if (names[i] == NULL) { /* end of array? */
        break;
    }
   // etc etc etc
}
return;

With one simple move, will reset the entire array.
Another example:

struct array {
    sizet_t size;
    size_t alloc;
    struct * mystruct[1];
} ARRAY;

ARRAY * mt;

mt->size = 0; /* reset the entire array */
void func(ARRAY * mt) {
unsigned int i;
unsigned size = mt->size; /* array length */
for(i = 0; i < size; ++i) {
   // etc etc etc
}
Again, with one simple move, will reset the entire array.

No need to reset with NULLs, the entire array.
Sorry, replace "unsigned int" with "size_t".

Ranier Vilela