Adding another way to point to "levels" to debug.getinfo and friends


Eduardo Ochs
Hi list,

Two questions:

1) Is there a standard header that I can put in my e-mails that means
   "this is _NOT GOING to be used in production code UNDER ANY
   CIRCUMSTANCES_, this is going to be a personal hack that I will
   only load into a Lua interpreter BY HAND for some VERY CONTROLLED
   tests, etc, etc"?

   (Can you please suppose that I started my e-mail with a header like
   this? I've been considering asking the question below here at the
   list for YEARS, but EVERY SINGLE TIME I predicted the probable
   reaction of the professional programmers in the list and gave
   up...)

   By the way, I am the author of the article "Bootstrapping a Forth
   in 40 lines of Lua code" that appeared in the Lua Gems book. One of
   its last paragraphs is this:

     I've met many people over the years who have been Forth
     enthusiasts in the past, and we often end up discussing what made
     Forth so thrilling to use at that time - and what we can do to
     adapt its ideas to the computers of today. My personal impression
     is that Forth's main points were not the ones that I listed at
     the beginning of this section, and that I said that were easy to
     quantify; rather, what was most important was that nothing was
     hidden, there were no complex data structures around with
     "don't-look-at-this" parts (think on garbage collection in Lua,
     for example, and Lua's tables - beginners need to be convinced to
     see these things abstractly, as the concrete details of the
     implementation are hard), and _everything_ - code, data,
     dictionaries, stacks - were just linear sequences of bytes, that
     could be read and modified directly if we wished to. We had total
     freedom, defining new words was quick, and experiments were quick
     to make; that gave us a sense of power that was totally different
     from, say, the one that a Python user feels today because he has
     huge libraries at his fingertips.

   The technical question that I want to ask is related to using Lua
   as Forths were used in the early 90's - there were LOTS of commands
   that if used wrongly could freeze the system and require a reboot,
   and we were perfectly happy with that.


2) Here is the idea; the question is below.

   The functions debug.getinfo, debug.getlocal and debug.setlocal are
   usually called with an integer argument that the manual refers to
   as "level", that is processed like this (I took the code from
   db_getinfo, in ldblib.c) to set the variable "ar" to an "activation
   record":

     if (lua_isnumber(L, arg+1)) {
       if (!lua_getstack(L1, (int)lua_tointeger(L, arg+1), &ar)) {
         lua_pushnil(L);  /* level out of range */
         return 1;
       }
     }

   I would like to have _variants_ of these functions, to be called
   debug.mygetinfo, debug.mygetlocal and debug.mysetlocal, that would
   accept an alternative to a numerical "level". Running

     ar = debug.mygetstack(2)

   would set ar to a string like

     "activation record: 0x125cf20"

   whose address part points to the "activation record" of a function
   in the call stack, like the pointer that

     lua_getstack(L1, (int)lua_tointeger(L, arg+1), &ar)

   puts into ar, and if we are super-ultra-careful then we can call
   debug.mygetinfo, debug.mygetlocal and debug.mysetlocal in either of
   these ways, the second one being equivalent to the first one:

     debug.mygetinfo (2,  "n")
     debug.mygetinfo (ar, "n")
     debug.mygetlocal(2,  3)
     debug.mygetlocal(ar, 3)
     debug.mysetlocal(2,  3, 42)
     debug.mysetlocal(ar, 3, 42)

   But OF COURSE if we set ar to a bad address, say,

     ar = "activation record: 0x12345678"

   then debug.mygetinfo, debug.mygetlocal and debug.mysetlocal WOULD
   NOT HESITATE to use that address and segfault (HAHAHA! DEAL WITH
   THIS, MODERN PROGRAMMERS!!!)...

   The question is: has anyone implemented something like this, or
   something that would cover a part of this? I haven't written any C
   code in ages... I think I can implement it myself, alone, but that
   would take me one or two full days just for a prototype in which I
   would just change ldblib.c... putting these new functions into a
   ".so" would take more.


Thanks in advance!!!
  Eduardo Ochs =)
  http://angg.twu.net/dednat6.html   <- (for lualatex users)


Re: Adding another way to point to "levels" to debug.getinfo and friends

Sean Conner
It was thus said that the Great Eduardo Ochs once stated:
> Hi list,
>
> Two questions:
>
> 1) Is there a standard header that I can put in my e-mails that means
>    "this is _NOT GOING to be used in production code UNDER ANY
>    CIRCUMSTANCES_, this is going to be a personal hack that I will
>    only load into a Lua interpreter BY HAND for some VERY CONTROLLED
>    tests, etc, etc"?

  There is no such header.  RFC-822 states that no field that is officially
defined will start with "X-" (or "x-" because header names are case
insensitive):

        Any field which is defined in a document published as a formal
        extension to this specification; none will have names beginning with
        the string "X-"

but RFC-2822 (which obsoletes RFC-822) and RFC-5322 (which obsoletes
RFC-2822) have no such language, instead saying:

        Fields may appear in messages that are otherwise unspecified in this
        document.  They MUST conform to the syntax of an optional-field.
        This is a field name, made up of the printable US-ASCII characters
        except SP and colon, followed by a colon, followed by any text that
        conforms to the unstructured syntax.
       
        The field names of any optional field MUST NOT be identical to any
        field name specified elsewhere in this document.
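
  As a rough illustration of the field-name rule quoted above, here is a
minimal C sketch.  The function name is_valid_field_name is made up for
this example and does not come from any mail library.  It checks only the
character set (printable US-ASCII 33 to 126, no colon), not whether the name
clashes with a field defined elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `name` uses only characters allowed in a
 * header field name (printable US-ASCII except SP and colon). */
bool is_valid_field_name(const char *name)
{
  if (name == NULL || name[0] == '\0')
    return false;                    /* a field name needs at least one char */

  for (size_t i = 0; name[i] != '\0'; i++)
  {
    unsigned char c = (unsigned char)name[i];
    if (c < 33 || c > 126 || c == ':')
      return false;                  /* control char, space, non-ASCII or colon */
  }
  return true;
}
```

  Under this check both "X-Spam-Score" and a bare UUID string pass, while
anything containing a space or a colon is rejected.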

In other RFC documents (too many to mention) private or experimental fields
are usually labeled with "X-" (or "x-") so your best bet is to create a
header name starting with "X-" to be safe.  In fact, the email I'm replying
to has the following headers starting with "X-":

        X-Google-DKIM-Signature
        X-Gm-Message-State
        X-Google-Smtp-Source
        X-Received
        X-Pepperfish-Transaction
        X-Spam-Score
        X-Spam-Score-int
        X-Spam-Bar
        X-Scanned-By
        X-Spam-Report
        X-ACL-Warn
        X-Scan-Signature
        X-BeenThere
        X-Mailman-Version

> 2) Here is the idea; the question is below.
>
>    The functions debug.getinfo, debug.getlocal and debug.setlocal are
>    usually called with an integer argument that the manual refers to
>    as "level", that is processed like this (I took the code from
>    db_getinfo, in ldblib.c) to set the variable "ar" to an "activation
>    record":
>
>      if (lua_isnumber(L, arg+1)) {
>        if (!lua_getstack(L1, (int)lua_tointeger(L, arg+1), &ar)) {
>          lua_pushnil(L);  /* level out of range */
>          return 1;
>        }
>      }
>
>    I would like to have _variants_ of these functions, to be called
>    debug.mygetinfo, debug.mygetlocal and debug.mysetlocal, that would
>    accept an alternative to a numerical "level". Running
>
>      ar = debug.mygetstack(2)
>
>    would set ar to a string like
>
>      "activation record: 0x125cf20"
>
>    whose address part points to the "activation record" of a function
>    in the call stack, like the pointer that
>
>      lua_getstack(L1, (int)lua_tointeger(L, arg+1), &ar)
>
>    puts into ar, and if we are super-ultra-careful then we can call
>    debug.mygetinfo, debug.mygetlocal and debug.mysetlocal in either of
>    these ways, the second one being equivalent to the first one:
>
>      debug.mygetinfo (2,  "n")
>      debug.mygetinfo (ar, "n")
>      debug.mygetlocal(2,  3)
>      debug.mygetlocal(ar, 3)
>      debug.mysetlocal(2,  3, 42)
>      debug.mysetlocal(ar, 3, 42)
>
>    But OF COURSE if we set ar to a bad address, say,
>
>      ar = "activation record: 0x12345678"
>
>    then debug.mygetinfo, debug.mygetlocal and debug.mysetlocal WOULD
>    NOT HESITATE to use that address and segfault (HAHAHA! DEAL WITH
>    THIS, MODERN PROGRAMMERS!!!)...

  Does debug.mysetlocal() (to pick one at random) *parse* the ar string?  Or
do you mean to return the actual address of the ar structure?  I don't think
this will work as you think it will work.

        static int mydebug_mygetstack(lua_State *L)
        {
          lua_Debug ar;
         
          lua_getstack(L,luaL_checkinteger(L,1),&ar);
          lua_pushfstring(L,"activation record: %p",(void *)&ar);
          return 1;
        }

  The problem here is how lua_getstack() works---it wants a pointer to a
lua_Debug structure.  The C standard (C99, 6.2.4) says this:

        4 An object whose identifier is declared with no linkage and without
        the storage-class specifier static has automatic storage duration.

        5 For such an object that does not have a variable length array
        type, its lifetime extends from entry into the block with which it
        is associated until execution of that block ends in any way.
        (Entering an enclosed block or calling a function suspends, but does
        not end, execution of the current block.) If the block is entered
        recursively, a new instance of the object is created each time. The
        initial value of the object is indeterminate. If an initialization
        is specified for the object, it is performed each time the
        declaration is reached in the execution of the block; otherwise, the
        value becomes indeterminate each time the declaration is reached.

  "ar" here has automatic storage duration---it only lives as long as we are
in the function mydebug_mygetstack().  Once the function returns, the
address of "ar" is no longer valid [1].  You can get around this by doing:

        static int mydebug_mygetstack(lua_State *L)
        {
          lua_Debug *ar;
         
          lua_settop(L,1);
          ar = lua_newuserdata(L,sizeof(lua_Debug));
          lua_getstack(L,luaL_checkinteger(L,1),ar);
          lua_pushfstring(L,"activation record: %p",(void *)ar);
          return 1;
        }

  That's a *little* better in that you are returning the address of a
heap-allocated block of memory, and as long as the userdata block isn't
garbage collected by Lua, it will be a valid address to use (and once it
has been collected, "C undefined behavior" [2]).  Of course, you could do:

        static int mydebug_mygetstack(lua_State *L)
        {
          lua_Debug *ar;
         
          ar = malloc(sizeof(lua_Debug));
          if (ar != NULL)
          {
            lua_getstack(L,luaL_checkinteger(L,1),ar);
            lua_pushfstring(L,"activation record: %p",(void *)ar);
          }
          else
            lua_pushnil(L);
         
          return 1;
        }
       
  This is better in that it avoids "C undefined behavior" [2], but you could
leak memory this way, unless you manually free it when it's no longer used
(which kind of defeats the purpose of garbage collection).

  Parsing the address from the string is problematic, not only for the case
you stated, but in doing it "safely."  About the best way to do it is to use
strtoull() to parse the number, and if it doesn't return an error [3], check
it against UINTPTR_MAX and if less, cast the result to uintptr_t, then 'void
*' then the type you want.  Yes, it's exceedingly pedantic, but there are
those on this list that would say that even *that* is unsafe to do.  
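
  The parsing steps just described can be sketched roughly as follows.  This
is only an illustration: parse_ar_string and AR_PREFIX are names invented
for the example, and it assumes that "%p" produced a "0x"-prefixed
hexadecimal number, which is common but implementation-defined.  It does
nothing to verify that the address actually points at a live lua_Debug.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define AR_PREFIX "activation record: "   /* assumed string format */

/* Hypothetical helper: returns the address encoded in `s`, or NULL if the
 * string is malformed or the number does not fit in a pointer. */
void *parse_ar_string(const char *s)
{
  size_t              plen = strlen(AR_PREFIX);
  char               *end;
  unsigned long long  v;

  if (strncmp(s, AR_PREFIX, plen) != 0)
    return NULL;                      /* wrong prefix */
  s += plen;
  if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X'))
    return NULL;                      /* insist on the 0x form, no sign/space */

  errno = 0;
  v = strtoull(s, &end, 16);          /* base 16 accepts the leading 0x */
  if (errno == ERANGE)
    return NULL;                      /* overflow: ULLONG_MAX was returned */
  if (end == s || *end != '\0')
    return NULL;                      /* no digits, or trailing junk */
  if (v > UINTPTR_MAX)
    return NULL;                      /* too wide for a pointer */

  return (void *)(uintptr_t)v;        /* integer -> uintptr_t -> void * */
}
```

  A caller would then cast the result to 'lua_Debug *'.  Note that a string
encoding address 0 is indistinguishable from a parse failure here, which
seems acceptable since a null pointer is not usable anyway.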

  Anyway ... my guess is that you want an absolute reference to an
activation record, not a relative one.  Is that the case?

  -spc

[1] No longer valid for use as a lua_Debug object.  On any modern system
        (which is 99 44/100% likely to be an x86 based system) the address
        will exist, but point into the process' stack space, but what the
        contents are will *NOT* likely be what you expect. [2]

[2] This is what is called in C "undefined behavior".  There's another
        current thread on this mailing list about that.

[3] It returns ULLONG_MAX and sets errno to ERANGE.


Re: Adding another way to point to "levels" to debug.getinfo and friends

Sean Conner
It was thus said that the Great Sean Conner once stated:

>
> static int mydebug_mygetstack(lua_State *L)
> {
>  lua_Debug *ar;
>  
>  ar = malloc(sizeof(lua_Debug));
>  if (ar != NULL)
>  {
>    lua_getstack(L,luaL_checkinteger(L,1),ar);
>    lua_pushfstring(L,"activation record: %p",(void *)ar);
>  }
>  else
>    lua_pushnil(L);
>  
>  return 1;
> }

  Oops, this can leak memory when it's not given an integer (it's rare that
I call malloc() in Lua interface code).  This should be:

        static int mydebug_mygetstack(lua_State *L)
        {
          lua_Debug *ar;
          int        level = luaL_checkinteger(L,1);
         
          ar = malloc(sizeof(lua_Debug));
          if (ar != NULL)
          {
            lua_getstack(L,level,ar);  /* ar is already a pointer */
            lua_pushfstring(L,"activation record: %p",(void *)ar);
          }
          else
            lua_pushnil(L);
         
          return 1;
        }

  -spc (That will avoid the memory leak on a bad parameter)
 


Re: Adding another way to point to "levels" to debug.getinfo and friends

Daurnimator
In reply to this post by Sean Conner
On Mon, 13 May 2019 at 09:03, Sean Conner <[hidden email]> wrote:
> In other RFC documents (too many to mention) private or experimental fields
> are usually labeled with "X-" (or "x-") so your best bet is to create a
> header name starting with "X-" to be safe.

Please stop using the X- prefix! See RFC 6648:

This document generalizes from the experience of the email and SIP
communities by doing the following:

   1.  Deprecates the "X-" convention for newly defined parameters in
       application protocols, including new parameters for established
       protocols.  This change applies even where the "X-" convention
       was only implicit, and not explicitly provided, such as was done
       for email in [RFC822].


Re: Adding another way to point to "levels" to debug.getinfo and friends

Sean Conner
It was thus said that the Great Daurnimator once stated:

> On Mon, 13 May 2019 at 09:03, Sean Conner <[hidden email]> wrote:
> > In other RFC documents (too many to mention) private or experimental fields
> > are usually labeled with "X-" (or "x-") so your best bet is to create a
> > header name starting with "X-" to be safe.
>
> Please stop using the X- prefix! See RFC 6648:
>
> This document generalizes from the experience of the email and SIP
> communities by doing the following:
>
>    1.  Deprecates the "X-" convention for newly defined parameters in
>        application protocols, including new parameters for established
>        protocols.  This change applies even where the "X-" convention
>        was only implicit, and not explicitly provided, such as was done
>        for email in [RFC822].

  Interesting.  Quoting a bit more:

        Creators of new parameters to be used in the context of application
        protocols:

        1.  SHOULD assume that all parameters they create might become
            standardized, public, commonly deployed, or usable across
            multiple implementations.

        2.  SHOULD employ meaningful parameter names that they have reason
            to believe are currently unused.

        3.  SHOULD NOT prefix their parameter names with "X-" or similar
            constructs.

        Note: If the relevant parameter name space has conventions about
        associating parameter names with those who create them, a parameter
        name could incorporate the organization's name or primary domain
        name (see Appendix B for examples).

and later on:

        ... In rare cases, truly experimental parameters could be given
        meaningless names such as nonsense words, the output of a hash
        function, or Universally Unique Identifiers (UUIDs) [RFC4122].

  So 6dd39dca-34a1-4d0c-b1eb-30481b7ec7a8 would be a perfectly cromulent
header name to use for private experimentation.

  -spc (Although it looks really weird ... )


Re: Adding another way to point to "levels" to debug.getinfo and friends

Philippe Verdy
In reply to this post by Eduardo Ochs
One way to make sure you will not conflict with any other standard RFC is to name your headers using a UUID (formatted as a syntactically conforming header name).
Generate one randomly (to avoid collisions), search for the generated GUID online if you want to make sure it is unique, and keep it in your records.
In fact any randomly generated 128-bit integer will do; convert it to ASCII using some set of safe digits (not just limited to hexadecimal, and without the extra group separators and surrounding braces commonly seen) and it will not be very long.

RFC 2822 states that "A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon...". This would allow using characters in a set of 93, but as they are case insensitive, you also need to remove 26 dual-cased letters from the set to form the base, leaving a choice of 67 characters.
You can then use a Base-67 conversion.
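
The count of 67 can be checked mechanically. The following C sketch is only an illustration of the arithmetic above; the function name count_field_name_symbols is invented for the example, and it assumes an ASCII execution character set.

```c
/* Hypothetical helper: counts the symbols usable as case-insensitive
 * "digits" in a header field name. Assumes ASCII. */
int count_field_name_symbols(void)
{
  int n = 0;

  for (int c = 33; c <= 126; c++)    /* printable US-ASCII, SP excluded */
  {
    if (c == ':')
      continue;                      /* colon terminates the field name */
    if (c >= 'a' && c <= 'z')
      continue;                      /* fold case: keep only one of each letter */
    n++;
  }
  return n;                          /* 94 - 1 - 26 */
}
```

The 94 codes from 33 to 126, minus the colon, minus the 26 lower-case duplicates, leave the 67 symbols mentioned above.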

And in base 67, a 128-bit random integer (or UUID) needs just ceil(128/log2(67))=22 characters (there are some extra bits not needed in the first or last character of the sequence; you may take that into account to avoid generating non-letters in these positions).

You may prefer using a base-64 conversion because ceil(128/log2(64))=22 and the encoding will not be longer (you can still append some additional characters to make subheaders, or generate another 22-character header each time). A Base-64 conversion will work for you, but not with the two alternate alphabets defined in RFC 4648 (because they are both case-sensitive).
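
The digit counts quoted for the various bases can be reproduced without logarithms. This is a small sketch under the assumption that "enough digits" means the smallest d with base^d >= 2^128; digits_for_128_bits is a name made up for the example.

```c
/* Hypothetical helper: smallest number of base-`base` digits that can
 * represent every 128-bit value, or -1 for a nonsensical base. */
int digits_for_128_bits(unsigned base)
{
  const double limit    = 0x1p128;   /* 2^128, exactly representable */
  double       capacity = 1.0;       /* number of values d digits can hold */
  int          digits   = 0;

  if (base < 2)
    return -1;

  while (capacity < limit)
  {
    capacity *= base;                /* one more digit multiplies the range */
    digits++;
  }
  return digits;
}
```

With this definition bases 67 and 64 both need 22 digits, base 32 needs 26, and base 16 needs the familiar 32 hexadecimal digits of a UUID.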

You can choose your 64-character alphabet so that it avoids "_" (useful as an additional group separator, to allow "visual" control of the name's length and better "readability"), as well as the double quote and backslash (avoiding those lets you embed your header in quoted strings, including constants in programming languages like C or Java, without needing any escaping).

I would suggest formatting the 22 Base-67 digits in groups of 4 or 5 digits, except for the first and last groups, which have only 1 digit.

So you don't need the "X-" followed by a "readable" header name, which is much more likely to collide with other apps (or with evolutions of RFC 2822 in its BCP standard track, or the inclusion of RFC 2822 in a new standard protocol) than a randomly generated header.

And a header name like this one will work:

  "GE16Q$,18'(4<SG@HB.N5S" + ":" + "(some value here)" + CRLF

just like this one with 5 additional "formatting group separators", one after the first digit and then one every 5 digits:

  "G_E16Q$_,18'(_4<SG@_HB.N5_S" + ":" + "(some value here)" + CRLF

which is also equivalent to:

  "G_e16q$_,18'(_4<sg@_hb.n5_s" + ":" + CRLF + SPACE +
  "(some value here)" + CRLF

You may prepend an "X-" to this random header if you still want to make sure it "looks" like a legacy header, but in that case I suggest you use Base-32 from RFC 4648 (i.e. letters A to Z and digits 2 to 7, avoiding the digits 0 and 1, which are easily confused with the letters O and I), without any padding or group separators (from a 128-bit UUID or random number you need 26 digits in base 32, and with the "X-" prefix your header name will have 28 characters).




Re: Adding another way to point to "levels" to debug.getinfo and friends

Jonathan Goble
While this is an interesting discussion, I think that when the OP said
a "header", he meant a brief statement at the top of the email body
serving as a preface to the real content, not an RFC-style "header".



Re: Adding another way to point to "levels" to debug.getinfo and friends

Philippe Verdy
In reply to this post by Philippe Verdy
Also, if you use Base32 from RFC 4648 (letters A..Z and digits 2..7) to encode a 128-bit UUID, the 26 encoded characters actually carry 130 bits. That leaves 2 spare bits, which you can use to exclude some characters from the first and last of the 26 positions.
If you want to avoid an X or a digit in the first position (because a leading X is used by many legacy apps), it's simple: set the first spare bit to 0 and put it in the highest-order position of the first group of 5 bits (so the first encoded character will necessarily be a letter in A..P).
Likewise, you can keep the last character from being a digit in 2..7 by inserting the second spare bit, also set to 0, in the highest-order position of the last group of 5 bits.
Now the 26 characters can be written in groups of 5 characters separated by hyphens (-).

You get header names in the range from "Aaaaa-aaaaa-aaaaa-aaaaa-aaaaa-a:" to "P7777-77777-77777-77777-77777-p:" (you can use any capitalization you want for the letters). They are still relatively easy to read, write and possibly memorize (thanks to the grouping and the choice of unambiguous letters and digits), unique (randomly generated), and not much longer than some legacy "X-" header names like "X-Pepperfish-Transaction:". None of these random header names will collide with legacy "X-" ones (different form) or with a future standards-track RFC that obsoletes RFC 2822 (such RFCs use human-significant keywords), and they still look well-formed to very old mail processors.
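
A minimal sketch of the bit layout described above, in C. The function names (get_bits, uuid_to_field_name) are hypothetical, and the code assumes the 128-bit value is given as 16 bytes, most significant byte first; it is one possible reading of the scheme, not a reference implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* RFC 4648 Base32 alphabet: values 0..25 are A..Z, values 26..31 are 2..7. */
static const char BASE32[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

/* Read n bits (n <= 5) starting at bit offset pos, most significant bit first. */
static unsigned get_bits(const uint8_t id[16], int pos, int n)
{
    unsigned v = 0;
    assert(pos >= 0 && n >= 0 && n <= 5 && pos + n <= 128);
    for (int i = 0; i < n; i++) {
        int bit = pos + i;
        v = (v << 1) | ((id[bit / 8] >> (7 - bit % 8)) & 1u);
    }
    return v;
}

/* Hypothetical helper: writes 26 symbols, 5 hyphens and a NUL (32 bytes). */
static void uuid_to_field_name(const uint8_t id[16], char out[32])
{
    int o = 0;
    assert(strlen(BASE32) == 32);
    for (int i = 0; i < 26; i++) {
        unsigned v;
        if (i == 0)
            v = get_bits(id, 0, 4);               /* spare bit 0 + first 4 bits */
        else if (i == 25)
            v = get_bits(id, 124, 4);             /* spare bit 0 + last 4 bits  */
        else
            v = get_bits(id, 4 + 5 * (i - 1), 5); /* 24 full 5-bit groups       */
        if (i > 0 && i % 5 == 0)
            out[o++] = '-';                       /* hyphen after every 5 symbols */
        out[o++] = BASE32[v];
    }
    out[o] = '\0';
}
```

With this layout an all-zero input gives "AAAAA-AAAAA-AAAAA-AAAAA-AAAAA-A" and an all-ones input gives "P7777-77777-77777-77777-77777-P", so the first and last symbols always stay within A..P.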




On Mon, 13 May 2019 at 05:51, Philippe Verdy <[hidden email]> wrote:
One way to make sure you will not conflict with any other standard RFC is to name your headers using a UUID (formatted as a syntactically conforming header name).
Generate one randomly (to avoid collisions), search for the generated UUID online if you want to make sure it is unique, and keep it in your records.
In fact any randomly generated 128-bit integer will do: convert it to ASCII using some set of safe digits (not just hexadecimal ones, and without the extra group separators and surrounding braces commonly seen), and it will not be very long.

RFC 2822 states that "A field name MUST be composed of printable US-ASCII characters (i.e., characters that have values between 33 and 126, inclusive), except colon...". This would allow a set of 93 characters, but since field names are case-insensitive, you also need to remove the 26 dual-cased letters from the set to form the base, leaving a choice of 67 characters.
You can then use a Base-67 conversion.

And in base 67, a 128-bit random integer (or UUID) needs just ceil(128/log2(67))=22 characters (a few bits of capacity are left unused in the first or last character of the sequence; you can take advantage of that to avoid generating non-letters in these positions).

You may prefer a base-64 conversion, because ceil(128/log2(64))=22 and the encoding will not be any longer (you can still append some additional characters to make subheaders, or generate a new 22-character header each time). A Base-64 conversion will work for you, but not with either of the two alphabets defined in RFC 4648 (because they are both case-sensitive).
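
The digit counts quoted above can be checked, and the conversion itself sketched, with plain integer arithmetic in C. This is only an illustration under stated assumptions: the helper names are hypothetical, the 128-bit value is taken as 16 bytes with the most significant byte first, and the 67-symbol alphabet is simply every printable ASCII character from 33 to 126 except the colon and the lowercase letters (so it still contains "_", the double quote and the backslash, and it makes no attempt to force a letter into the first or last position).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed alphabet: printable ASCII 33..126, minus ':' and minus a..z. */
static int build_alphabet(char out[68])
{
    int n = 0;
    for (int c = 33; c <= 126; c++) {
        if (c == ':' || (c >= 'a' && c <= 'z'))
            continue;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;  /* 94 - 1 - 26 = 67 symbols */
}

/* Divide the big-endian 128-bit number in place by base; return the remainder. */
static unsigned divmod_small(uint8_t num[16], unsigned base)
{
    unsigned rem = 0;
    for (int i = 0; i < 16; i++) {
        unsigned cur = rem * 256u + num[i];
        num[i] = (uint8_t)(cur / base);
        rem = cur % base;
    }
    return rem;
}

static int is_zero(const uint8_t num[16])
{
    for (int i = 0; i < 16; i++)
        if (num[i] != 0)
            return 0;
    return 1;
}

/* Number of digits needed to write value in the given base (at least 1). */
static int digits_needed(const uint8_t value[16], unsigned base)
{
    uint8_t work[16];
    int n = 0;
    memcpy(work, value, 16);
    do {
        divmod_small(work, base);
        n++;
    } while (!is_zero(work));
    return n;
}

/* Fixed-width conversion: always 22 symbols, most significant digit first. */
static void to_base67(const uint8_t value[16], char out[23])
{
    char alphabet[68];
    uint8_t work[16];
    int n = build_alphabet(alphabet);
    assert(n == 67);
    memcpy(work, value, 16);
    for (int i = 21; i >= 0; i--)
        out[i] = alphabet[divmod_small(work, 67)];
    out[22] = '\0';
    assert(is_zero(work));  /* 67^22 > 2^128, so nothing is left over */
}
```

Calling digits_needed on the largest 128-bit value (all bytes 0xFF) gives 22 for base 67 and base 64, and 26 for base 32, in line with the ceil() figures above.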

You can choose your 64-character alphabet so that it avoids "_" (useful as an additional group separator, to allow "visual" control of the length and better "readability"), as well as the double quote and backslash (so that you can embed your header in quoted strings, including string constants in programming languages like C or Java, without needing any escaping).

I would suggest formatting the 22 digits in Base67 in groups of 4 or 5 digits, except for the first and last groups, which have only 1 digit.

So you don't need the "X-" followed by a "readable" header name, which is much more likely to collide with other apps (or with evolutions of RFC 2822 in its BCP standard track, or with the inclusion of RFC 2822 in a new standard protocol) than a randomly generated header.

And a header name like this one will work:

  "GE16Q$,18'(4<SG@HB.N5S" + ":" + "(some value here)" + CRLF

just like this one, with 5 additional "formatting group separators" inserted after the first digit and then every 5 digits:

  "G_E16Q$_,18'(_4<SG@_HB.N5_S" + ":" + "(some value here)" + CRLF

which is also equivalent to:

  "G_e16q$_,18'(_4<sg@_hb.n5_s" + ":" + CRLF + SPACE +
  "(some value here)" + CRLF

You may prepend an "X-" to this random header if you still want to make sure it "looks" like a legacy header, but in that case I suggest you use Base-32 from RFC 4648 (i.e. letters A to Z and digits 2 to 7, avoiding 0 and 1, which are easily confused with the letters O and I), without any padding or group separators (from a 128-bit UUID or random number, you need 26 digits in base 32, and with the "X-" prefix, your header name will have 28 characters).



Re: Adding another way to point to "levels" to debug.getinfo and friends

Philippe Verdy
In reply to this post by Daurnimator
Also, if you use Base32 from RFC 4648 (letters A..Z and digits 2..7) to encode a 128-bit UUID, the 26 encoded characters actually carry 130 bits. That leaves 2 spare bits, which you can use to exclude some characters from the first and last of the 26 positions.
If you want to avoid an X or a digit in the first position (because a leading X is used by many legacy apps), it's simple: set the first spare bit to 0 and put it in the highest-order position of the first group of 5 bits (so the first encoded character will necessarily be a letter in A..P).
Likewise, you can keep the last character from being a digit in 2..7 by inserting the second spare bit, also set to 0, in the highest-order position of the last group of 5 bits.
Now the 26 characters can be written in groups of 5 characters separated by hyphens (-).

You get header names in the range from "Aaaaa-aaaaa-aaaaa-aaaaa-aaaaa-a:" to "P7777-77777-77777-77777-77777-p:" (you can use any capitalization you want for the letters). They are still relatively easy to read, write and possibly memorize (thanks to the grouping and the choice of unambiguous letters and digits), unique (randomly generated), and not much longer than some legacy "X-" header names like "X-Pepperfish-Transaction:". None of these random header names will collide with legacy "X-" ones (different form) or with a future standards-track RFC that obsoletes RFC 2822 (such RFCs use human-significant keywords), and they still look well-formed to very old mail processors.


On Mon, 13 May 2019 at 03:55, Daurnimator <[hidden email]> wrote:
On Mon, 13 May 2019 at 09:03, Sean Conner <[hidden email]> wrote:
> In other RFC documents (too many to mention) private or experimental fields
> are usually labeled with "X-" (or "x-") so your best bet is to create a
> header name starting with "X-" to be safe.

Please stop using the X- prefix! See RFC 6648:

This document generalizes from the experience of the email and SIP
communities by doing the following:

   1.  Deprecates the "X-" convention for newly defined parameters in
       application protocols, including new parameters for established
       protocols.  This change applies even where the "X-" convention
       was only implicit, and not explicitly provided, such as was done
       for email in [RFC822].


Re: Adding another way to point to "levels" to debug.getinfo and friends

Philippe Verdy
In reply to this post by Jonathan Goble


On Mon, 13 May 2019 at 06:50, Jonathan Goble <[hidden email]> wrote:
While this is an interesting discussion, I think that when the OP said
a "header", he meant a brief statement at the top of the email body
serving as a preface to the real content, not an RFC-style "header".

Sorry for my last double reply; there was a glitch in Gmail, which said that the first message had not been sent due to a server error or temporary caching desynchronization (and then the interface blocked while attempting to synchronize). But apparently it was still sent correctly, as I saw after forcing a complete reload of the web interface...

Re: Adding another way to point to "levels" to debug.getinfo and friends

Sean Conner
In reply to this post by Eduardo Ochs
It was thus said that the Great Eduardo Ochs once stated:
> 2) Here is the idea; the question is below.
>
>    The functions debug.getinfo, debug.getlocal and debug.setlocal are
>    usually called with an integer argument that the manual refers to
>    as "level", that is processed like this (I took the code from
>    db_getinfo, in ldblib.c) to set the variable "ar" to an "activation
>    record":

  To get this thread back on track (sorry about that), I've given this some
thought, and I think this is the best approach to what you want.  Somewhere,
you want to add:

        #define TYPE_LUA_DEBUG "lua_Debug"

        luaL_newmetatable(L,TYPE_LUA_DEBUG);
        /* add metamethods as you see fit */

Then:

        static int mygetstack(lua_State *L)
        {
          lua_Debug *par;
          int        level = (int)luaL_checkinteger(L,1);

          /* the userdata holds the lua_Debug itself, so Lua's GC owns it */
          par = lua_newuserdata(L,sizeof(lua_Debug));
          if (!lua_getstack(L,level,par))
            return 0;  /* level out of range: return nothing */
          luaL_getmetatable(L,TYPE_LUA_DEBUG);
          lua_setmetatable(L,-2);
          return 1;
        }
       
  This returns a userdata of type lua_Debug.  Lua will track the memory for
you, so there's no leaking.  Then, in the appropriate locations where you
take a level number, or a function, you can instead check for a userdata of
type TYPE_LUA_DEBUG, and then pass that to the appropriate Lua API call:

        lua_Debug *par = luaL_checkudata(L,idx,TYPE_LUA_DEBUG);
       
  If you want a pretty representation to print out, add a __tostring
metamethod to the metatable---something like:

        static int myluadebug___tostring(lua_State *L)
        {
          lua_pushfstring(L,"activation record: %p",lua_touserdata(L,1)); /* [1] */
          return 1;
        }

That's really all you need I would think.

  -spc
 
[1] Why not luaL_checkudata()?  The only way this can fail is
        intentional user intervention (as the only way to this function is
        via the metatable of the lua_Debug userdata we're using), but if
        that is a concern, then yes, use luaL_checkudata() here.


Re: Adding another way to point to "levels" to debug.getinfo and friends

Eduardo Ochs
Hi!

Sorry for the delay... anyway: thanks A LOT!
The rest is easy. I'll send a message to the list when I have the code running -
but don't hold your breath. =)
  Thanks again =) =) =),
    Eduardo Ochs

