Feature request: hiding upvalues

Dirk Laurie-2
If at the top of my Lua program, I put:

local math = require"mathx"

then I have burnt my bridges. I can never again access the built-in
'math' as a global variable. I can still get it as '_G.math', but no
matter how deeply do blocks and functions are nested, 'math' will
either be this upvalue or a newer local variable shadowing it.
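
A minimal sketch of this shadowing. The table below is a hypothetical stand-in for require"mathx", so that the snippet runs without that library installed; the function name circumference is also made up.

```lua
-- A local named 'math' shadows the global for the rest of the chunk.
-- This table is a hypothetical stand-in for require"mathx".
local math = {pi = 3}

local function circumference(r)
  -- 'math' here is the upvalue declared above, not the built-in library
  return 2 * math.pi * r
end

print(circumference(1))   -- computed with the stand-in table
print(_G.math.pi)         -- the built-in library is still reachable through _G
```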

Not too bad, in the case of 'math': actually Luiz's 'mathx' is in many
ways nicer if your C compiler can compile it.

But what about other names? (Acknowledgment: the following observation
was pointed out on this list some time ago by Egor Skriptunoff.) In
the object-oriented programming idiom of putting _ENV as first
parameter of a function, this persistent property of an upvalue is a
potential source of hard-to-find bugs.

Example:

> rotate = function (_ENV,c,s) return {x=c*x-s*y,y=s*x+c*y} end
> z = rotate({x=0.8,y=0.6},0.8,0.6); print(z.x,z.y)
0.28    0.96

But:

> local x=1; rotate = function (_ENV,c,s) return {x=c*x-s*y,y=s*x+c*y} end
> z = rotate({x=0.8,y=0.6},0.8,0.6); print(z.x,z.y)
0.44    1.08

The bug is obvious, here in the interpreter. You can see the offending
upvalue x.

But in a program, if 200 lines earlier, in an innocent-looking bit of
code that should have been enclosed in do ... end, we had

local x = 1
for _,v in ipairs(t) do x = x*v end
print("The product is: ",x)

that x would still have remained visible, and the same bug would be
present. Not so easy to find anymore.
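
A sketch of the do ... end scoping mentioned above, assuming Lua 5.2 or later (needed for _ENV). The table t and its contents are made up for illustration.

```lua
local t = {2, 3, 4}   -- hypothetical data

do
  -- x is confined to this block and is invisible after 'end'
  local x = 1
  for _, v in ipairs(t) do x = x * v end
  print("The product is: ", x)
end

-- No local x is in scope here, so x and y compile to _ENV.x and _ENV.y.
local rotate = function (_ENV, c, s)
  return {x = c*x - s*y, y = s*x + c*y}
end

local z = rotate({x = 0.8, y = 0.6}, 0.8, 0.6)
print(z.x, z.y)
```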

There is a workaround. There always is. Wrap the body in 'load()'.

> local x=1; rotate = load"local _ENV,c,s = ...; return {x=c*x-s*y,y=s*x+c*y}"
> z = rotate({x=0.8,y=0.6},0.8,0.6); print(z.x,z.y)
0.28    0.96

It's not even many more keystrokes. But it does not do the same thing.
You'll need extra code to check that 'load' succeeded; runtime error
messages from inside are harder to understand.
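
A sketch of that extra checking, assuming Lua 5.2 or later. The chunk name "=rotate" is an arbitrary choice made here so that error messages from inside the loaded code are easier to place.

```lua
local x = 1   -- stray local that an ordinary closure would capture

-- The loaded chunk is compiled on its own, so it cannot see the local x above.
-- Its 'local _ENV' makes the free names x and y resolve in the first argument.
local rotate, err = load(
  "local _ENV, c, s = ...; return {x = c*x - s*y, y = s*x + c*y}",
  "=rotate")   -- chunk name used in error messages (arbitrary)
if not rotate then
  error("could not compile rotate: " .. tostring(err))
end

local z = rotate({x = 0.8, y = 0.6}, 0.8, 0.6)
print(z.x, z.y)
```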

I'm not too sure how one could implement hiding of upvalues at the
language level. (At the implementation level, it's obvious. Just skip
the phase that looks for them.) Maybe a keyword that makes upvalues
invisible to the end of the current local scope.

    rotate = function (_ENV,c,s) blind
      return {x=c*x-s*y,y=s*x+c*y}
    end

Re: Feature request: hiding upvalues

Philippe Verdy
A combination of setfenv() and an __index entry in a metatable could do the trick (__index is the metamethod used to look up variables that are not found as keys in a Lua table, and the environment of a function is a standard Lua table).

Re: Feature request: hiding upvalues

Philippe Verdy
In reply to this post by Dirk Laurie-2
> I'm not too sure how one could implement hiding of upvalues at the
> language level. (At the implementation level, it's obvious. Just skip
> the phase that looks for them.)

This is not so obvious, because Lua depends heavily on this. The "phase" that looks for them is exactly the one that looks up variables in the environment through its "__index" meta-entry, which is where the environment is already stated. The first level of lookup is still required (otherwise the function would not even have access to its own local variables), but you want to stop the lookup from recursing to the next level to look for upvalues.
Note that this recursion is a tail recursion, so Lua optimizes it natively as a loop. The "phase" you want to hide would be a statement within that loop, and you want it to run only on a specific iteration, breaking the loop by returning "nil" early so that an "undefined variable" error can be raised. The difficulty is that no iteration number is accessible. So all I see that you can do is set the "__index" meta-entry specifically to your needs.

Re: Feature request: hiding upvalues

Dirk Laurie-2
On Tue, 13 Nov 2018 at 14:04, Philippe Verdy <[hidden email]> wrote:
>>
>> I'm not too sure how one could implement hiding of upvalues at the
>> language level. (At the implementation level, it's obvious. Just skip
>> the phase that looks for them.)
>
> This is not so obvious, because Lua depends heavily on this. The "phase" that looks for them is exactly the one that looks up variables in the environment through its "__index" meta-entry, which is where the environment is already stated. The first level of lookup is still required (otherwise the function would not even have access to its own local variables), but you want to stop the lookup from recursing to the next level to look for upvalues.
> Note that this recursion is a tail recursion, so Lua optimizes it natively as a loop. The "phase" you want to hide would be a statement within that loop, and you want it to run only on a specific iteration, breaking the loop by returning "nil" early so that an "undefined variable" error can be raised. The difficulty is that no iteration number is accessible. So all I see that you can do is set the "__index" meta-entry specifically to your needs.

I think we are talking at cross-purposes.

Whether a name is recognized as an upvalue happens at compile time. No
metatable is involved. It's a question of what is in scope.

Do 'luac -l' for my two examples. The one without "x=1" generates the
instruction   GETTABLE     4 0 -1    ; "x"
but the one with "x=1" generates   GETUPVAL     4 0    ; x
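
The same compile-time difference can be observed from inside Lua, without luac. A small sketch, assuming Lua 5.2 or later and the standard debug library; the helper name count_upvalues is made up for this example.

```lua
-- Number of upvalues the compiler attached to a function.
local function count_upvalues(f)
  return debug.getinfo(f, "u").nups
end

-- Compiled while no local x is in scope: x and y are fields of _ENV.
local plain = function (_ENV, c, s)
  return {x = c*x - s*y, y = s*x + c*y}
end

local x = 1

-- Same source text, but now x is in scope and is captured as an upvalue.
local captured = function (_ENV, c, s)
  return {x = c*x - s*y, y = s*x + c*y}
end

print(count_upvalues(plain), count_upvalues(captured))   --> 0    1
```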

The scope of a name is lexical. That means there is a sequence of
local scopes with the entire chunk outermost, each containing a
smaller scope until we get to the innermost scope. The compiler does
this when one refers to 'x':

1. Is there a local variable named 'x' in the innermost scope? If so,
it does not need to be loaded: the VM instruction can access it
directly.
2. For each containing scope working outwards, the question
is asked again. If a local variable named 'x' is found in that scope,
a GETUPVAL instruction is generated to load the variable via the
upvalue list that sits in the function's closure.
3. If no containing scope has 'x', a GETTABLE instruction is issued to
load the value as a table access from _ENV.

The requested "blind" keyword would merely tell the compiler to treat
the current innermost scope, from that point onwards, as not having a
containing scope, so that step 2 is an empty loop.
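
The three outcomes of that lookup can be sketched in plain Lua. This assumes Lua 5.2 or later (where globals go through an _ENV upvalue) and debug information that has not been stripped; the names outer, label, show and upvalue_names are made up for illustration.

```lua
local outer = "from enclosing scope"    -- found in step 2: an upvalue of show()
label = "from _ENV"                     -- a global, i.e. a field of _ENV

local function show()
  local inner = "from innermost scope"  -- found in step 1: a plain local
  return inner, outer, label            -- 'label' falls through to step 3
end

-- Collect the names of the upvalues the compiler gave to a function.
local function upvalue_names(f)
  local names = {}
  local i = 1
  while true do
    local n = debug.getupvalue(f, i)
    if not n then break end
    names[n] = true
    i = i + 1
  end
  return names
end

local names = upvalue_names(show)
-- outer and _ENV are upvalues of show(); inner is not
print(names.outer, names._ENV, names.inner)
```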

You may have been thinking of what happens in case 3: the GETTABLE from
_ENV could trigger a whole chain of __index metamethods, depending on
what you have done with _ENV (in fact, since this idiom is used in an
object-oriented paradigm, your _ENV is an object which may well have a
complicated metatable).
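
A sketch of case 3 with an object whose field is found through __index, assuming Lua 5.2 or later. The tables defaults and point are hypothetical.

```lua
local defaults = {y = 0.6}   -- hypothetical "class" table holding a default
local point = setmetatable({x = 0.8}, {__index = defaults})

local rotate = function (_ENV, c, s)
  -- x is read directly from point, y is found through the __index metamethod
  return {x = c*x - s*y, y = s*x + c*y}
end

local z = rotate(point, 0.8, 0.6)
print(z.x, z.y)
print(rawget(point, "y"))    -- nil: y is not stored in point itself
```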

Re: Feature request: hiding upvalues

Duane Leslie
In reply to this post by Dirk Laurie-2

On 13 Nov 2018, at 19:16, Dirk Laurie <[hidden email]> wrote:

> I'm not too sure how one could implement hiding of upvalues at the
> language level. (At the implementation level, it's obvious. Just skip
> the phase that looks for them.) Maybe a keyword that makes upvalues
> invisible to the end of the current local scope.
>
>    rotate = function (_ENV,c,s) blind
>      return {x=c*x-s*y,y=s*x+c*y}
>    end

This does need a compiler change to fix because of the way locals and upvalues get hard coded at compile time.  My "declared upvalue" patch from 2-ish years ago fixes this as well as the recent request regarding upvalue ordering (I mentioned it in that thread).  I raise it again here because your proposal of adding a 'blind' keyword is the same as the empty upvalue list `<>` in my syntax.


Regards,

Duane.

Re: Feature request: hiding upvalues

Philippe Verdy
In reply to this post by Dirk Laurie-2
I don't think so. Within the same block of statements, all variables are automatically bound to the same environment (i.e. a table), and the compiler does not need to know whether a variable is local or external: all of them are local and accessed through the "__index" meta-entry of the environment table, which is always used as a first level of indirection before performing an actual lookup in the environment table itself (not its metatable).
Unlike ordinary tables in Lua, every environment must have a metatable associated with its table, so there is always an "__index" entry in it (and also a "__newindex" entry for assignments). A compiler may want to optimize by not creating a metatable with "__index" and "__newindex", but it cannot safely know whether these two entries are set (they may be set by the block of instructions using the setfenv function, possibly by calling external functions that execute with their own environment linked to the parent environment, and so can also modify the parent environment).
So all names are local. Assigning or reading a variable has an external effect only because the default "__index" function looks up parent environments in a chain to see whether there is a matching name: if no such name is found in the chain, reading the variable returns "nil". The same holds for "__newindex", which first looks in the local table, then, if the name is not found, in the parent environment, and if it is still not found, creates a new variable in the initial environment.
All you want is to stop the recursive lookup of variable names in the chain of environments, so that all variables behave as pure local variables (creating as many new variables as needed).
It's not really possible to block the recursion: your code needs the chain even for basic operations (including operators like "+"). If you break the lookup, then your local code simply cannot do anything at all!
Remember that the environment does not include only local variables; it also includes all functions and operators your code can use.
So your proposed "blind" keyword in:
  function (_ENV,c,s) blind
      return {x=c*x-s*y,y=s*x+c*y}
    end
would have the effect of leaving only three names accessible: _ENV, c and s. But operations like "=" (assignment, made via a "__newindex" function call), "*", "-" and "+" would also have no defined function: their lookup would return nil, and you'd then get errors (cannot call a function referenced by nil)!

The only way to do that is to pass in the selected properties your function needs in order to run, by creating a restrictive environment in which the function:
    function (c,s)
      return {x=c*x-s*y,y=s*x+c*y}
    end
can now run in perfect isolation. The variable names "x" and "y" are indeed not defined locally, and you have to force them to use the local environment rather than any parent environment, but you still need the function references for the three arithmetic operators. Note that for function calls (including operator evaluations) there is also a "__call" entry in the environment to find matching function names: functions are not called directly.

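Interesting reading:

http://lua-users.org/wiki/DetectingUndefinedVariables

or more generally

http://lua-users.org/wiki/LuaScoping

and the manual of course (which details all the "__"-prefixed functions needed in a valid environment and that allow your code to be really executable):

http://www.lua.org/manual/5.2/manual.html#2.4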



Re: Feature request: hiding upvalues

Philippe Verdy
Also note that if Lua were defined so that all variables are local by default, your function would need to be written something like:

  function (c,s)
      using __mul, __add, __sub
      return {x=c*x-s*y,y=s*x+c*y}
  end

i.e. you would need to explicitly import locally the external operators you need inside the code. You don't need to specify the variables c and s, which are already explicitly imported into the local environment (and bound to a value from the caller's environment).

Writing Lua code would be a severe nightmare: we would need explicit imports (with a "using" declaration, behaving like "local" except that it initializes the local variable from the outer environment, using its "__index" method if needed) in every function we write, except if the function does nothing or just returns one of its parameters or a variable created with "local". Such a function could not even perform any test on the value of its parameters without "__eq", "__lt" or "__le" being explicitly imported by a "using" clause, and could not even call a function object passed as a parameter without importing the "__call" method with a "using" clause.

Lua is not C or C++: everything is an object, including functions themselves and constants (like integers and strings). Unlike JavaScript, Lua also makes all operators true objects (bound to functions).

The only thing you have in Lua to control and force the locality of named variables (and of the implicit variables for operators, which are syntactic sugar for function calls) is the "local" keyword. Everything else allows "inheritance" through the chain of environments.



Re: Feature request: hiding upvalues

Gabriel Bertilson
In reply to this post by Philippe Verdy
It's not true that there is a chain of environment tables and that
local variables are looked up in a table by name (if I'm reading you
right), at least not in Lua 5.3. (I think this is also true of Lua
5.1, though some of the opcodes are different.)

Global and local variables are implemented differently. Global
variables are treated as fields in the _ENV table, while local
variables are assigned to registers by the compiler. So setting and
getting a global variable uses different opcodes than setting and
getting a local variable:

--------
$ luac -l -l -
x = 10
print(x)

main <stdin:0,0> (5 instructions at 0x55c499b0bde0)
0+ params, 2 slots, 1 upvalue, 0 locals, 3 constants, 0 functions
    1    [1]    SETTABUP     0 -1 -2    ; _ENV "x" 10
    2    [2]    GETTABUP     0 0 -3    ; _ENV "print"
    3    [2]    GETTABUP     1 0 -1    ; _ENV "x"
    4    [2]    CALL         0 2 1
    5    [2]    RETURN       0 1
constants (3) for 0x55c499b0bde0:
    1    "x"
    2    10
    3    "print"
locals (0) for 0x55c499b0bde0:
upvalues (1) for 0x55c499b0bde0:
    0    _ENV    1    0
--------
$ luac -l -l -
local x = 10
print(x)

main <stdin:0,0> (5 instructions at 0x55a6a4b9bde0)
0+ params, 3 slots, 1 upvalue, 1 local, 2 constants, 0 functions
    1    [1]    LOADK        0 -1    ; 10
    2    [2]    GETTABUP     1 0 -2    ; _ENV "print"
    3    [2]    MOVE         2 0
    4    [2]    CALL         1 2 1
    5    [2]    RETURN       0 1
constants (2) for 0x55a6a4b9bde0:
    1    10
    2    "print"
locals (1) for 0x55a6a4b9bde0:
    0    x    2    6
upvalues (1) for 0x55a6a4b9bde0:
    0    _ENV    1    0
--------

Here the global variable x is set with SETTABUP and gotten with
GETTABUP, using the constant "x", but the local variable x is set with
LOADK and gotten with MOVE, using the register index 0. The name of a
local variable isn't used by the bytecode instructions, though it is
stored elsewhere in the bytecode (so error messages can mention local
variables by name).

And upvalues have a different implementation from both locals and
globals; they are set with SETUPVAL and gotten with GETUPVAL.
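
A small sketch of upvalues as per-closure storage rather than table fields. The function name make_counter is made up for illustration.

```lua
local function make_counter()
  local n = 0          -- a local of make_counter
  return function ()
    n = n + 1          -- inside the closure, n is an upvalue: read, then written
    return n
  end
end

-- Each call to make_counter creates a fresh n, so the counters are independent.
local a, b = make_counter(), make_counter()
a(); a()
print(a(), b())   --> 3    1
```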

So hash tables are not involved in the implementation of local
variables or upvalues at all.

There is also no chain of environment tables by default in Lua 5.3.
The metatable of _ENV is nil if it hasn't been modified.

$ lua -e 'print(getmetatable(_ENV))'
nil

But I guess you can make a chain of environment tables by doing
`local _ENV = setmetatable({}, { __index = _ENV })`.
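
A sketch of such a hand-made chain, assuming Lua 5.2 or later. The names base, child and chain_demo are made up for this example.

```lua
local base = _ENV
local child = setmetatable({}, {__index = base})

do
  local _ENV = child     -- free names in this block now resolve in child
  chain_demo = 42        -- stored in child, not in base
  print(chain_demo)      -- 'print' itself is found in base through __index
end

-- Outside the block the original environment is back in effect.
print(rawget(child, "chain_demo"), rawget(base, "chain_demo"))   --> 42    nil
```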

Because locals, globals, and upvalues are implemented differently, the
compiler must determine whether a variable is local, global or an
upvalue when a chunk is compiled and the nature of a variable in a
chunk cannot be changed after that point. So the function expression
`function () return x end` either returns an upvalue or a global, and
the assignment `x = 10` is resolved to an assignment to a local, an
upvalue, or a global depending on context.
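
A sketch of how the same source text resolves differently depending on where it is compiled. The names get_global and get_upvalue are made up for illustration.

```lua
x = "global"                                  -- assignment to the global x
local get_global = function () return x end   -- no local x in scope: global

local x = "local"
local get_upvalue = function () return x end  -- local x in scope: upvalue

x = "changed local"   -- resolved to the local x, the global is untouched

print(get_global(), get_upvalue())   --> global    changed local
```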

— Gabriel

On Tue, Nov 13, 2018 at 3:25 PM Philippe Verdy <[hidden email]> wrote:

>
> I don't think so. Within the same block of statements, all variables are automatically bound to the same environment (i.e. a table), and the compiler does not need to know whether a variable is local or external: all of them are local and accessed through the "__index" meta-entry of the environment table, which is always used as a first level of indirection before performing an actual lookup in the environment table itself (not its metatable).
> Unlike ordinary tables in Lua, every environment must have a metatable associated with its table, so there is always an "__index" entry in it (and also a "__newindex" entry for assignments). A compiler may want to optimize by not creating a metatable with "__index" and "__newindex", but it cannot safely know whether these two entries are set (they may be set by the block of instructions using the setfenv function, possibly by calling external functions that execute with their own environment linked to the parent environment, and so can also modify the parent environment).
> So all names are local. Assigning or reading a variable has an external effect only because the default "__index" function looks up parent environments in a chain to see whether there is a matching name: if no such name is found in the chain, reading the variable returns "nil". The same holds for "__newindex", which first looks in the local table, then, if the name is not found, in the parent environment, and if it is still not found, creates a new variable in the initial environment.
> All you want is to stop the recursive lookup of variable names in the chain of environments, so that all variables behave as pure local variables (creating as many new variables as needed).
> It's not really possible to block the recursion: your code needs the chain even for basic operations (including operators like "+"). If you break the lookup, then your local code simply cannot do anything at all!
> Remember that the environment does not include only local variables; it also includes all functions and operators your code can use.
> So your proposed "blind" keyword in:
>   function (_ENV,c,s) blind
>       return {x=c*x-s*y,y=s*x+c*y}
>     end
> would have the effect of leaving only three names accessible: _ENV, c and s. But operations like "=" (assignment, made via a "__newindex" function call), "*", "-" and "+" would also have no defined function: their lookup would return nil, and you'd then get errors (cannot call a function referenced by nil)!
>
> The only way to do that is to pass in the selected properties your function needs in order to run, by creating a restrictive environment in which the function:
>     function (c,s)
>       return {x=c*x-s*y,y=s*x+c*y}
>     end
> can now run in perfect isolation. The variable names "x" and "y" are indeed not defined locally, and you have to force them to use the local environment rather than any parent environment, but you still need the function references for the three arithmetic operators. Note that for function calls (including operator evaluations) there is also a "__call" entry in the environment to find matching function names: functions are not called directly.
>
> Interesting reading:
>
> http://lua-users.org/wiki/DetectingUndefinedVariables
> or more generally
> http://lua-users.org/wiki/LuaScoping
>
> and the manual of course (which details all "__" prefixed functions needed in valid environment and that allow your code to be really executable) :
>
> http://www.lua.org/manual/5.2/manual.html#2.4
>
>
>
> On Tue, 13 Nov 2018 at 14:23, Dirk Laurie <[hidden email]> wrote:
>>
>> On Tue, 13 Nov 2018 at 14:04, Philippe Verdy <[hidden email]> wrote:
>> >>
>> >> I'm not too sure how one could implement hiding of upvalues at the
>> >> language level. (At the implementation level, it's obvious. Just skip
>> >> the phase that looks for them.)
>> >
>> > This is not so obvious, because Lua depends heavily on this. The "phase" that looks for them is exactly the one that looks up variables in the environment through its "__index" meta-entry, which is where the environment is already stated. The first level of lookup is still required (otherwise the function would not even have access to its own local variables), but you want to stop the lookup from recursing to the next level to look for upvalues.
>> > Note that this recursion is a tail recursion, so Lua optimizes it natively as a loop. The "phase" you want to hide would be a statement within that loop, and you want it to run only on a specific iteration, breaking the loop by returning "nil" early so that an "undefined variable" error can be raised. The difficulty is that no iteration number is accessible. So all I see that you can do is set the "__index" meta-entry specifically to your needs.
>>
>> I think we are talking at cross-purposes.
>>
>> Whether a name is recognized as an upvalue happens at compile time. No
>> metatable is involved. It's a question of what is in scope.
>>
>> Do 'luac -l' for my two examples. The one without "x=1" generates the
>> instruction   GETTABLE     4 0 -1    ; "x"
>> but the one with "x=1" generates   GETUPVAL     4 0    ; x
>>
>> The scope of a name is lexical. That means there is a sequence of
>> local scopes with the entire chunk outermost, each containing a
>> smaller scope until we get to the innermost scope. The compiler does
>> this when one refers to 'x':
>>
>> 1. Is there a local variable named 'x' in the innermost scope? If so,
>> it does not need to be loaded: the VM instruction can access it
>> directly.
>> 2. For each containing scope working outwards, the question
>> is asked again. If a local variable named 'x' is found in that scope,
>> a GETUPVAL instruction is generated to load the variable via the
>> upvalue list that sits in the function's closure.
>> 3. If no containing scope has 'x', a GETTABLE instruction is issued to
>> load the value as a table access from _ENV.
>>
>> The requested "blind" keyword would merely tell the compiler to treat
>> the current innermost scope, from that point onwards, as not having a
>> containing scope, so that step 2 is an empty loop.
>>
>> You may have been thinking of what happens in case 3: the GETTABLE from
>> _ENV could trigger a whole chain of __index metamethods, depending on
>> what you have done with _ENV (in fact, since this idiom is used in an
>> object-oriented paradigm, your _ENV is an object which may well have a
>> complicated metatable).
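
The compile-time resolution described in the quoted steps can be sketched as follows. This is a minimal illustration, assuming Lua 5.2 or later (where _ENV exists); the names point, rotate_clean and rotate_captured are made up for the example, and the function body is the rotate example from the start of the thread.

```lua
-- Assumes Lua 5.2+; point, rotate_clean and rotate_captured are
-- hypothetical names used only for this illustration.
local point = {x = 0.8, y = 0.6}

-- No local 'x' is in scope yet, so 'x' and 'y' in the body are
-- compiled as field accesses on the _ENV parameter.
local function rotate_clean(_ENV, c, s)
  return {x = c*x - s*y, y = s*x + c*y}
end

local x = 1  -- a stray local, like the forgotten one in the report

-- Same body, but 'x' now resolves at compile time to the local above
-- (an upvalue of this function); only 'y' still comes from _ENV.
local function rotate_captured(_ENV, c, s)
  return {x = c*x - s*y, y = s*x + c*y}
end

local a = rotate_clean(point, 0.8, 0.6)
local b = rotate_captured(point, 0.8, 0.6)
print(a.x, a.y)  -- the intended rotation
print(b.x, b.y)  -- silently uses the stray local x = 1
```

The two functions differ only in where they appear relative to the `local x` statement, which is the whole point: no metatable or runtime lookup is involved in deciding which 'x' is meant.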

Re: Feature request: hiding upvalues

Tim Hill

> On Nov 15, 2018, at 3:22 PM, Gabriel Bertilson <[hidden email]> wrote:
>
> So the function expression
> `function () return x end` either returns an upvalue or a global, and
> the assignment `x = 10` is resolved to an assignment

Actually it doesn’t return an upvalue OR a global; it returns whatever value was in ‘x’ when the function was called.

—Tim
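
A small sketch of the distinction Tim is drawing, assuming nothing beyond standard Lua (the names get, first and second are invented for the example): the closure refers to the variable x itself, so each call sees whatever value x holds at that moment.

```lua
-- get, first and second are hypothetical names for this illustration.
local x = 1
local function get() return x end  -- 'x' here is an upvalue of get

x = 10                 -- assigns to the same variable the closure refers to
local first = get()    -- first is 10, not 1

x = 20
local second = get()   -- second is 20

print(first, second)
```

Which variable 'x' denotes is fixed when the function is compiled; which value it returns is decided each time the function is called.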


Re: Feature request: hiding upvalues

Philippe Verdy
No, it returns what was in the variable x in the upvalue (i.e. the x part of the closure where it was compiled in scope), and not the x which may be defined anywhere else the function will be called, which is not bound to this closure created at compile time.
The upvalues are in some table (even if the LuaVM uses a faster approach using an array for the environment of function closures, which allows faster and direct indexing by its internal opcodes).
The opcodes however are not relevant, they are internals of the VM, just like the limitation of the closure array size. This array does not even necessarily exist at runtime, when these opcodes may already have been transformed into another instruction set (these opcodes are not part of the language itself; they change across versions of Lua, whose default VM regularly changes its virtual instruction set and formats; the chain really exists when this closure is bound, before calling the function code, to prepare the new "register file"). Anyway, I think that what the Lua authors name "registers" is abusive: it is just a frame format, except that it uses a separate store from the call stack, and the call stack itself is not necessarily represented as a single vector but can as well be a linked list; VMs are free to use the representation that is most convenient for them and for their integration or memory constraints.
Basically I don't see any difference between the "registers" model of the Lua VM and the "[BP+n]" indexing used in most stack-based CPU instruction sets. In fact a VM just needs 3 pointer registers to work: a stack pointer, a base frame pointer, and an instruction pointer, and possibly also a flags register if it is needed by chained instructions like conditional jumps (everything else can as well be on the stack, and other registers are just faster "caches" aliasing what is on the stack, including when there's a single accumulator, which is a cache of what is at the top of the stack). Then you can model the instruction set (opcodes) as you want. Besides the instruction pointer, you can compile any language using a stack (independently of how it is structured and allocated) and two registers for the bottom and top of the stack; between the locations in the stack indicated by these two pointers there's the equivalent of the Lua "register file". Even absolute and relative jump instructions can be considered as basic arithmetic operations on generic registers, and that register could be located directly at the top of the stack, along with the base pointer, so only one register is needed: the stack pointer itself. Using actual registers outside the stack is just a local optimization.
When you realize that, you can see that there's a much wider range of ways to implement a VM and its instruction set (generated by its internal compiler, which may be multistaged, with several kinds of instruction sets and opcodes).
So each time you speak about opcodes, you are not speaking about Lua itself, but about another specific assembly language used by one of its VMs and its internal compiler. If you forget that inner detail, then conceptually at the Lua language level, there are tables for everything, the rest is only optimization made by the VM itself.

Re: Feature request: hiding upvalues

Gabriel Bertilson
In reply to this post by Tim Hill
Right, I was being sloppy. Thanks for the correction because it was misleading.

— Gabriel


Name resolution in Lua (was: Feature ...)

Dirk Laurie-2
In reply to this post by Philippe Verdy
On Fri, 16 Nov 2018 at 05:25, Philippe Verdy <[hidden email]> wrote:

> The upvalues are in some table (even if the LuaVM uses a faster approach using an array for the environment of function closures, which allows faster and direct indexing by its internal opcodes).

> The opcodes however are not relevant, they are internals of the VM

> When you realize that, you can see that there's a much wider range of ways to implement a VM and its instruction set (generated by its internal compiler, which may be multistaged, with several kinds of instruction sets and opcodes).

I think we are all agreed that the VM and its instruction set, and
therefore binary Lua chunks, are implementation details.

Where we seem to differ, is in what other concepts are also
implementation details.

Philippe's point of view is that there exists a scripting language
called Lua, for which in principle a compiler or interpreter is
possible that does not use the Lua stack model and C API, thereby
relegating that model to an implementation detail. In such an
implementation, there would be no need for any data structure other
than a table.

The rest of us take the point of view that whatever is in the manual
is part of the official specification of Lua, not an implementation
detail.

I grant Philippe the validity of his point of view. There are some
practical difficulties, such as the entire run-time library that would
have to be rewritten and re-debugged if you do not have the API at
your disposal, but in theory it is fine.

But then we come to the whole question of name resolution. Lua has an
interesting mixture: local variables, including upvalues, have lexical
scope and are resolved at compile time; global variables are resolved
at runtime. This has not necessarily always been the case, but from
Lua 5.2 it has been part of the definition of Lua that a global
variable is equivalent to a field in a table named _ENV.

There would need to be a similar table _LOC for local variables (which
includes parameters), and as Philippe has said, its __index metamethod
would be another _LOC, eventually leading to _ENV.

There is a certain economy of thought here: the use of the table as
the only data structuring mechanism is carried a step further, the
distinction between "upvalue" and "global" would disappear, closures
would not be needed.

This economy is however immediately lost in performance, involving
time and memory resources. Much of the work done in the usual model at
compile time would now (repeatedly) be done at runtime. In order to
preserve lexical rather than dynamic scoping, a new _LOC table would
be necessary for every 'local' statement.

But it does seem to be _possible_ to get a working Lua interpreter that way.
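
The table-only model above can be imitated in ordinary Lua. The following is a rough sketch under stated assumptions, not how any real implementation works: new_scope is a hypothetical helper, and the tables env, outer, inner and blind stand in for _ENV and the proposed _LOC tables.

```lua
-- new_scope, env, outer, inner and blind are hypothetical names;
-- they imitate the proposed _LOC chain with ordinary tables.
local function new_scope(parent)
  -- Names missing in this scope are looked up in the parent at runtime.
  return setmetatable({}, {__index = parent})
end

local env = {x = "global x", y = "global y"}  -- stands in for _ENV

local outer = new_scope(env)    -- scope of an enclosing block
outer.x = "outer x"             -- plays the role of 'local x'

local inner = new_scope(outer)  -- scope of a nested function
inner.c = "inner c"             -- plays the role of a parameter

-- Found locally, found one level up, and found in the environment:
print(inner.c, inner.x, inner.y)

-- In this model, hiding upvalues means chaining straight to env
-- and skipping the enclosing scopes.
local blind = new_scope(env)
print(blind.x)
```

Every name lookup here walks the chain at runtime, which is the performance cost described above. The sketch also shows a wrinkle of the model: a local explicitly set to nil would fall through to the enclosing scope, so a real implementation would need some way to mark declared-but-nil names.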

Re: Name resolution in Lua (was: Feature ...)

Philippe Verdy
My point is not that the table necessarily has to be implemented as a hash with a vector indexed by hash(key) containing pointers/references to keys and collision pointers (or a randomization function like adding some prime modulo N, where N is the size of the hash) plus a pointer/reference to values, and a separate store for keys and for values. There are lots of ways to implement tables, including the possibility for some kinds of keys or values to reduce it to a simple integer-indexed vector (i.e. an array).
So the alleged cost of using a table for upvalues is not real: this too is an implementation detail, as is the way you implement the stack (not necessarily as an array; it could just as well be a doubly linked list), so the implementation can also be "stackless" (i.e. the Lua VM engine could be used with a small constant maximum use of the native thread stack, all the rest only needing a heap, and even all Lua objects could have a constant maximum native size, allowing efficient reuse of the native heap with a low level of fragmentation, faster use of the native heap, and a low footprint on the native heap).
It's important to remember that the Lua table is a generalization of several well-known structures: lists, arrays, vectors, associative arrays, sets, trees... it can be used in fact for all kinds of collections (including collections of collections). Lua also allows tables to be linked to another table to form a chain of tables (so it also natively allows ternary trees of collections with all orders/ranks).
If we view the language this way, there's a large freedom of implementation, and many more possibilities to implement not just interpreters, but compilers as well (including JIT compilers and static compilers) than what Lua.org's implementation currently uses. Even Lua.org has changed between versions how this implementation was done. Even the opcodes were considerably changed (i.e. the VMs themselves were necessarily incompatible, but not the language itself and programs written in it). This allowed multiple implementations to appear, and allowed the language to be integrated in many more environments, including the smallest ones (e.g. on IoT devices, or even small 8-bit microcontrollers, or within other VMs for languages like Java or Javascript, or within networked environments, or within SQL engines, for example in stored procedures and triggers). Lua is extremely flexible and much less dependent than other languages like C/C++, which require a much more complex base environment and a processing model that is harder to support.
The concept of "registers" in Lua.org's implementation is nothing else than a conceptualization of an integer-indexed array which is used in the hope of improving data locality; it is not necessarily bound to native registers. It is used only because of the way its opcodes are encoded in its VM instruction set (but the same can be said as well of C/C++ stack-based implementations using [BP+offset] indexed arrays for local variables, and *caching* a few of them in native registers). The main difference is that the "words" used in Lua.org's implementation are not just n-bit integers (like in native CPUs) but can hold any Lua datatype (i.e. each is a tagged value, a TValue, which can be represented by a fixed-size object, even if some bits are actually references to other "words").

Re: Feature request: hiding upvalues

Gé Weijers
In reply to this post by Philippe Verdy


On Thu, Nov 15, 2018 at 7:25 PM Philippe Verdy <[hidden email]> wrote: 
> Anyway, I think that what the Lua authors name "registers" is abusive: it is just a frame format, except that it uses a separate store from the call stack, and the call stack itself is not necessarily represented as a single vector but can as well be a linked list; VMs are free to use the representation that is most convenient for them and for their integration or memory constraints.

The word 'registers' is used to differentiate the PUC Lua interpreter's approach of defining the VM from the typical stack machine instruction set, where most of the time the operands are the top few entries of a stack, and the instructions do not explicitly reference them. At the right level of abstraction (the VM instruction level) they are 'registers'. At the VM implementation level they're arrays of numbers/pointers with tag bits. I have to disagree with the use of the word "abusive" here.
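
The distinction can be illustrated with two toy interpreters written in Lua. This is only a sketch: the instruction names PUSH, LOADK and ADD, the table layout of the instructions, and the functions run_stack and run_register are invented for the example and are not the PUC Lua instruction set.

```lua
-- Toy stack machine: ADD names no operands; it implicitly uses the
-- top two stack entries. (Hypothetical instruction format.)
local function run_stack(program)
  local stack = {}
  for _, ins in ipairs(program) do
    if ins.op == "PUSH" then
      stack[#stack + 1] = ins.value
    elseif ins.op == "ADD" then
      local b = table.remove(stack)
      local a = table.remove(stack)
      stack[#stack + 1] = a + b
    end
  end
  return stack[#stack]
end

-- Toy register machine: every instruction explicitly names the
-- slots (a, b, c) it reads and writes.
local function run_register(program)
  local R = {}
  for _, ins in ipairs(program) do
    if ins.op == "LOADK" then
      R[ins.a] = ins.value
    elseif ins.op == "ADD" then
      R[ins.a] = R[ins.b] + R[ins.c]
    end
  end
  return R
end

local s = run_stack{
  {op = "PUSH", value = 2},
  {op = "PUSH", value = 3},
  {op = "ADD"},
}

local r = run_register{
  {op = "LOADK", a = 0, value = 2},
  {op = "LOADK", a = 1, value = 3},
  {op = "ADD", a = 2, b = 0, c = 1},
}

print(s, r[2])
```

Both machines keep their operands in an array; the difference lies in whether the instruction encodes the operand positions, which is what the term 'registers' refers to at the VM instruction level.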


Re: Feature request: hiding upvalues

Philippe Verdy
The term "word" was chosen purposely and accurately: it indicates the basic unit of data managed at the instruction level by the instruction set used by the interpreting part of the engine. Even in Lua.org's VM, this unit of information is the same as a standard Lua value: it has both a type and a value (not necessarily numeric), which are inseparable.
In a classic CPU, the "word" has no distinctive type, it's just a fixed-size set of bits with arithmetic properties.
In Lua.org's VM, the "word" is also fixed size but contains other bits that have no arithmetic properties but are associated with specific behavior affecting what the VM does or infers in addition to what is specified by the instruction set.
I used "word" exactly, I did not say "byte" or "bits", or "integer": "word" is the abstract term that covers other kinds of data (here data which includes itself its own datatype information, something not encoded at all in "words" used by CPUs, where datatype is implicit and unchangeable, and only depends on the instructions in code, plus some other constraints like the state of internal registers, notably those controlling access to external memory when these words are used as "addresses" by specific instructions).
In the formal Turing machine, a "word" is any information stored in a distinct memory cell, but the machine does not restrict at all the type of information stored there (it's not necessarily an integer in a restricted range, like in CPUs).

Re: Feature request: hiding upvalues

Sean Conner
It was thus said that the Great Philippe Verdy once stated:
> The term "word" was chosen purposely and accurately: it indicates the basic
> unit of data managed at the instruction level by the instruction set used
> by the interpreting part of the engine.

  I was taught that a "word" is the number of bits the CPU can handle
natively (or without additional work).  That's fine for most CPUs, but things
do get fuzzy.  For instance, the Intel 8088.  Internally it can do 16-bit
arithmetic so clearly its "word" is 16 bits.  *BUT* it only has an 8-bit
data bus, so retrieving a 16-bit word requires more work.  So what is the
"word" size on an 8088?  8 bits or 16 bits?

  The Motorola 68000 is similar---internally, it can handle 32-bit
arithmetic, but it too has a data bus smaller than its internal bus, in
this case, a 16-bit data bus.  So, what does "word" mean for the 68000?
16 bits or 32 bits?  [1]

  Then you have the real odd-balls of the world, like the Intel 432 [2],
which didn't have a fixed size word.  Or the PERQ, a computer system with a
writable instruction set! (want 16-bit words?  Rewrite the instruction set.
13-bit words?  Fine, rewrite the instruction set)

> Even in Lua.org's VM, this unit
> of information is the same as a standard Lua value: it has both a
> type and a value (not necessarily numeric), which are inseparable.
> In a classic CPU, the "word" has no distinctive type, it's just a
> fixed-size set of bits with arithmetic properties.
>
> In Lua.org's VM, the "word" is also fixed size but contains other bits that
> have no arithmetic properties but are associated with specific behavior
> affecting what the VM does or infers in addition to what is specified by
> the instruction set.

  Oh, then there are the various LISP machines, made in the late 70s---they
had tagged memory.  So in addition to the "data bits" part of the
"word", they also had "tag bits" that existed solely to impart type
information for the "data bits" part of the word.  So it's inaccurate to say
that a "word" is just a collection of undifferentiated bits.



> I used "word" exactly, I did not say "byte" or "bits", or "integer": "word"
> is the abstract term that covers other kinds of data (here data which
> includes itself its own datatype information, something not encoded at all
> in "words" used by CPUs, where datatype is implicit and unchangeable, and
> only depends on the instructions in code, plus some other constraints like
> the state of internal registers, notably those controlling access to
> external memory when these words are used as "addresses" by specific
> instructions).

  I could not follow this at all.  

  -spc

[1] For me, the 8088 has a 16-bit word, and the 68000 a 32-bit word, but
        I'm a software engineer.  Were I an electrical engineer, I'd give a
        different answer.

[2] Never actually released as far as I can tell.

Re: Feature request: hiding upvalues

Philippe Verdy


On Fri, 16 Nov 2018 at 23:04, Sean Conner <[hidden email]> wrote:
> Oh, then there are the various LISP machines, made in the late 70s---they
> had tagged memory.  So in addition to the "data bits" part of the
> "word", they also had "tag bits" that existed solely to impart type
> information for the "data bits" part of the word.  So it's inaccurate to say
> that a "word" is just a collection of undifferentiated bits.

Reread: you're reformulating what I said. I never said that a word was a collection of undifferentiated bits.

This is exactly the opposite of what I said (and this is notably applicable to Lua "words" whose inner "bits" are differentiated between datatype indicators and actual data (to represent numeric or string values or nil or table instances, or functions; we also have to add "closures", i.e. the instantiated contexts in which functions or coroutines are operating): "words" in Lua are tagged objects with a small size (fixed for each type, even if they are linked to other "words" to create a larger structure, notably for strings, tables, functions, and coroutines). The fact that these tagged objects have a small fixed size allows storing them in a single "word", which is then countable and referenceable by the instruction set defined by the Lua VM.

So yes there's a valid concept of "word" in that context.

In Lisp the fundamental unit of information is the "node" in a binary tree: there's a differentiation only between the left and right links of the node (which is just an ordered pair), and in a very few bits reserved in each link of the pair (allowing one to differentiate a "nil" reference from an actual node reference, or a scalar value); both items of the pair are identical and you could argue that the fundamental unit of information is one of these two items, but these items by themselves do not allow any construction of larger data structures (lists, stacks, sets, tables, graphs...) and do not allow creating an ordered flow of instructions. Instructions in any Lisp VM are themselves represented by these "nodes" (so the language naturally has reflection/introspection capability, it can even transform "functions", i.e. what they will do when executed).

Re: Feature request: hiding upvalues

Sean Conner
It was thus said that the Great Philippe Verdy once stated:

> On Fri, 16 Nov 2018 at 23:04, Sean Conner <[hidden email]> wrote:
>
> > Oh, then there are the various LISP machines, made in the late 70s---they
> > had tagged memory.  So in addition to the "data bits" part of the
> > "word", they also had "tag bits" that existed solely to impart type
> > information for the "data bits" part of the word.  So it's inaccurate to
> > say
> > that a "word" is just a collection of undifferentiated bits.
> >
>
> Reread: you're reformulating what I said. I never said that a word was a
> collection of undifferentiated bits.

  You are correct.  I've gone back and carefully read through your walls of
text.  I read the following:

> In a classic CPU, the "word" has no distinctive type, it's just a
> fixed-size set of bits with arithmetic properties.

  In *many* CPU architectures, yes, a word is "a fixed-size set of bits" (I
have no idea how to interpret "arithmetic properties" so I'm leaving that
out), but not *ALL* CPU architectures are like that---that's what I was
talking about.

  You failed to understand what I meant by a LISP machine.  A LISP machine
is (or rather, was, they're no longer being manufactured but they still
exist and some even still run) a machine specifically built to efficiently
run LISP *at the CPU level*.  In fact [1]:

        The Symbolics 3600 family is a line of 36-bit single-user computers
        designed for high-productivity software development and for the
        execution of large symbolic programs. 3600-family processors give
        the user all the computational power associated with multi-user
        timesharing computers in a dedicated workstation. This is
        accomplished via a new and unique machine architecture that supports
        high-speed symbol processing operations directly in hardware. For
        example, *** every word in a Symbolics computer's virtual memory is
        tagged with data type bits ***---hence the name tagged architecture
        to describe 3600-family processors. The processor reads these bits
        to prevent illegal operations. As an added benefit, tag bits reduce
        the need for data type declarations in programs.

                (emphasis added)

> This is exactly the opposite of what I said
>
> (and this is notably applicable to Lua "words" whose inner "bits" are
> differentiated between datatype indicators and actual data
>
>         (to represent numeric or string values or nil or table instances,
>         or functions; we also have to add "closures", i.e. the
>         instantiated contexts in which functions or coroutines are
>         operating)
>
> :"words" in Lua are tagged objects with a small size
>
>         (fixed for each type, even if they are linked to other "words" to
>         create a larger structure, notably for strings, tables, functions,
>         and coroutines)
>
> . The fact that these tagged objects have a small fixed size allows storing
> them in a single "word", which is then countable and referenceable by the
> instruction set defined by the Lua VM.

  You're missing a closing parenthesis.  This is another reason your text is
hard to follow.

> So yes there's a valid concept of "word" in that context.
>
> In Lisp the fundamental unit of information is the "node" in a binary
> tree: there's a differentiation only between the left and right links of
> the node (which is just an ordered pair), and in a very few bits reserved in
> each link of the pair (allowing one to differentiate a "nil" reference from an
> actual node reference, or a scalar value); both items of the pair are
> identical and you could argue that the fundamental unit of information
> is one of these two items, but these items by themselves do not allow any
> construction of larger data structures (lists, stacks, sets, tables,
> graphs...) and do not allow creating an ordered flow of instructions.

  I am not following you here.  I'm not even sure what you are trying to say
here.

> Instructions in any
> Lisp VM are themselves represented by these "nodes" (so the language
> naturally has reflection/introspection capability, it can even transform
> "functions", i.e. what they will do when executed).

  Yes, there are other languages that can transform their own code---Ruby,
Forth, Smalltalk, even Assembly if you know what you are doing.

  -spc

[1] http://smbx.org/symbolics-technical-summary/

Re: Feature request: hiding upvalues

Tim Hill
In reply to this post by Sean Conner


On Nov 16, 2018, at 2:04 PM, Sean Conner <[hidden email]> wrote:

> I was taught that a "word" is the number of bits the CPU can handle
> natively (or without additional work).  That's fine for most CPUs, but things
> do get fuzzy.  For instance, the Intel 8088.  Internally it can do 16-bit
> arithmetic so clearly its "word" is 16 bits.  *BUT* it only has an 8-bit
> data bus, so retrieving a 16-bit word requires more work.  So what is the
> "word" size on an 8088?  8 bits or 16 bits?

Same here, but I always felt it was driven by the CPU architecture, not the bus width (after all, a modern x86 CPU has VERY wide busses, sometimes 256 bits, but no one is claiming it's a 256-bit architecture). To my mind the 8088 was a 16-bit CPU since most instructions (add/sub/xor/neg etc) naturally worked on 16-bit values in 16-bit registers (notwithstanding using AH/AL etc).

I think “word” wrt CPU architecture was never a precise term, and probably never will be.

--Tim

Re: Feature request: hiding upvalues

Dirk Laurie-2
On Sat, 17 Nov 2018 at 04:31, Tim Hill <[hidden email]> wrote:

> I think “word” wrt CPU architecture was never a precise term, and probably never will be.

Oh, a mathematically precise definition is possible. "A word is an
individually addressable entity larger than the smallest such entity."
Whether this definition is useful can be established only by a
disciple of Bourbaki (one of whom, an I mistake not, has recently
joined this list).
