Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.
okay so a while back I talked about this "unset foo" idea, that would
allow you to get through to shadowed names.

the example I gave was basically:

local foo = 1
do
   local foo = 2
   do
     unset foo
     foo = 3
   end
   assert(foo == 2)
end
assert(foo == 3)

(I don't remember the exact examples I gave but I like this one.)

anyway, nobody liked it back then and I don't expect anyone to like it now.

but since we're talking about __close and toclose and stuff, why not use
__unset instead? it's called when a variable is unset i.e. goes out of
scope. (as long as we don't get the ability to unset variables like I
proposed with non-linear lexical scoping, at least.)

it still doesn't solve the "what to name the <toclose>" problem, but,
I'd prefer __unset over __close, personally. (ideally we'd also have a
__set for when the value is assigned to one of those special variables,
but I digress.)
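
For context, here is a minimal sketch of the scope-exit hook being discussed. It assumes the spelling that ended up in the Lua 5.4 release: a `<close>` attribute on the local and a `__close` metamethod on its value. The proposal above would only change the metamethod name to `__unset`. The `new_resource` helper and the `events` table are made up for illustration.

```lua
-- Requires Lua 5.4: the <close> attribute does not exist in earlier versions.
local events = {}

-- Hypothetical helper: returns a value whose __close metamethod records
-- the moment the variable holding it goes out of scope.
local function new_resource(name)
  return setmetatable({ name = name }, {
    __close = function(self, err)
      events[#events + 1] = "closed " .. self.name
    end,
  })
end

do
  local a <close> = new_resource("a")
  local b <close> = new_resource("b")
  events[#events + 1] = "body"
end -- b is closed first, then a (reverse order of declaration)

print(table.concat(events, ", "))  -- body, closed b, closed a
```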


Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy
Your example is counterintuitive: why does the last assert(foo == 3) have to be true? (The outer scope was modified by the use of "unset" in the doubly nested inner scope.) Why would foo = 3 not affect the middle scope instead, and then throw an error on assert(foo == 2)?
If that assert(foo == 2) passes, then "unset foo" has no effect other than being equivalent to "local foo -- = nil", and then the last assertion, assert(foo == 3), should fail because foo is still equal to 1.
Your idea would wreak security havoc: programs could strip away the protection layers to get access to internal variables that are normally inaccessible to them.
The only safe behavior would be for "unset foo" to act like "local foo" (so "unset" is not necessary at all), but still differently from the assignment "foo = nil", which overwrites the value of the same variable in scope (only the former value of that variable will be closed and garbage collected, but the scope does not change: that same variable continues to be used in other references, notably in other function closures).
Allowing a program to control the scoping of outer variables it did not declare itself is a crazy and dangerous idea.



Re: Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.


On 2019-06-27 8:47 p.m., Philippe Verdy wrote:

> Your example is counterintuitive: why does the last assert(foo == 3)
> have to be true? (The outer scope was modified by the use of "unset"
> in the doubly nested inner scope.) Why would foo = 3 not affect the
> middle scope instead, and then throw an error on assert(foo == 2)?
> If that assert(foo == 2) passes, then "unset foo" has no effect other
> than being equivalent to "local foo -- = nil", and then the last
> assertion, assert(foo == 3), should fail because foo is still equal to 1.
> Your idea would wreak security havoc: programs could strip away the
> protection layers to get access to internal variables that are normally
> inaccessible to them.
> The only safe behavior would be for "unset foo" to act like "local foo"
> (so "unset" is not necessary at all), but still differently from the
> assignment "foo = nil", which overwrites the value of the same variable
> in scope (only the former value of that variable will be closed and
> garbage collected, but the scope does not change: that same variable
> continues to be used in other references, notably in other function
> closures).
> Allowing a program to control the scoping of outer variables it did
> not declare itself is a crazy and dangerous idea.

If you're concatenating attacker code into your program code, you have
bigger issues. (see: SQL injection)

You cannot unset globals and upvalues. It is an error.




Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy
On Fri, 28 June 2019 at 22:07, Soni "They/Them" L. <[hidden email]> wrote:
> On 2019-06-27 8:47 p.m., Philippe Verdy wrote:
> > Allowing a program to control the scoping of outer variables they did
> > not declare themselves is a crazy and dangerous idea.
>
> If you're concatenating attacker code into your program code, you have
> bigger issues. (see: SQL injection)

True, but this does not contradict what I said; it's an independent consideration.

> You cannot unset globals and upvalues. It is an error.

There are no such "globals" in Lua; there are only "closures" (containing upvalues, including "_G").

Yes, it is an error if you try to unset them, because they are not really in a scope you can override.

But even if you're in a simple do/end block, or in a for...end loop declaring local loop variables, or after any "local" declaration, you have a new (embedded) lexical scope that should still behave like a closure and offer the same protection.

Allowing such "unsets" of variables in outer scopes (that are still not in an outer closure) is also crazy and dangerous, unless you say that, to get the protection, you need to create and call a local function to explicitly create a new closure: the code in the function must never be able to unset variables from any outer visibility scope which is not inside the current closure.



Re: Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.


On 2019-06-29 2:57 p.m., Philippe Verdy wrote:

> On Fri, 28 June 2019 at 22:07, Soni "They/Them" L. <[hidden email]> wrote:
>
>     On 2019-06-27 8:47 p.m., Philippe Verdy wrote:
>     > Allowing a program to control the scoping of outer variables they
>     > did not declare themselves is a crazy and dangerous idea.
>
>     If you're concatenating attacker code into your program code, you
>     have bigger issues. (see: SQL injection)
>
>
> True, but this does not contradict what I said; it's an independent
> consideration.
>
>     You cannot unset globals and upvalues. It is an error.
>
>
> There are no such "globals" in Lua; there are only "closures"
> (containing upvalues, including "_G").
>
> Yes, it is an error if you try to unset them, because they are not
> really in a scope you can override.
>
> But even if you're in a simple do/end block, or in a for...end loop
> declaring local loop variables, or after any "local" declaration, you
> have a new (embedded) lexical scope that should still behave like a
> closure and offer the same protection.
>
> Allowing such "unsets" of variables in outer scopes (that are still not
> in an outer closure) is also crazy and dangerous, unless you say that,
> to get the protection, you need to create and call a local function to
> explicitly create a new closure: the code in the function must never
> be able to unset variables from any outer visibility scope which is
> not inside the current closure.
>
>

There's nothing unsafe about "unset" because it's exactly equivalent to
renaming your (local) variables.
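
Read this way, the example from the first message can already be written in today's Lua by giving each local a distinct name. A small sketch; the names `foo_outer` and `foo_middle` are illustrative and not part of the proposal:

```lua
local foo_outer = 1
do
  local foo_middle = 2
  do
    -- "unset foo" would make the bare name "foo" refer to the outer local
    -- again; with distinct names we simply say which variable we mean.
    foo_outer = 3
  end
  assert(foo_middle == 2)
end
assert(foo_outer == 3)
```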


Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy
On Sat, 29 June 2019 at 21:11, Soni "They/Them" L. <[hidden email]> wrote:
>
> There's nothing unsafe about "unset" because it's exactly equivalent to
> renaming your (local) variables.


If you unset a variable to get the effect of hiding it completely and get access to the homonymous variable from an outer scope, and then modify it, it is unsafe.

This does not happen when you redeclare a local variable, which puts the previous one out of scope and makes it unreachable. Likewise, when you set that variable to nil, this does not allow the homonymous variable from a previous scope to become accessible.

Really, "unset" is unnecessary and dangerous. Just use "local" instead.
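
The two behaviours described here, redeclaring with "local" and assigning nil, can be checked in stock Lua. A minimal sketch with made-up values:

```lua
local foo = "outer"
do
  local foo = "inner"   -- shadows the outer foo
  foo = nil             -- clears the inner variable only
  assert(foo == nil)    -- the outer value does not show through

  local foo = "again"   -- a new local; the previous inner foo is now unreachable
  assert(foo == "again")
end
assert(foo == "outer")  -- the outer variable was never touched
```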
 

Re: Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.


On 2019-06-29 4:18 p.m., Philippe Verdy wrote:

> On Sat, 29 June 2019 at 21:11, Soni "They/Them" L. <[hidden email]> wrote:
> >
> > There's nothing unsafe about "unset" because it's exactly equivalent to
> > renaming your (local) variables.
>
>
> If you unset a variable to get the effect of hiding it completely and
> get access to the homonymous variable from an outer scope, and then
> modify it, it is unsafe.
>
> This does not happen when you redeclare a local variable, which puts
> the previous one out of scope and makes it unreachable. Likewise, when
> you set that variable to nil, this does not allow the homonymous
> variable from a previous scope to become accessible.
>
> Really, "unset" is unnecessary and dangerous. Just use "local" instead.

"unset" is equivalent to "end". it just happens to, itself, be lexically
scoped.

if each block (scope) could have a name and you could overlap them
however you like:

do 'foo'
   local x = 1
   if a then 'bar'
     end 'foo'
     local z = 2
     x = z
     do 'foo'
   end 'bar'
   assert(x==1)
end 'foo'

"unset" is no more dangerous than using different names.


Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy
You may want to do that, but your code will be a horror with those pseudo "scope labels".

And the syntax you propose cannot work, as it is ambiguous (you use quoted strings in places where they could start an expression statement). Do you want to repeat the nightmare of semicolons at the end of statements?

So, "end" is not equivalent to "local": just use "local" instead to hide explicit variables, and do not allow "unhiding" specific variables from outer scopes (it would be a very bad practice in my opinion). If you use "end", it terminates the current closure of the function, or all "local" or for variables that fall out of scope (and cannot be "resumed in scope" after it). Scopes are necessarily recursively embedded and must never overlap partially (scopes can only overlap entirely, or not at all; each "local" statement starts an embedded scope which fully overlaps the previous scope, until the "end" of the function closure or the "end" of the block). Scopes must form a perfect hierarchic tree with no shared subbranches.
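
The strictly nested model described above is how stock Lua behaves today. A minimal sketch (the function names `read_first` and `read_second` are invented for the example) showing that each "local" opens a new scope without disturbing closures that captured the earlier variable:

```lua
local x = 1
local function read_first() return x end   -- captures the first x

local x = 2   -- starts a new nested scope; this does not assign to the first x
local function read_second() return x end  -- captures the second x

assert(read_first() == 1)
assert(read_second() == 2)

do
  local x = 3  -- innermost scope, which ends at the matching "end"
  assert(x == 3)
end
assert(x == 2) -- back in the enclosing scope; nothing was "resumed"
```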







Re: Non-linear lexical scoping (again) and an idea about __unset

Tim Hill
In reply to this post by Soni "They/Them" L.


> On Jun 26, 2019, at 1:27 PM, Soni They/Them L. <[hidden email]> wrote:
>
> okay so a while back I talked about this "unset foo" idea, that would allow you to get through to shadowed names.
>
> the example I gave was basically:
>
> local foo = 1
> do
>   local foo = 2
>   do
>     unset foo
>     foo = 3
>   end
>   assert(foo == 2)
> end
> assert(foo == 3)
>
> (I don't remember the exact examples I gave but I like this one.)
>
> anyway, nobody liked it back then and I don't expect anyone to like it now.
>
> but since we're talking about __close and toclose and stuff, why not use __unset instead? it's called when a variable is unset i.e. goes out of scope. (as long as we don't get the ability to unset variables like I proposed with non-linear lexical scoping, at least.)
>
> it still doesn't solve the "what to name the <toclose>" problem, but, I'd prefer __unset over __close, personally. (ideally we'd also have a __set for when the value is assigned to one of those special variables, but I digress.)
>

I’m sure this was asked before. What is the use case for this? What problem does it solve? My feeling is that any actual need for this is probably an indication of bad software design/coding. In general, you should only be reaching “outwards” into scopes that you understand and control... and if you understand and control them, then all you need to do is rename the inner locals. And if you don't control them, you are really taking a risk accessing them.

—Tim



Re: Non-linear lexical scoping (again) and an idea about __unset

Jim-2
In reply to this post by Philippe Verdy
On Sat, Jun 29, 2019 at 10:11:49PM +0200, Philippe Verdy wrote:
> to repeat the nightmare of semicolons at end of statements ?

how is that a "nightmare" ?
it just marks the end of a statement and thereby helps to ensure a free
form syntax not giving special meaning to any white space.

> perfect hierarchic tree with no shared subranches.

what does that mean ?



Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy


On Mon, 1 July 2019 at 22:13, Jim <[hidden email]> wrote:
> On Sat, Jun 29, 2019 at 10:11:49PM +0200, Philippe Verdy wrote:
> > to repeat the nightmare of semicolons at end of statements ?
>
> how is that a "nightmare" ?

It's a nightmare because of the cases where it is absent (a syntax ambiguity which is resolved by a hack in the parser, and partly resolved by adding the semicolon, but only optionally); not because of the cases where it is present (where it is unambiguous).

For this reason, several quality projects for Lua highly recommend using semicolons (actually empty statements) everywhere... except that this empty semicolon must now be accepted after a goto, or a return, or after the end of an infinite loop, and after a reachable label (one that has "goto" statements pointing to it, where the label is in lexical scope from the "goto" statement), even if this goes against the common practice of removing them (but only where this is not ambiguous, i.e. not before function calls without any assignment, which are the only allowed case of "expression statements").

Yes, it is a nightmare that was introduced by the currying syntax (which is "syntactic sugar" initially added to Lua without serious prior discussion about its syntactic effect).
No such nightmare would have occurred if either:
  - semicolons were always needed (like in C/C++/Java), or
  - they were removable by a cleaner syntax, discriminating whitespace/comments when they contain at least one newline (like in Javascript/ECMAscript, but this also creates some problems when the code presentation is changed, for refactoring purposes or because the code is to be packed to remove all "unnecessary" whitespace/comments, because semicolons then need to be added everywhere a newline was significant).

If we leave aside the problem of code packing (which will be done by bots using true syntactic parsers), there remains the problem of code refactoring (done by humans who may not see that the newline in Lua is just like any regular whitespace):

This causes unexpected bugs when the code is changed by humans (e.g. by removing a statement just before a function call statement, or because a user may not see that what "looks like" a function call statement was actually the continuation of a curried function call, and then could insert statements in the middle).
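
The ambiguity in question is presumably the one the Lua reference manual itself warns about: a line that starts with an opening parenthesis is parsed as a call on the expression that ends the previous line. A minimal sketch, assuming Lua 5.2 or later (Lua 5.1 rejects the first chunk as ambiguous syntax instead); the `record` function and the variable names are made up for the example:

```lua
-- Assumes Lua 5.2+, where load accepts a source string.
-- Both chunks are identical except for one semicolon.
local without_semicolon = load([[
  local function record(x) return x end
  local value = record
  (record)("second")
  return type(value)
]])

local with_semicolon = load([[
  local function record(x) return x end
  local value = record;
  (record)("second")
  return type(value)
]])

-- Without the semicolon the parser reads: value = record(record)("second")
print(without_semicolon())  -- string
-- With it, the second line is a separate call statement.
print(with_semicolon())     -- function
```

This is also why the manual suggests putting a semicolon in front of statements that start with a parenthesis.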

 

Re: Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.


On 2019-07-03 8:04 a.m., Philippe Verdy wrote:

>
>
> On Mon, 1 July 2019 at 22:13, Jim <[hidden email]> wrote:
>
>     On Sat, Jun 29, 2019 at 10:11:49PM +0200, Philippe Verdy wrote:
>     > to repeat the nightmare of semicolons at end of statements ?
>
>     how is that a "nightmare" ?
>
>
> It's a nightmare because of the cases where it is absent (a syntax
> ambiguity which is resolved by a hack in the parser, and partly resolved
> by adding the semicolon, but only optionally); not because of the cases
> where it is present (where it is unambiguous).
>
> For this reason, several quality projects for Lua highly recommend
> using semicolons (actually empty statements) everywhere... except
> that this empty semicolon must now be accepted after a goto, or a
> return, or after the end of an infinite loop, and after a reachable
> label (one that has "goto" statements pointing to it, where the label
> is in lexical scope from the "goto" statement), even if this goes
> against the common practice of removing them (but only where this is
> not ambiguous, i.e. not before function calls without any assignment,
> which are the only allowed case of "expression statements").
>
> Yes, it is a nightmare that was introduced by the currying syntax
> (which is "syntactic sugar" initially added to Lua without serious
> prior discussion about its syntactic effect).
> No such nightmare would have occurred if either:
>   - semicolons were always needed (like in C/C++/Java), or
>   - they were removable by a cleaner syntax, discriminating
> whitespace/comments when they contain at least one newline (like in
> Javascript/ECMAscript, but this also creates some problems when the
> code presentation is changed, for refactoring purposes or because the
> code is to be packed to remove all "unnecessary" whitespace/comments,
> because semicolons then need to be added everywhere a newline was
> significant).
>
> If we leave aside the problem of code packing (which will be done by
> bots using true syntactic parsers), there remains the problem of code
> refactoring (done by humans who may not see that the newline in Lua
> is just like any regular whitespace):
>
> This causes unexpected bugs when the code is changed by humans (e.g.
> by removing a statement just before a function call statement, or
> because a user may not see that what "looks like" a function call
> statement was actually the continuation of a curried function call,
> and then could insert statements in the middle).
>

If you indent with semicolons it'll never be an issue.


Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy
On Wed, 3 July 2019 at 23:42, Soni "They/Them" L. <[hidden email]> wrote:
> If you indent with semicolons it'll never be an issue.

Maybe, but what a horror!

And don't forget that programmers frequently have lines that are too long and need to be broken.
So these lines will NOT start with semicolons after the indentation.

Such a convention of using semicolons at the start of lines is never seen, and is much less readable than semicolons at the end of instructions.
And anyway the semicolon in Lua is neither an end of instruction nor a start, but a separate no-op instruction, allowed in some contexts where other instructions are disallowed, such as after a return already "terminated" by a semicolon (so "return;;" is valid even if there are two no-op instructions, but "return;f();" is not, with only the first ";" being correct) or after a goto, or after the end of a never-ending loop (so "while true do... end;;" is valid, but "while true do... end;;f();" is not), or after any expression or assignment statement that always terminates with an unconditional "error".

Note that if the expression unconditionally terminates with an "error", you cannot even use it in an assignment instruction (because the assignment would never occur) or in the initializer of a "local" declaration (so "local a,b = 1,error" is invalid, and neither "a" nor "b" would be initialized); likewise, you cannot use it in a "return" statement or in a subexpression in parentheses with other trailing or leading operators.

These criteria for the validity of the no-op ";" statement and of "error" in expressions are only partially detected by the syntactic parser; they are only detected by the downstream code flow analysis, but a compiler may still accept these constructs as valid and either silently drop the extra code, or signal it to the programmer with a lint-like warning, or say that this code is most probably wrong and refuse to compile it (which is IMHO the best option).
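
Whether a given implementation accepts these constructs can be checked by compiling each snippet without running it. A minimal sketch using the standard `load` function; the `compiles` helper is a name invented here. The results noted in the comments are those of the reference PUC-Rio parser (Lua 5.2 to 5.4), which does no flow analysis and differs from the rules described above on several points: it allows a single optional semicolon after "return" and nothing after it, and it accepts the loop and "error" snippets. A linter or a stricter compiler could of course behave differently.

```lua
-- Assumes the reference Lua 5.2+ interpreter, where load accepts a string.
-- load only compiles the chunk; the infinite loop below is never executed.
local function compiles(source)
  return load(source) ~= nil
end

local snippets = {
  "return;",                  -- accepted: one optional semicolon after return
  "return;;",                 -- rejected: nothing may follow that semicolon
  "return; f();",             -- rejected: return must end the block
  "while true do end;; f();", -- accepted: the parser does no flow analysis
  "local a, b = 1, error",    -- accepted: error is just a function value here
}

for _, source in ipairs(snippets) do
  local verdict = compiles(source) and "compiles" or "rejected"
  print(string.format("%-28s %s", source, verdict))
end
```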



Re: Non-linear lexical scoping (again) and an idea about __unset

Soni "They/Them" L.


On 2019-07-05 9:28 p.m., Philippe Verdy wrote:

> On Wed, 3 July 2019 at 23:42, Soni "They/Them" L. <[hidden email]> wrote:
>
>     If you indent with semicolons it'll never be an issue.
>
>
> Maybe, but what a horror!
>
> And don't forget that programmers frequently have lines that are too
> long and need to be broken.
> So these lines will NOT start with semicolons after the indentation.

Indeed, as such you know exactly what's a continuation and what isn't.




Re: Non-linear lexical scoping (again) and an idea about __unset

Philippe Verdy


On Sat, 6 July 2019 at 02:48, Soni "They/Them" L. <[hidden email]> wrote:
> On 2019-07-05 9:28 p.m., Philippe Verdy wrote:
> > On Wed, 3 July 2019 at 23:42, Soni "They/Them" L. <[hidden email]> wrote:
> >
> >     If you indent with semicolons it'll never be an issue.
> >
> >
> > Maybe, but what a horror!
> >
> > And don't forget that programmers frequently have lines that are too
> > long and need to be broken.
> > So these lines will NOT start with semicolons after the indentation.
>
> Indeed, as such you know exactly what's a continuation and what isn't.

But "indenting" with semicolons at the start of lines does not solve that problem in Lua, for a reason that is not so obvious: doing it would be intended to solve the ambiguity, but in fact it does not at all. If there's no semicolon at the start of a line, some programmers may still think it is missing and add one "to align the code" visually, when in fact it would break the code.

Really I prefer adding semicolons at end of lines (like in C/C++/Java) to explicitly "terminate" statements.

For continuation lines (after breaking long statements into multiple lines), it is strongly recommended to systematically indent the continuation lines by at least one additional notch, to give a visual hint that this is effectively a continuation and that no one should accidentally insert a statement in the middle of another one. But this does not change the recommendation of using semicolons at the end of statements, except after the "end" keyword, where it is normally never needed: if you ever define an anonymous "function...end" that you want to use inside an expression, followed immediately by a curried function call with a parameter, it is highly recommended to surround the anonymous function with parentheses, like "(function...end) parameter", but if that parameter is split onto a separate line, it must be indented as well.
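
A minimal sketch of the "(function...end) parameter" form mentioned above, using Lua's parenthesis-free call syntax for a single string or table argument. The variable names are invented for the example. In stock Lua the parentheses around the function constructor are in fact required by the grammar before it can be called.

```lua
-- Parenthesized anonymous function called with a string argument.
local shout = (function(s) return s:upper() .. "!" end) "hello"
assert(shout == "HELLO!")

-- When the argument moves to its own line, indenting it signals that the
-- line continues the call rather than starting a new statement.
local table_size = (function(t) return #t end)
    { 1, 2, 3 }
assert(table_size == 3)
```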

Lua's curried functions are the origin of these horrors: they are inconsistently defined syntactically, and they are unnecessary syntactic sugar causing all these problems.