future of annotations in Lua?

Re: future of annotations in Lua?

Philippe Verdy
Another goal of Lua (and of JavaScript) is that the language must be usable interactively (e.g. in a developer or debugger "console"). This forbids syntactic constructs in the LALR definition that require rolling back an already processed "reduce" action (which would have caused some action to be already taken and executed).

And this is where C++ fails completely (even if its syntax is LALR): it can only compile program units as a whole (by using highly contextual shift/reduce conflict resolutions and building an entire "abstract syntax tree" (AST) before even trying to compile it), not interactively, so it is not usable as a "scripting language". A C++ parser can therefore only be used in a first pass, and "execution" or "compilation" (of the AST) can only occur in a completely separate stage.

By contrast, a Lua (or JavaScript) parser is not required to build an AST (one may still help in a compiler, but not in an interactive developer console): it can compile and run the code directly while parsing it, even if the program unit is not complete. This is what qualifies Lua (or JavaScript) as a valid scripting language.
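
As a rough illustration of the interactive use described above, here is a minimal sketch of how a console can tell a finished chunk from one that merely needs more input. It relies on stock Lua reporting premature end of input with a message ending in <eof>, which is the heuristic the standalone lua interpreter uses. The helper name classify and the chunk name "=console" are made up for this example.

```lua
-- Lua 5.1 compiles strings with loadstring; 5.2+ uses load.
local compile = loadstring or load

-- Hypothetical helper: returns "complete", "incomplete" or "error".
local function classify(src)
  local fn, err = compile(src, "=console")
  if fn then
    return "complete", fn
  end
  -- A message ending in <eof> (quoted in 5.1) means the parser ran out
  -- of input, so a console would prompt for another line.
  if err:find("<eof>'?$") then
    return "incomplete", err
  end
  return "error", err
end

print((classify("x = 1")))       -- complete
print((classify("if x then")))   -- incomplete
print((classify("x = = 1")))     -- error
```

A console loop would keep appending input lines to the buffer for as long as the result is "incomplete", and run the returned function once it is "complete".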

If Lua had made the ";" mandatory at the end of every "expression statement" (instead of optional), there would not even be any conditional shift/reduce resolution: the Lua parser would use only tail calls, with no cost at all and no recursive stacking, and would never need to build an AST.



On Sat, 8 June 2019 at 06:55, Philippe Verdy <[hidden email]> wrote:
Lua's syntax is designed under the constraint that it can be parsed, as much as possible, with an LR(1) parser (a single token of look-ahead). But Lua is actually LALR, because it must also support a few rollbacks to resolve a few shift/reduce conflicts.

Implementing these rollbacks adds a cost in the parser, because it can no longer be written simply with tail calls: the parsing functions have to return a validation status that must be tested, so they stack up in memory with all their local context (unlike tail calls, which Lua easily compiles as simple jumps).

If we add syntactic features to Lua, we should keep this in mind and not turn Lua into a Fortran-like language (whose parsing is extremely complex because whitespace is optional between all keywords and identifiers, and which is very tricky to write and test with enough coverage) or into a C++-like language (also very tricky).



On Sat, 8 June 2019 at 06:41, Philippe Verdy <[hidden email]> wrote:
But if I write:
  a = b
  ;(break 'here')
  c = d
it is still functionally equivalent (in the core language) to the 3 statements (including the empty statement):
  a = b
  ;
  c = d
but simply not to:
  a = b
  c = d
In that last case, there's an ambiguity because "b c" may potentially be a curried function call "b(c)".
That ambiguity already complicates the existing Lua parsing, because the compiler MUST roll back after parsing "a = b c" when it sees the next "=" (there cannot be two assignments in the same statement).

This is the known case where Lua's syntax has a "shift/reduce conflict" that is not always resolved simply as a "shift" (which still has priority): the compiler MUST support a rollback at this point so that it can retry with a "reduce". It is also why the empty statement ";" was necessary in the syntax: the priority given to the "shift" may remain valid up to some later point in the token stream, in which case there will be an inner reduce later that cannot be safely rolled back and that cancels any prior candidate shift/reduce rollback point.
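
The separating role of ";" can be observed in stock Lua. One caveat: as far as I know, the parenthesis-free call form there only accepts a string literal or a table constructor as its argument, so this sketch uses a table constructor rather than a bare name like the "c" of the example. The names f, a, b and calls are hypothetical.

```lua
local calls = 0
local function f(arg)
  calls = calls + 1
  return arg
end

-- No ";" after f: the constructor on the next line is taken as the
-- argument of a call, so this is parsed as  local a = f{ 1, 2, 3 }
local a = f
{ 1, 2, 3 }

-- With ";" the statement ends at f, so b is the function itself.
local b = f;

print(type(a), calls, b == f)
```

Only the first declaration calls f; the second merely copies the function value.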


On Sat, 8 June 2019 at 06:30, Philippe Verdy <[hidden email]> wrote:
Example of use of an annotation on an empty statement:
   a = 1
   ;(break 'here')
   b = 2
here "(break 'here')" is the "break" annotation, which may have optional, possibly curried, parameters; it is written entirely between parentheses because "break" without them is a core language statement.

If the annotation is ignored (because no debugger is using it and the code is just run with a core engine), then it is equivalent to the 3 statements:
   a = 1
   ;
   b = 2
with the same empty statement, which is not necessary here because "1 b" is not a valid curried function call "1(b)" (a numeric constant is not an object and has no function members); so it is also equivalent to the two separate statements:
  a = 1 b=2
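
A quick check that stock Lua treats the two layouts alike; a minimal sketch in which the source is compiled from strings so that the exact layout is preserved. The helper run is hypothetical, and a trailing return is added only to read the values back.

```lua
local compile = loadstring or load

-- Hypothetical helper: compile a chunk from a string and run it.
local function run(src)
  return assert(compile(src))()
end

-- Two assignments on one line, with no ";" between them.
local a1, b1 = run("a = 1 b = 2 return a, b")
-- The same assignments separated by an isolated ";".
local a2, b2 = run("a = 1\n;\nb = 2\nreturn a, b")

print(a1 == a2, b1 == b2)   -- true   true
```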


On Sat, 8 June 2019 at 06:04, Philippe Verdy <[hidden email]> wrote:
I see only one useful case for annotating an empty statement (i.e. just after a single ";", which is always isolated and has no further tokens in any context): emitting some debugger info (or an execution-tracking log entry, or a breakpoint) that applies to a reachable point of execution, i.e. between two separate statements that are not otherwise annotated themselves.


On Sat, 8 June 2019 at 05:55, Philippe Verdy <[hidden email]> wrote:
And given the way Lua parses the ";" (only as an empty statement, which is a no-op), adding an annotation just after it would make no sense.
There's no way to unambiguously allow an annotation at the *start* of a statement in Lua: it must necessarily be in the middle of the statement, before an expression (it can also follow a "," separator, which probably makes sense only if something other than the end of the comma-separated list comes after it), or right at the end of the statement.



On Sat, 8 June 2019 at 05:44, Philippe Verdy <[hidden email]> wrote:
Yes, but Java requires the ";" terminator, so there's no ambiguity when parsing, even if the annotation precedes all the rest of the statement.

In Lua, without a required ";", parsing will be ambiguous if the annotation does not immediately follow a statement-initial keyword (local, function, for, return, if, then, else, do...), or a "(" or "[" or "{".


On Fri, 7 June 2019 at 18:07, Dibyendu Majumdar <[hidden email]> wrote:
> There is a big difference between all those syntaxes
[snip]
> They are all prefixed to the whole item to which they
> apply. Following their syntax, we should write '@toclose local x = 1',
> instead of 'local @toclose x = 1'.
>

In Java the annotation precedes the type in a declaration; it being
classed as a type modifier in the grammar (this is one of the uses).
Lua of course doesn't have type declarations therefore annotations
cannot be placed in the same way.

Regards

Re: future of annotations in Lua?

Philippe Verdy
The current Lua syntax is also inconsistent in another respect:

The "expression statements" in Lua currently allow only expressions that are simple function calls (of the form "a(b...)", or "a b..." if curried), and forbid expressions using unary or binary operators, like "a + b" or "(a + b)" or "a[b]", as valid statements, even though they are intrinsically function calls, to "__add(a, b)" and "__index(a, b)" in these examples.

And I see no fundamental problem in extending "expression statements" in Lua to any valid Lua expression (even if it has no effect, such as a simple number or string, or "-1" with the unary operator, or "1+2" with a binary operator). It does not increase the syntax complexity at all; it just simplifies it, with no additional cost and without changing any existing semantics or runtime and debugging behavior.
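
For reference, a minimal sketch of what the current rule looks like from the outside, assuming stock Lua: a bare operator expression is rejected at compile time, a call is accepted, and the operator form is accepted once its value is consumed. The names v, hits and the __add handler are hypothetical.

```lua
local compile = loadstring or load

-- A bare operator expression is not a statement in current Lua...
print(compile("a + b") ~= nil)      -- false
-- ...but a function call is.
print(compile("f(a, b)") ~= nil)    -- true

-- The operator still dispatches to a metamethod, so the usual
-- workaround is to consume the value in a throwaway local.
local hits = 0
local v = setmetatable({}, {
  __add = function(x, y) hits = hits + 1; return x end,
})
local _ = v + v
print(hits)                         -- 1
```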


On Sat, 8 June 2019 at 07:35, Philippe Verdy <[hidden email]> wrote:
[snip]

Re: future of annotations in Lua?

Egor Skriptunoff-2
In reply to this post by Lorenzo Donati-3
On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:

The more I think about it, the more I find the syntax with "@" more
readable and more easily "expandable": parametrized annotations anyone?
Like for example:

local @const myTable = @table(64) {} -- preallocates 64 elements in the array part

Please note that the Lua lexer allows whitespace between any two lexemes.
For example, the following is allowed in Lua syntax:
   ::
      labelname
   ::
   x = math
   .
   pi
   goto labelname
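
That the lexer really does ignore such line breaks can be checked without running the goto loop; a minimal sketch, compiling only the field access from the example above (the local name x and the trailing return are added for the check).

```lua
local compile = loadstring or load

-- The "." and the field name sit on their own lines, as in the example.
local chunk = assert(compile("local x = math\n.\npi\nreturn x"))
print(chunk() == math.pi)   -- true
```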

My question is:
if an attribute name and an opening parenthesis are distinct lexemes
(and there could be a space or a newline in between),
then what is the purely syntactic way to determine where an attribute ends?

Your example:
   local myTable = @table(64) {}

Variant1:
   "@table(64)" is the attribute
   "{}" is the expression

Variant2:
   "@table" is the attribute
   "(64){}" is the expression: you are invoking the number 64 and passing an empty table as the argument :-)

It seems that for your example to work, the Lua parser would have to know every attribute in order to determine whether it expects parameters.
It would be more correct if Lua could parse known and unknown attributes in the same way, without knowing their semantics.
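
Variant 2 is not far-fetched: in stock Lua a parenthesized expression followed by a table constructor is already a syntactically valid call, and it only fails when executed. A minimal sketch (the variable names are hypothetical):

```lua
local compile = loadstring or load

-- "(64){}" parses as a call of the value 64 with one table argument.
local chunk = compile("return (64){}")
print(chunk ~= nil)          -- true

-- The failure only shows up at run time, when 64 is actually called.
local ok, run_err = pcall(chunk)
print(ok)                    -- false
print(run_err)               -- the runtime error message
```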
It's funny, but angle-bracket-based syntax works here without ambiguity problems:
   local myTable = <table(64)> {}

So neither the <attribute> syntax nor the @attribute syntax is "suitable".
But an example of a "suitable" syntax is easy to construct: @[table 64] or @(table 64).
They are "equal", but I prefer the latter.
To make parameterless attributes look nicer, two syntactic forms may be allowed:
   @attributename              -- for attributes without arguments
   @(attributename arguments)  -- for attributes with arguments
The "attributename" is a simple or dot-separated compound identifier (such as "Torch.Tensor").
The "arguments" is a comma-separated list of Lua expressions, similar to the argument list of a function call.
For example, to preallocate a table with 64 array slots and 8 hash slots:
   local @const myTable = @(table 64, 8) {}
The @const attribute might also be written as @(const) if you wish.

Such syntax guarantees unambiguous parsing everywhere: in local definition, in function definition, inside any expression, etc.
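
To make the two forms concrete, here is a minimal sketch of a recognizer written with Lua patterns. It is only an illustration of the proposal, not a patch to any real parser: parse_attribute is a hypothetical name, the arguments are returned as raw text rather than parsed expressions, and %b() does not account for parentheses inside string literals.

```lua
-- Hypothetical recognizer for "@name" and "@(name args)".
-- Returns a table { name, args } and the position just past the
-- attribute, or nil if no attribute starts at pos.
local function parse_attribute(src, pos)
  pos = pos or 1
  -- Parenthesized form: the balanced group delimits the attribute.
  local group = src:match("^@(%b())", pos)
  if group then
    local inner = group:sub(2, -2)
    local name, args = inner:match("^%s*([%a_][%w_%.]*)%s*(.-)%s*$")
    if not name then return nil end
    return { name = name, args = args }, pos + 1 + #group
  end
  -- Bare form: the attribute ends with the (possibly dotted) name.
  local name = src:match("^@([%a_][%w_%.]*)", pos)
  if name then
    return { name = name, args = "" }, pos + 1 + #name
  end
  return nil
end

local attr, nxt = parse_attribute("@(table 64, 8) {}")
print(attr.name, attr.args, nxt)

-- With the bare form the attribute stops before "(", so "(64) {}" is
-- left over as the expression, which is exactly Variant2.
local bare, rest = parse_attribute("@table(64) {}")
print(bare.name, ("@table(64) {}"):sub(rest))
```

The point of the parenthesized form is visible in the code: the end of the attribute is found without knowing anything about the attribute's name.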


P.S.
Why are @[table 64] and @(table 64) "suitable", but @<table 64> not?
Because of incompatibility with Lua expression syntax:
   @(table a>b and a or b)  is OK
   @<table a>b and a or b>  is not OK


P.P.S.
Prefix-ish and postfix-ish approaches are just mirrors of each other.
Neither is better than the other.

   local @Torch.Tensor z = @Torch.Tensor x + @Torch.Tensor y
   (@Torch.Tensor z):abs()
   local @integer c = (@deterministic add)(@integer a, @integer b)
   local @const myTable = @(table 64, 8) {}

   local z :Torch.Tensor = x :Torch.Tensor + y :Torch.Tensor
   (z :Torch.Tensor):abs()
   local c :integer = (add :deterministic)(a :integer, b :integer)
   local myTable :const = {} :(table 64, 8)

The only difference between these two approaches is the occurrence of the sad smile in the last line.
Lua syntax should not contain any sadness :-)

Re: future of annotations in Lua?

Philippe Verdy


On Sat, 8 June 2019 at 07:56, Egor Skriptunoff <[hidden email]> wrote:
[snip]

You are repeating the remark I made when I replied to Lorenzo Donati about why it is ambiguous (and which I developed later).

I also summarized it, but I can repeat my earlier analysis:

Any annotation in Lua can ONLY FOLLOW another token that:
- marks the start of a simple statement (like "local", "return", or even ";" for the empty statement), or
- marks the start of a composite syntactic unit (like "(", "[", "{", or "do"), or
- marks the end of a composite syntactic unit (like ")", "]", "}", or "end").

So it cannot occur in the middle of an expression (except possibly immediately after "(", "[", "{" or ")", "]", "}", etc., but this depends on the permitted choice for the first token of annotations).

The first token used by the annotation
- MUST NOT be a valid unary operator (like "-", "#" or "not"),
- MUST NOT be a number constant or string constant,
- but it MAY be ANY other existing token.

That first token MAY then ALSO unambiguously be:
- a binary operator (like "*", "..", "//", "or", "and", "<", "=", etc.) or ","
- or another reserved keyword used in compound statements ("end", "if", "then", "else", "do", "while", "repeat", "until", "for", "return", "break", "local", "function", etc.)
- or a currently unused token like "@"
- or even possibly ";" (but I would not allow it, as it would be error-prone with source code that is possibly partially commented out)

provided that:
- the annotation is ENTIRELY surrounded by "(...)" or "[...]" or "{...}", which would be required (except after some tokens like "local" which mark an explicit start of a new statement that can be annotated)
- or the first token is a currently undefined one like "@" (where the previous surrounding is not always needed)
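
The MUST NOT part of the first-token rule can be written down as a small predicate. This is only a sketch of the rule as stated in this message, not of any actual Lua parser: tokens are assumed to arrive as already-lexed strings, the surrounding rule is not checked, and can_start_annotation is a hypothetical name.

```lua
-- Lua's unary operators ("~" is unary only from Lua 5.3 on).
local UNARY = { ["-"] = true, ["not"] = true, ["#"] = true, ["~"] = true }

-- Hypothetical predicate: may this token be the first token of an annotation?
local function can_start_annotation(tok)
  if UNARY[tok] then return false end                -- unary operator
  if tok:find("^%.?%d") then return false end        -- numeric constant
  if tok:find("^[\"']") or tok:find("^%[=*%[") then  -- string constant
    return false
  end
  return true                                        -- any other token
end

for _, tok in ipairs({ "@", "*", "<", "local", "-", "not", "42", "'here'" }) do
  print(tok, can_start_annotation(tok))
end
```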

So we have a large choice for defining them unambiguously and generalizing them!

I still think that "@" is the best choice for the first token. But this does not invalidate the choice of "*" or "<", provided the surrounding rule is used.
And the proposed "<annotation>" syntax is the worst choice if we need to surround it with additional parentheses to avoid ambiguous shift/reduce conflicts in some places, resulting in the horrible "(<annotation>)"!



Re: future of annotations in Lua?

Philippe Verdy
Also, given the large choice available for the first significant token of the annotation, we can say that:

- if the first token is "*" or "@", it is reserved for the standard specification of Lua, including future ones (so what follows that "*" or "@" must obey these specifications)

- all other usable tokens are for extensions that can be entirely and easily ignored by a conforming parser that does not recognize them; they should not invalidate the interpretation of the rest of the syntax. But Lua may add a constraint for them, requiring these annotations to use the "surrounding rule" (with parentheses).



On Sat, 8 June 2019 at 08:32, Philippe Verdy <[hidden email]> wrote:


[snip]



Re: future of annotations in Lua?

Philippe Verdy
Another useful case for annotations is to allow compilers to perform some optimizations that are otherwise unsafe.

Just look at this expression:
  f()+f(),
it contains two syntactically equivalent subexpressions f(). But the compiler cannot cache the result so as to call the function only once and keep the result in an internally generated caching variable, so it cannot act as if we had written:
  local x=f()
  x+x
Likewise, we cannot declare variables (like x here) in the middle of expressions. But we can annotate the subexpression (f()) as "constant" and free of side effects:
  *const (f()) + *const (f())
Here the annotation is just written "*const"; it does not require surrounding parentheses because "*" is not a valid unary operator in expressions. The parentheses around f() are needed, however, because otherwise the annotation would apply only to "f" and not to f(). We could avoid these extra parentheses if annotations in expressions behaved like right-associative unary operators, in which case it becomes:
  *const f() + *const f()

Now the compiler can itself generate the internal temporary variable (let's name it "x", even if the name is not visible in the lexical scope), call f() only once and store its result, which is then used in the expression
  x + x

That same expression now contains two fetches of the same variable, but the compiler still does not know the type of that variable, so it cannot optimize further with a multiplication by a constant, as if it were
  x * 2
in which case it would not even need the temporary variable, and the result would be equivalent to
  f() * 2

To do that, we need a second annotation declaring that the value of the function is a number:
  *const *number f() + *const *number f()
The compiler now detects two occurrences of the same subexpression "*number f()", both of them cacheable because of "*const". So it first infers, as if we had declared:
  local *const x = *number f()
which also translates to
  local *const *number x = f()
and then "sees" the expression
  x + x
which it can now safely optimize (because it knows that "x" is a number in both operands of the "+" operation) to
  x * 2
and then, because the temporary variable "x" is now used only once, it can eliminate it by substitution and generate the same thing as if we had written
  f() * 2

Annotations can thus be very useful for providing hints to the compiler, notably wherever it cannot safely perform type inference. If needed, the compiler will insert type-checking assertions at run time (throwing an error if the assertion fails; here, for example, if f() did not return a number).

If these hints are not recognized by the engine, the evaluation of f()+f() is unchanged: the function will be called twice, possibly returning two different values that are possibly not numbers, and a run-time check will decide how to compute the addition (within the intrinsic __add(x,y) function call).
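
Both hazards discussed above can be reproduced in plain Lua. A minimal sketch, with hypothetical names (f as a counter with a side effect, V as a value that supports "+" but not "*"):

```lua
-- 1. Caching is unsafe when f has side effects.
local n = 0
local function f()
  n = n + 1
  return n
end

local twice = f() + f()   -- two calls: 1 + 2
local x = f()             -- third call returns 3
local cached = x + x      -- one call, result used twice: 3 + 3
print(twice, cached)      -- 3   6

-- 2. Rewriting x + x as x * 2 is unsafe when x is not known to be a number.
local V = setmetatable({}, {
  __add = function(a, b) return "added" end,   -- no __mul defined
})
local ok_add = pcall(function() return V + V end)
local ok_mul = pcall(function() return V * 2 end)
print(ok_add, ok_mul)     -- true   false
```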


On Sat, 8 June 2019 at 08:44, Philippe Verdy <[hidden email]> wrote:
[snip]




Re: future of annotations in Lua?

Philippe Verdy
Another example:
  while x < y do something(x) end

Nothing indicates to the compiler that the subexpression "y" never changes when calling "something(x)", so the compiler has to re-evaluate it on each iteration.

Adding an annotation makes it clear:
  while x < *const y do something(x) end

The compiler will generate code as if we had written:
  local *const tmpy = y
  while x < tmpy do something(x) end
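
For comparison, here is a minimal sketch in today's Lua (no annotation syntax) of what that hoisting changes. The function "limit" is a hypothetical stand-in for the subexpression "y", and the call counter exists only for illustration:

```lua
-- Plain Lua: the loop condition is re-evaluated before every iteration,
-- so limit() is called again each time.
local calls = 0
local function limit()        -- hypothetical stand-in for the subexpression "y"
  calls = calls + 1
  return 3
end

local x = 0
while x < limit() do x = x + 1 end
local calls_in_loop = calls   -- three passing tests plus the final failing one

-- Manual hoisting: roughly what "*const y" would allow a compiler to do.
calls = 0
local tmpy = limit()
x = 0
while x < tmpy do x = x + 1 end
local calls_hoisted = calls   -- a single evaluation

print(calls_in_loop, calls_hoisted)
```

Without the hoisting the bound is evaluated once per test of the condition; with it, exactly once.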

Note that "y" above may be any complex subexpression, possibly including function calls, submember accesses and so on, with side effects.

In this example we have still not annotated the variable "x" but we could do so as well:
  while *const x < *const y do something(x) end

which will then turn into a single evaluation of the expression "*const x < *const y": if it evaluates to false, no loop will be performed, otherwise it will be an infinite loop, as if we had written:
  local *const tmpx, *const tmpy = x, y
  if tmpx < tmpy then
    while true do something(tmpx) end
  end

But the compiler will also note that tmpy is used only once (in the internally generated "if"), so it will drop the tmpy variable (by substitution) to compile the equivalent of:
  local *const tmpx = x
  if tmpx < y then
    while true do something(tmpx) end
  end


On Sat, 8 Jun 2019 at 09:59, Philippe Verdy <[hidden email]> wrote:
Another useful case for annotations is to allow compilers to perform some optimizations that are otherwise unsafe.

Just look at this expression:
  f()+f()
It contains two syntactically equivalent subexpressions f(). But the compiler cannot cache the result to call the function only once and keep the result in an internally generated caching variable, so it cannot act as if we had written:
  local x=f()
  x+x
Also, we cannot declare variables (like x here) in the middle of expressions. But we can annotate the subexpression (f()) as "constant" and having no side effect:
  *const (f()) + *const (f())
Here the annotation is just written "*const"; it does not require surrounding parentheses because "*" is not a valid unary operator in expressions. But the surrounding parentheses around f() are needed, because otherwise the annotation would apply only to "f" and not to f(). We could avoid these extra parentheses if annotations in expressions behaved like unary operators and were right-associative, in which case it becomes:
  *const f() + *const f()

Now the compiler can generate the internal temporary variable (let's name it "x" even if the name is not visible in the lexical scope) itself for calling f() only once and store its result which is then used in the expression
  x + x

Now that same expression contains two fetches of the same variable, but the compiler still does not know the type of that variable, so it cannot optimize it further using a multiplication by a constant as if it were
  x * 2
and where it would not even need the temporary variable, so that it would be equivalent to
  f() * 2

To do that, we need a second annotation declaring that the value of the function is a number:
  *const *number f() + *const *number f()
The compiler now detects two occurrences of the same subexpression "*number f()", both of them cacheable because of "*const". So it first infers as if we had declared:
  local *const x = *number f()
which also translates to
  local *const *number x = f()
and then "sees" the expression
  x + x
which it can now safely optimize (because it knows that "x" is a number in both operands of the "+" operation) to
  x * 2
and then, because the temporary variable "x" is now used only once, it can eliminate it by substitution and generate the same thing as if we had written
  f() * 2

Annotations can then be very useful for providing hints to the compiler, notably everywhere it cannot safely perform type inference. If needed, the compiler will insert type-checking assertion code at run time (throwing an error if the assertion fails; here, for example, if f() did not return a number).

If these hints are not recognized by the engine, then the evaluation of f()+f() will be unchanged, the function will be called twice, possibly returning two different values and possibly not numbers, and a runtime check will see how to compute the addition (within the intrinsic _add(x,y) function call).
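
To make the "unsafe" part concrete, here is a small sketch in current Lua, assuming a hypothetical function f with a visible side effect. It shows why f()+f() cannot be silently rewritten as x+x without some promise from the programmer:

```lua
local n = 0
local function f()          -- hypothetical function with a side effect
  n = n + 1
  return n
end

local twice = f() + f()     -- two calls: 1 + 2
n = 0
local x = f()               -- one call, result cached by hand
local cached = x + x        -- 1 + 1

print(twice, cached)
```

The two results differ, so the caching is only valid when the function really behaves as the "*const" hint claims.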


On Sat, 8 Jun 2019 at 08:44, Philippe Verdy <[hidden email]> wrote:
Also, given the large possible choice for the first significant token of the annotation, we can say that:

- if the first token is "*" or "@", it is reserved for the standard specification of Lua, including future ones (so what follows that "*" or "@" must obey these specifications)

- all other usable tokens are for extensions that can be entirely and easily ignored by a conforming parser that does not recognize them; they should not invalidate the interpretation of the rest of the syntax. But Lua may add a constraint for them, requiring these annotations to use the "surrounding rule" (with parentheses).



On Sat, 8 Jun 2019 at 08:32, Philippe Verdy <[hidden email]> wrote:

On Sat, 8 Jun 2019 at 07:56, Egor Skriptunoff <[hidden email]> wrote:
On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:

The more I think about it, the more I find the syntax with "@" more
readable and more easily "expandable": parametrized annotations anyone?
Like for example:

local @const myTable = @table(64) {} -- preallocates 64 elements in the
array part




 
Please note that Lua lexer allows inserting a space between any two lexemes.
For example, the following is allowed in Lua syntax:
   ::
      labelname
   ::
   x = math
   .
   pi
   goto labelname

My question is:
if an attribute name and an opening parenthesis are distinct lexemes
(and there could be a space or a newline in between),
then what is the pure syntactical way to determine where an attribute is terminated?

Your example:
   local myTable = @table(64) {}

Variant1:
   "@table(64)" is the attribute
   "{}" is the expression

Variant2:
   "@table" is the attribute
   "(64){}" is the expression: you are invoking number 64 and pass empty table as argument :-)

You are repeating the remark I made when I replied to Lorenzo Donati about why it is ambiguous (and which I developed later).

And I also summarized it, but I can repeat my former analysis:

Any annotation in Lua can ONLY FOLLOW another token that:
- marks the start of a simple statement (like "local", "return", or even ";" for the empty statement), or
- marks the start of a composite syntactic unit (like "(", "[", "{", or "do"), or
- marks the end of a composite syntactic unit (like ")", "]", "}", or "end").

So it cannot occur in the middle of an expression (except possibly immediately after "(", "[", "{" or ")", "]", "}", etc., but this depends on the permitted choice for the first token of annotations).

The first token used by the annotation
- MUST NOT be a valid unary operator (like "-", "#" or "not"),
- MUST NOT be a number constant or string constant,
- but it MAY be ANY other existing token.

That first token MAY then ALSO unambiguously be:
- a binary operator (like "*", "..", "//", "or", "and", "<", "=", etc.) or ",",
- or another reserved keyword used in compound statements ("end", "if", "then", "else", "do", "while", "repeat", "until", "for", "return", "break", "local", "function", etc.),
- or a currently unused token like "@",
- or even possibly ";" (but I would not allow it, as it would be error-prone with source code that is possibly partially commented out),

provided that:
- the annotation is ENTIRELY surrounded by "(...)" or "[...]" or "{...}", which would be required (except after some tokens like "local" which mark an explicit start of a new statement that can be annotated),
- or the first token is a currently undefined one like "@" (where the previous surrounding is not always needed).

So we have a large choice for defining them unambiguously and generalizing them!

I still think that choosing "@" for the first token is the best choice. But this does not invalidate the choice of "*" or "<", provided the surrounding rule is used.
And the proposed syntax "<annotation>" is the worst choice if we need to surround it with additional parentheses to avoid ambiguous shift-reduce conflicts in some places, resulting in the horrible "(<annotation>)"!




Re: future of annotations in Lua?

Lorenzo Donati-3
In reply to this post by Egor Skriptunoff-2
On 08/06/2019 07:55, Egor Skriptunoff wrote:

> On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:
>
>>
>> The more I think about it, the more I find the syntax with "@"
>> more readable and more easily "expandable": parametrized
>> annotations anyone? Like for example:
>>
>> local @const myTable = @table(64) {} -- preallocates 64 elements in
>> the array part
>>
>>
>>
>
>
> Please note that Lua lexer allows inserting a space between any two
> lexemes. For example, the following is allowed in Lua syntax:
>    ::
>       labelname
>    ::
>    x = math
>    .
>    pi
>    goto labelname
>
> My question is: if an attribute name and an opening parenthesis are
> distinct lexemes (and there could be a space or a newline in
> between), then what is the pure syntactical way to determine where an
> attribute is terminated?
>
> Your example:
>    local myTable = @table(64) {}
>
> Variant1:
>    "@table(64)" is the attribute
>    "{}" is the expression
>
> Variant2:
>    "@table" is the attribute
>    "(64){}" is the expression: you are invoking number 64 and pass
>    empty table as argument :-)
>
> It seems that for your example to work Lua parser should know every
> attribute to determine whether it expects parameters or not. It would
> be more correct if Lua could parse both known and unknown attributes
> in the same way without knowing their semantics. It's funny, but
> angle-bracket-based syntax works here without ambiguity problems:
>    local myTable = <table(64)> {}
>
> So, both <attribute> syntax and @attribute syntax are not
> "suitable". But an example of a "suitable" syntax could be easily
> constructed: @[table 64] or @(table 64). They are both "equal", but I
> prefer the last one. To make parameterless attributes look nicer, two
> syntactical forms may be allowed:
>    @attributename              -- for attributes without arguments
>    @(attributename arguments)  -- for attributes with arguments
> The "attributename" is simple or dot-separated compound identifier
> (such as "Torch.Tensor"). The "arguments" is a comma-separated list of
> Lua expressions, similar to a list of arguments for a function's
> invocation. For example, to preallocate a table with 64 array slots
> and 8 hash slots:
>    local @const myTable = @(table 64, 8) {}
> The @const attribute might also be written as @(const) if you wish.
>
> Such syntax guarantees unambiguous parsing everywhere: in local
> definition, in function definition, inside any expression, etc.
>
>
> P.S. Why @[table 64] and @(table 64) are "suitable", but @<table 64>
> is not? Because of incompatibility with Lua expression syntax:
>    @(table a>b and a or b)  is OK
>    @<table a>b and a or b>  is not OK
>
>
> P.P.S. Prefix-ish and postfix-ish approaches are just a mirror of
> each other. Neither is better than another.
>
>    local @Torch.Tensor z = @Torch.Tensor x + @Torch.Tensor y
>    (@Torch.Tensor z):abs()
>    local @integer c = (@deterministic add)(@integer a, @integer b)
>    local @const myTable = @(table 64, 8) {}
>
>    local z :Torch.Tensor = x :Torch.Tensor + y :Torch.Tensor
>    (z :Torch.Tensor):abs()
>    local c :integer = (add :deterministic)(a :integer, b :integer)
>    local myTable :const = {} :(table 64, 8)
>


OOF! Nice points. Thanks!

Anyway, I didn't think carefully enough about other possible sources of
ambiguity when I "spat out" my suggestion, so it's nice you fleshed them
out so thoroughly.

Your @(attribute parameters) syntax looks quite good, IMO, and gets my
point across: it is a possible future enhancement and the "@" makes it
much less prone to confusion with other language constructs.

> The only difference between these two approaches is the occurrence of
> the sad smile in the last line. Lua syntax should not contain any
> sadness :-)
>

I fully agree!!! :-)


Re: future of annotations in Lua?

Philippe Verdy
In reply to this post by Philippe Verdy
Note also that the compiler may still insert a runtime check in the generated code to verify that the declared "*const" annotations actually hold before looping again:
  local *const tmpx = x
  if tmpx < y then
    while true do
      something(tmpx)
      assert(x == tmpx)
    end
  end


On Sat, 8 Jun 2019 at 10:13, Philippe Verdy <[hidden email]> wrote:
[snip]




Re: future of annotations in Lua?

Lorenzo Donati-3
In reply to this post by Philippe Verdy
Sorry for top posting, but...

...could you please abstain from answering your own messages? This
completely disrupts the thread flow and makes it extremely difficult to
follow the different sub-threads. Especially when it is done repeatedly
and "recursively".

Moreover, it is very bad netiquette and it's a technique that should be
used only in "emergency circumstances" when one absolutely MUST
correct/add something to one's own post before anyone else has the chance
of answering.

Thank you.

-- Lorenzo



On 08/06/2019 05:44, Philippe Verdy wrote:

> Yes but Java requires the ";" terminator, so there's no ambiguity when
> parsing, even if the annotation precedes all the rest of the statement.
>
> In Lua, without the required ";" there will be an ambiguity of parsing if the
> annotation does not immediately follow a statement-initial keyword (local,
> function, for, return, if, then, else, do...), or a "(" or "[" or "{".
>
>
> On Fri, 7 Jun 2019 at 18:07, Dibyendu Majumdar <[hidden email]> wrote:
>
>>> There is a big difference between all those syntaxes
>> [snip]
>>> They are all prefixed to the whole item to which they
>>> apply. Following their syntax, we should write '@toclose local x = 1',
>>> instead of 'local @toclose x = 1'.
>>>
>>
>> In Java the annotation precedes the type in a declaration; it being
>> classed as a type modifier in the grammar (this is one of the uses).
>> Lua of course doesn't have type declarations therefore annotations
>> cannot be placed in the same way.
>>
>> Regards
>>
>>
>




Re: future of annotations in Lua?

Philippe Verdy
In reply to this post by Egor Skriptunoff-2
Your proposed use of ":" is ambiguous as well, because it is fundamentally a binary operator:

- it creates a "functor"-type object from a pair of objects (the first one being the object passed as the first parameter of a function call, the second one being the identifier of one of its members), and that functor is then followed by one additional parameter (a single unary expression) or by zero or more additional parameters (in a list of arbitrary expressions)

As such, your syntax:

{} :table

is ambiguous if the annotation is not surrounded by parentheses (including the first token ":"):
{} (:table)
{} (:table 64, 68)



On Sat, 8 Jun 2019 at 07:56, Egor Skriptunoff <[hidden email]> wrote:
On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:

The more I think about it, the more I find the syntax with "@" more
readable and more easily "expandable": parametrized annotations anyone?
Like for example:

local @const myTable = @table(64) {} -- preallocates 64 elements in the
array part




 
Please note that Lua lexer allows inserting a space between any two lexemes.
For example, the following is allowed in Lua syntax:
   ::
      labelname
   ::
   x = math
   .
   pi
   goto labelname

My question is:
if an attribute name and an opening parenthesis are distinct lexemes
(and there could be a space or a newline in between),
then what is the pure syntactical way to determine where an attribute is terminated?

Your example:
   local myTable = @table(64) {}

Variant1:
   "@table(64)" is the attribute
   "{}" is the expression

Variant2:
   "@table" is the attribute
   "(64){}" is the expression: you are invoking number 64 and pass empty table as argument :-)
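
The second reading is not far-fetched, because current Lua already accepts a call followed directly by a table constructor. A minimal sketch, where "table_attr" is a hypothetical ordinary function standing in for "@table" and "size_hint" is an invented field:

```lua
-- table_attr is a hypothetical ordinary function, standing in for "@table".
local function table_attr(size)
  return function(t)
    t.size_hint = size        -- invented field, only to make the call visible
    return t
  end
end

-- Parsed as (table_attr(64))({}): a call whose result is called with {}.
local myTable = table_attr(64) {}
print(myTable.size_hint)

-- "(64){}" is also syntactically valid; it only fails when executed,
-- because a number cannot be called.
local ok = pcall(function() return (64){} end)
print(ok)
```

So a parser that sees "(64) {}" after an unknown attribute name has no purely syntactic reason to prefer one reading over the other.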

It seems that for your example to work Lua parser should know every attribute to determine whether it expects parameters or not.
It would be more correct if Lua could parse both known and unknown attributes in the same way without knowing their semantics.
It's funny, but angle-bracket-based syntax works here without ambiguity problems:
   local myTable = <table(64)> {}

So, both <attribute> syntax and @attribute syntax are not "suitable".
But an example of a "suitable" syntax could be easily constructed: @[table 64] or @(table 64).
They are both "equal", but I prefer the last one.
To make parameterless attributes look nicer, two syntactical forms may be allowed:
   @attributename              -- for attributes without arguments
   @(attributename arguments)  -- for attributes with arguments
The "attributename" is simple or dot-separated compound identifier (such as "Torch.Tensor")
The "arguments" is a comma-separated list of Lua expressions, similar to a list of arguments for a function's invocation.
For example, to preallocate a table with 64 array slots and 8 hash slots:
   local @const myTable = @(table 64, 8) {}
The @const attribute might also be written as @(const) if you wish.

Such syntax guarantees unambiguous parsing everywhere: in local definition, in function definition, inside any expression, etc.


P.S.
Why @[table 64] and @(table 64) are "suitable", but @<table 64> is not?
Because of incompatibility with Lua expression syntax:
   @(table a>b and a or b)  is OK
   @<table a>b and a or b>  is not OK


P.P.S.
Prefix-ish and postfix-ish approaches are just a mirror of each other.
Neither is better than another.

   local @Torch.Tensor z = @Torch.Tensor x + @Torch.Tensor y
   (@Torch.Tensor z):abs()
   local @integer c = (@deterministic add)(@integer a, @integer b)
   local @const myTable = @(table 64, 8) {}

   local z :Torch.Tensor = x :Torch.Tensor + y :Torch.Tensor
   (z :Torch.Tensor):abs()
   local c :integer = (add :deterministic)(a :integer, b :integer)
   local myTable :const = {} :(table 64, 8)

The only difference between these two approaches is the occurrence of the sad smile in the last line.
Lua syntax should not contain any sadness :-)

Re: future of annotations in Lua?

Philippe Verdy
In reply to this post by Lorenzo Donati-3
On Sat, 8 Jun 2019 at 10:25, Lorenzo Donati <[hidden email]> wrote:
Sorry for top posting, but...

...could you please abstain from answering your own messages. This
completely disrupts a thread flow and makes it extremely difficult to
follow the different sub-threads. Especially when it is done repeatedly
and "recursively".

But it was threaded correctly, subtopic by subtopic; you may reply to independent ones, as they are all attached to the initial thread in a single group.

Re: future of annotations in Lua?

Philippe Verdy
In reply to this post by Lorenzo Donati-3
On Sat, 8 Jun 2019 at 10:18, Lorenzo Donati <[hidden email]> wrote:
On 08/06/2019 07:55, Egor Skriptunoff wrote:
> On Fri, Jun 7, 2019 at 12:35 AM Lorenzo Donati wrote:
> P.P.S. Prefix-ish and postfix-ish approaches are just a mirror of
> each other. Neither is better than another.

OOF! Nice points. Thanks!

That's false: the prefix-ish and postfix-ish approaches are not equivalent. Notably, the postfix-ish style does not allow annotating unary expressions:
   -x(:annotation)
unless you use extra parentheses:
   -(x(:annotation))
or
   (-x)(:annotation)

But it does work with the prefix-ish style, which behaves like other unary operators (with right associativity). This is the style currently proposed with "local *toclose x", except that the leading token is a "*" instead of a ":", which does not change the syntactic problem, as both tokens are binary operators:
   -(:annotation)x
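
The asymmetry can be checked against existing Lua suffixes. The sketch below assumes that a postfix annotation would bind like today's call and index suffixes, which is only my reading of the proposal; "f" and "t" are hypothetical names:

```lua
local function f() return 5 end
local t = { v = 3 }

local a = -f()      -- parsed as -(f()), not (-f)()
local b = -t.v      -- parsed as -(t.v), not (-t).v
local c = -2 ^ 2    -- parsed as -(2 ^ 2): "^" also binds tighter than unary minus

print(a, b, c)
```

In every case the suffix attaches to the operand before the unary minus is applied, which is why a postfix annotation after "-x" would annotate "x" rather than "-x".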


Re: future of annotations in Lua?

Lorenzo Donati-3
In reply to this post by Philippe Verdy
On 08/06/2019 10:29, Philippe Verdy wrote:

> On Sat, 8 Jun 2019 at 10:25, Lorenzo Donati <[hidden email]> wrote:
>
>> Sorry for top posting, but...
>>
>> ...could you please abstain from answering your own messages. This
>> completely disrupts a thread flow and makes it extremely difficult to
>> follow the different sub-threads. Especially when it is done repeatedly
>> and "recursively".
>>
>
> But it was threaded correctly subtopic by subtopic, you may reply to
> independent ones they are all attached to the initial thread in a single
> group.
>

Sorry, but this is not the point. And this is not how people use mailing
lists. From what you say it seems you are trying to enforce "your way"
of organizing a thread.

We could go on forever discussing whether or not it is a good idea, but
the fact is that mailing lists are old, pre-web things and the current
de-facto "standard" is not what you propose or use.

The fact that people still /want/ to use this "prehistorical" <grin>
medium, and not a new-fangled blog/forum/chat/webwhatever thingy, means
that the "standard way" to use it is THE "right way".

Of course there are some limited community-related ML netiquette
variations, but the core "rules" are the same. And this is why (IMO) MLs
are still used. Once you know how to use one, you know how to use any
one (just lurk a bit after registration and read some of the archives to
get the feel of the community and you are ready to go!).

The "standard" way to do what you say is: quote the relevant part of the
original message, possibly snipping-out the rest if it gets in the way,
then reply to the part you are interested in.

If you keep on doing what you do, people might filter-out your name in
their mail client to avoid clutter, and you'll end-up being listened-to
less rather than more (and the web archive of the thread will look like
a mess anyway).

If you find it difficult to put everything in a single message that's
probably a symptom of one of these "problems":

1. Your message is definitely too long for a mailing list format (too
much inline code? put it in an attachment file, if it makes sense).

2. You are rambling too much. Stick to the point in the subject and open
a new thread to discuss other ideas. You can say you are opening a new
thread and people could read it if they are interested. You can't post a
link, like in a forum, but again, MLs don't work like a forum.

3. You didn't think well about what you are trying to say and your
messages are not part of a real answer/question dialogue (as it is
supposed to be on a mailing list), but are more a "flow of
consciousness" kind of thing. In other words, you are using a mailing
list more like you would use a blog, but that's not the way MLs are
supposed to work.

4. Your message is not really a message. MLs were conceived as a way
for a /group/ of people to collaborate and exchange information and
opinions in a quick, easy TEXTUAL way, in which everyone could (ideally)
follow what /everyone else/ on the thread is saying.
They are not a place for posting essays.


In other words, keep in mind what a mailing list is NOT (e.g. not a
forum, not a chat, not a blog, not a WIKI, not a BBS, not an online
magazine, etc.)





Re: future of annotations in Lua?

Philippe Verdy

On Sat, 8 Jun 2019 at 11:09, Lorenzo Donati <[hidden email]> wrote:
1. Your message is definitely too long for a mailing list format (too
much inline code? put it in an attachment file, if it makes sense).

It was not too long 

2. You are rambling too much. Stick to the point in the subject and open
a new thread to discuss other ideas.

I was not "rambling" at all, and up to the point of the topic "future of annotations".

It was all about parsing ambiguities (the problems they already cause in Lua, their possible cost/complexity in implementations, and the possible caveats/errors for programmers trying to use them).

We've got various proposals already, we need to evaluate them in the context of "future of annotations", and that's what I did.

For now, the "*" notation, used in a prefix-like style, does not cause major problems in its current limited scope of use (even if it looks C-ish and suggests "pointer" types, which it does not denote in Lua). It is confined to "local" variable declarations, where it is still safe and needs no extra parentheses to delimit its scope of application. That scope is, however, not precisely defined: does the annotation extend to all variables declared in a comma-separated list, as an annotation of "local", or does it annotate variable by variable?

Several leading tokens have already been discussed: "*", "<" (in a pair), ":", "@". I compared them and looked at those that cause problems, which are notably those that can also be used syntactically as binary operators ("*", "/", "+", "-", "^", "div", ":", "<", etc.) or separators (",", ";", "(", ")", etc.).

But this is not dramatic (we may still use "*") if we can surround annotations, including their leading token, with parentheses when needed. This avoids the ambiguity with currified function calls (calls written without parentheses), which potentially occurs in "sin -30", where without these parentheses the unary operator "-" 'unexpectedly' becomes a binary subtraction.
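
The "sin -30" case can be checked directly. The following is only a minimal sketch: `sin` is a hypothetical stand-in for any function value (it is not `math.sin`), and the exact wording of the error message differs between Lua versions.

```lua
-- Hypothetical stand-in for any function value; it simply returns its argument.
local function sin(x) return x end

-- `sin -30` is parsed as the binary expression `sin - 30`, not as a call
-- with the argument -30, so evaluating it raises an arithmetic error.
local ok, err = pcall(function()
  return sin -30
end)

-- With explicit parentheses the intended call is unambiguous.
local ok2, value = pcall(function()
  return sin(-30)
end)

print(ok, err)     -- ok is false; err describes arithmetic on a function value
print(ok2, value)  -- ok2 is true; value is -30
```

Only a string literal or a table constructor may follow a function name without parentheses in current Lua, which is why the "-" here can only be read as a binary operator.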

Also, the placement (prefix or postfix) of the annotation makes an important syntactic distinction that must be disambiguated, related to its scope of application (what it annotates). I tend to prefer the prefix placement, which behaves better with other unary operators, but it does not remove the existing 'problem' with currified function calls unless the annotation cannot be confused with a binary operator.

For now, only "@" does not have any one of these problems (when used as a prefixed annotation) simply because it cannot be used also as a binary operator (for now, but most probably never...).

Re: future of annotations in Lua?

szbnwer@gmail.com
hi there! :)

Philippe:

[not so important part]

> Note also that Javascript has similar issues, but it does not require ";" only because it differenciates whitespaces (those that include a newline not part of a comment from other whitespaces); you cannot safely remove all newlines in Javascript and merge all whitespaces to a single space, without adding a few ";" where needed. So Javascript's parser defines the newline token!

thats even more complicated than that. if u write
```
x=
3
```
then it will work, cuz there is an unfinished expression (and its
maybe even more complicated than this ... once ive seen all the rules,
it was a bit twisted...)
(off topic, but not, cuz parsing is related to the new syntax.)

[not so important part]

> The "expression statements" in Lua currently only allow expressions that are simple function calls (with the form "a(b...)" or "a b..." if it is currified), but it currently forbids expressions using unary or binary operators, like "a + b" or "(a + b)" or "a[b]", as valid statements, even if they are intrinsicly a function call, to "__add(a, b)" and "__getindex(a, b)" in these examples.

> And I see no fundamental problem in extending "expressions statements" in Lua to any valid Lua expression (even if this has no effect such as using a simple number or string, or "-1" with the unary operator, or "1+2" with a binary operator). It does not change the syntax complexity at all, but just simplifies it with no additional cost and without changing any existing semantics or runtime and debugging behavior.

thats kinda similar to one of my greatest wishes, that is writing `a
and b(c) or d(e)` (as a freestanding statement) instead of `local _=a
and b(c) or d(e)` that is really an ugly hack (i mean `local _=` :D )
(((the other greatest is `{}:x()`)))
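
A rough sketch of the situation described above, assuming a stock Lua interpreter. The names `a`, `b`, `d` and `calls` are placeholders chosen for illustration only.

```lua
-- Placeholder functions that record which branch ran.
local calls = {}
local function b(v) calls[#calls + 1] = "b(" .. v .. ")"; return true end
local function d(v) calls[#calls + 1] = "d(" .. v .. ")"; return true end
local a = false

-- The `local _ =` hack: the expression is evaluated only for its side effects.
local _ = a and b("c") or d("e")

-- The bare expression is not a valid statement today, so compiling it fails.
-- (`loadstring` exists in Lua 5.1; `load` accepts strings from 5.2 on.)
local compile = loadstring or load
local chunk, err = compile("a and b(c) or d(e)")

print(calls[1])    -- d(e)
print(chunk, err)  -- chunk is nil; err holds the syntax error message
```

One caveat of the idiom: when `a` is truthy but `b(c)` returns false or nil, `d(e)` runs as well, so it is not a full replacement for `if ... then ... else ... end`.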

the main reason ive got against both of these is the ambiguity in cases like
```
local a, b, c=0
x and y() or z()
```
where b will become something evil, but we already need the semicolons
here and there. otherwise i dare to believe that lua folks love the
`and`+`or` "hacks" and it would be nice to have these instead of
writing `if x then y() else z() end` in simple cases and i think the
alternative way is even faster to interpret mentally and helps against
global warming (never mind, im crazy) cuz saves some cpu cycles :D
(mayb not, but i just want to believe it like so. :D )

btw i think the best is to group locals only where it makes sense, like:
```
local a, b, c -- uninitialized, no harm can happen, cuz no assignment
returnRandomAmountOfVals()
local x, y, z=1, 2, 3 -- all 3 vars are taken, no accidents
returnRandomAmountOfVals()
```
and anyway we can already mess with assignments with any function calls...
(off topic, but not, cuz i think that the same reasoning applies against them)



Egor:
> @attributename              -- for attributes without arguments
> @(attributename arguments)  -- for attributes with arguments

+1, but a big ONE!
currently this is the most friendly for my eyes


Re: future of annotations in Lua?

szbnwer@gmail.com
sorry, i accidentally sent my msg unfinished while editing, so here is the end

(also, sorry for the length, however i marked the calmly skippable stuffs)

note, that the 2nd "[not so important part]" wanted to be "[/not so
important part]" in the previous message!



Egor:

> @attributename              -- for attributes without arguments
> @(attributename arguments)  -- for attributes with arguments

+1, but a big ONE!
currently this is the most friendly for my eyes; there is a simple and
an advanced form; its more harmonic with other languages; those huge
`@` signs are good indicators to not misinterpret the code, and also,
it doesnt overload existing stuffs to give space for confusion.

[not so important part]
i even thought that i hate the new lua when ive seen `*toclose`, but
up to this point, slowly im about to accept the new game... mostly cuz
its just a matter of organizing codes, as we lived fine without
this/these, and it just gives more twist to the codeflow; and cuz
i prefer minimalism and luajit, and it feels a bit like as a pointless
run for growing the abyss, that is really against the luaverse, where
luajit have its own rank and weight that cant be overlooked, but it
just wont help much to aid anything, but only a cornercase ive never
seen, and that will just overcomplicate our nice playground...

(there is no need to correct me about these or anything like, i think
it could take much words to start to talk about these, and i didnt
even intend to be perfectly legit/clean here, just somewhat
summarize my view/feelings/thoughts/whatever. also, plz dont hate me
for anything in this block, its nothing too serious...)
[/not so important part]



Lorenzo:

[clearly offtopic; and not even so much important!]

about mailing lists, threads, netiquette:

i think Philippe made a big and great effort, thought much about it,
ideas came after ideas, and didnt even say bulls*t or repeat himself
in empty phrases, or gone too far from the topic. otherwise netiquette
is clearly an offtopic, and it wasnt just 3-5 lines of text, but NO
offense!! i see ur good intention, and u werent even rude (and i dont
even really know how to hate anyone, im a hippie :D )

so i think there is no way to be consistent, when ppl start a new
thread related to some hot topics, ill read whichever thread have the
oldest reply and the whole will become messy as they are actually
still somewhat related. furthermore we are humans, we have a lot to
think/say, and its impossible to not come up here and there with
anything that have weaker or no relation with the actual topic...
things dont really have clear boundaries, more ppl will talk about the
same, more threads will be about the same, and different topics will
have their relations. we are trying to "serialize" endless graphs... i
think a search for "cat" on lua-l will actually yield something that
is not `cat example.lua`, but if not, then maybe i can give a shot for
"coffee"... :D

actually i have an idea (for real e-democracy) that can resolve all of
bad wordings, endless repetitions from different mouths, organization
and whatever current problems we have on any forums, but that requires
its own ecosystem and not suitable for the current systems, so we have
to live with issues and try our best til i make it done, but thats a
too long topic for now, even if not totally a secret, just in its
incubator state against illuminati and the like whoever would/could
wreck it with money/power/big team/opposite will/whatever before it
would become any much mature, but the good thing in it is that its in
lua and will be free - uhmm ... whenever ill reach that point... :D i
hope before the world will start burn in flames, but not before it
will be enough mature to not catalyze just the same (if anything, but
im optimistic :) )

my actual point of view about off topics netiquette and the like is
that i know that im an "alien", and i dont think that my random bits
would deserve their own topics and get more than a half reply, but i
only want it to be read by the right ppl (just think about it, look at
my messages, and ull see they wont fit neither that way :D ) so i only
try my best to be kind with everyone, to warn the audience about
anything like unimportant/offtopic/whatever stuffs, and to talk
rarely, while hoping i wont be hated by anyone, and wont get on those
blacklists... :D

otherwise suggestions are always welcome (preferably off list, i
already disturbed enough for now :D ) about my behavior/style/whatever
to try to make everyone as happy as im able to do so, but not more
than that, cuz im an "alien", but with good intentions... :)



basic html gmail collapsed the messages when sent the previous msg,
so i just wont search for a quote, but also +1 for the "happy lua
syntax" guideline as well! :D

[/clearly offtopic; and not even so much important!]



all the bests to everyone, especially for Lorenzo, cuz i really didnt
want to be any much offensive or whatever like so! :) peace, and lua
forever! ;)


Re: future of annotations in Lua?

Philippe Verdy
In reply to this post by szbnwer@gmail.com


On Sat, 8 June 2019 at 23:28, [hidden email] <[hidden email]> wrote:
> Philippe:
> > Note also that Javascript has similar issues, but it does not require ";" only because it differenciates whitespaces (those that include a newline not part of a comment from other whitespaces); you cannot safely remove all newlines in Javascript and merge all whitespaces to a single space, without adding a few ";" where needed. So Javascript's parser defines the newline token!
>
> thats even more complicated than that. if u write
> ```
> x=
> 3
> ```
> then it will work, cuz there is an unfinished expression (and its
> maybe even more complicated than this ... once ive seen all the rules,
> it was a bit twisted...)

There are still newline tokens, but in most positions they are ignorable. This complicates the Javascript syntax a bit, because the optional presence of newline tokens has to be taken into account. In some places, however, their presence is significant (and plays a role like ";"). That's why I wrote that they are real tokens and are not treated like other whitespace: after discarding all comments, a Javascript parser can compact every run of whitespace (except in string contents) to a single SPACE token if it contains no newline, and to a single NEWLINE token if it contains at least one. The SPACE token can be ignored everywhere, but an optional NEWLINE still remains between most tokens, and that optional NEWLINE must be treated specially in each syntactic rule.

There's still no such rule in Lua: all whitespace is equal and can be compacted to a single space, which can be discarded everywhere in the actual list of tokens processed by the syntactic parser, so this greatly simplifies the rules. However, I wonder whether Lua should adopt the same policy as Javascript, because the collapse of two lines that may look like valid separate statements causes unexpected effects with currification (which is just a facility for omitting the extra parentheses, which can still be used when needed).
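
A small sketch of such a collapse of two lines, assuming Lua 5.2 or later (Lua 5.1 rejects the first form with an "ambiguous syntax" error). The functions `f` and `g` and the `log` table are hypothetical and exist only to record what actually gets called.

```lua
local log = {}

-- f records its call and returns another function that records its own call.
local function f(arg)
  log[#log + 1] = "f called with " .. type(arg)
  return function() log[#log + 1] = "result of f called" end
end
local function g() log[#log + 1] = "g called" end

-- No semicolon: the newline is ordinary whitespace, so this is parsed as
-- `local a = f(g)()` and g itself is never called.
local a = f
(g)()

-- With a semicolon the two lines are separate statements:
-- b receives f, and then g is called.
local b = f;
(g)()

for i, entry in ipairs(log) do print(i, entry) end
```

Here `a` ends up nil (the inner function returns nothing) while `b` is `f`, which shows where an explicit ";" is already needed today.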


Re: future of annotations in Lua?

Roberto Ierusalimschy
In reply to this post by Hisham
> So, I guess the short version of my rambling is:
>
> 1. please consider @ rather than <> in Lua 5.4 if you consider
> extending this someday in Lua > 5.4
> 2. please allow per-variable annotations in multi-variable
> declarations in Lua 5.4, no matter what syntax you end up choosing.

Maybe I missed the answer, but repeating Dibyendu's question: How
to reconcile a syntax that prefixes an entire local statement with
per-variable annotations?

-- Roberto


Re: future of annotations in Lua?

szbnwer@gmail.com
Roberto:
> Maybe I missed the answer, but repeating Dibyendu's question: How
> to reconcile a syntax that prefixes an entire local statement with
> per-variable annotations?

whats wrong with `local <const> a, b, <const> c, d`? do u mean that
`local <const> a, b, c, d` means by default that a-d are all
constants, so actually an opt-out would be needed for those that
shouldnt be constants?

if i understood u correctly, then i think the interpretation of my 2nd
expression above just shouldnt make b-d constants; or i could think
about (with the `@` notation) this for making all vars constants:
`local @@const a, b, c, d` - so doubling the at symbol would extend
the range of the const from the 1st variable to all of them. this way
both needs are covered, and its really a visible thing to prevent
messing around stuffs like `a.b()` vs `a:b()` that took away some
happy time from all of us here more than once. :D (no offense for `:`,
basically im fine with it. :) )
