[Suggestion] Not requiring commas for table entry separation

[Suggestion] Not requiring commas for table entry separation

Joshua Jensen
Yep, I know there is a patch for this.  In fact, over a year ago, I
applied it to a personal distribution to test it out, understanding the
potential problems that could arise.  I meant to remove it, but I
forgot.  A year later, I discovered I had left the patch in and finally
removed it.  The results were very bad.  Every one of hundreds of Lua
data files (files containing table information, not code) had at least
one comma missing.  Most of them had no commas at all.  A discussion
ensued, and it was decided that, despite the potential ambiguity, the
commas would remain optional.

Especially when having non-programmers create or modify these data
files, not requiring commas is a boon.  The most common mistake is to
simply forget to put the commas in, and then the scripts have errors.
Some missing commas are incredibly hard to track down, depending on
circumstances.
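
A minimal sketch of how a missing comma surfaces when a data file is loaded, assuming stock Lua 5.1 or later (no patch applied). The chunk contents and the names `compile`, `good`, and `bad` are illustrative only and do not come from the original post:

```lua
-- Illustrative only: compile two tiny data chunks and compare the results.
-- loadstring exists in Lua 5.1; load accepts string chunks from Lua 5.2 on.
local compile = loadstring or load

-- Same data, with and without the separating comma.
local good, good_err = compile('return { Value1 = 5, Value2 = 6 }')
local bad, bad_err = compile('return { Value1 = 5 Value2 = 6 }')

if not bad then
  -- The error message carries the chunk name and line number, which is
  -- usually the quickest way to find the missing separator.
  print("data error: " .. bad_err)
end
```

Reporting the error string from the loader, instead of only noticing that the script failed later, can make the offending line easier to locate.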

There's also the matter of "prettiness."  When dealing with thousands of
lines of data, I find it easier to look at something like:

MyInfo =
{
	Value1 = 5
	PrettyValues =
	{
		--	Num1	Num2	String	Num3
		{	5.5	10	"A string"	20	}
		{	10	15	"Hello"	25	}
	}
}

than the version with required commas:

MyInfo =
{
	Value1 = 5,
	PrettyValues =
	{
		--	Num1	Num2	String	Num3
		{	5.5,	10,	"A string",	20,	},
		{	10,	15,	"Hello",	25,	},
	},
}

Certainly there is an ambiguity when dealing with bracketed [key] values
preceded by an identifier.  For the stuff we use Lua's data description
facilities for, this is a rarity (actually, it hasn't happened yet).

Anyway, it was simple enough to add to Lua 5.0 Beta.  In lparser.c, the
loop in the constructor function is changed as shown below.

This is just a suggestion for making Lua's non-programmer (and
programmer) usage even more intuitive.

Josh

---------------------------------

/*  do {   (original loop header, replaced by the for loop below) */
  for (;;) {
    lua_assert(cc.v.k == VVOID || cc.tostore > 0);
    testnext(ls, ';');  /* compatibility only */
    if (ls->t.token == '}') break;
    closelistfield(fs, &cc);
    switch(ls->t.token) {
      case TK_NAME: {  /* may be listfields or recfields */
        lookahead(ls);
        if (ls->lookahead.token != '=')  /* expression? */
          listfield(ls, &cc);
        else
          recfield(ls, &cc);
        break;
      }
      case '[': {  /* constructor_item -> recfield */
        recfield(ls, &cc);
        break;
      }
      default: {  /* constructor_part -> listfield */
        listfield(ls, &cc);
        break;
      }
    }
    if (ls->t.token == ',' || ls->t.token == ';')
      next(ls);
    else if (ls->t.token == '}')
      break;
  }
/*  } while (testnext(ls, ',') || testnext(ls, ';'));   (original loop tail) */
  check_match(ls, '}', '{', line);




Re: [Suggestion] Not requiring commas for table entry separation

Peter Hill-2
Joshua Jensen:
> Certainly there is an ambiguity when dealing with bracketed [key] values
> preceded by an identifier.

And for an identifier preceding a string, constructor, or parenthesised
expression. I.e.,
    {f "xyzzy"} is {f("xyzzy")} or {f, "xyzzy"}?
    {f {"value"}} is {f({"value"})} or {f, {"value"}}?
    {f(5+6)} is {f(5+6)} or {f, (5+6)}?
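
For reference, a small sketch of how unpatched Lua (5.1 or later, for the `#` operator) reads these three forms today: each is parsed as a function call, so the constructor holds a single element. The function `f` here is a hypothetical stand-in:

```lua
-- Hypothetical function used only to make the calls observable.
local function f(arg)
  return "called"
end

local a = { f "xyzzy" }     -- parsed as { f("xyzzy") }
local b = { f {"value"} }   -- parsed as { f({"value"}) }
local c = { f (5+6) }       -- parsed as { f(5+6) }
local d = { f, "xyzzy" }    -- explicit comma: the function, then the string

print(#a, #b, #c, #d)
```

With commas optional, the parser would have to choose between these readings for the first three lines.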

I'd *like* to not need a comma... after all, I'd like to be able to wrap a
list of global assignments, e.g.:
    x = 1
    y = 2
into a table constructor
    {
    x = 1,
    y = 2,
    }
without having to add the commas.

But there is some serious ambiguity here :-(. Otoh, this is the same problem
that statements face with the (up until recently) optional ";" separator. It
is handled thusly:

(a) ["if"] = "token"
This format (though a consistent & useful way to access globals) is
prohibited. Instead one must use something like: _G["if"] = "token".

If constructors used a consistent style they would instead use something
like: X = {_SELF["if"] = "token"}. Hey, it might actually be nice to have
access to _SELF inside a constructor (allowing self-referential tables)
although anonymous functions don't currently have that luxury.

Alternately, one could just use a different non-index syntax for evaluated
table assignments. Eg, the rather ugly:
    X = {@"if" = "token"}
and
    @"if" = "token"

(b) f "" AND (c) f {}
Unlike inside a constructor, where values are always meaningful, statements
are not interested in return values. As such the statement expression syntax
is restricted to 'procedure calls' only, so the above cases are not a
problem. Otoh that does help 'procedure calls'...

(d) f (1+2)
It doesn't handle this. An explicit ';' is required to separate the terms so
';' is no longer optional :-(. That's confusing enough, but having similar
optional (unless required) syntax for commas in constructors would be a
naive user's nightmare. There must be a better way to keep consistency :-(.



*cheers*
Peter Hill.

PS:
Joshua Jensen said (16th Feb):
> Subject: Non-stop complaining...
> And on a very personal note, the conversations I think clog the list are:
(snip)
> * Syntax issues

AND

Joshua Jensen said (17 Feb):
>  Subject: [Suggestion] Not requiring commas for table entry separation

Huh?!?! You've got me confused now 8-O.
Are 'commas' an ok non-clogging "syntax issue" but inband operating system
filetype information creeping in to Lua code in the guise of comments
aren't? ;-)




RE: [Suggestion] Not requiring commas for table entry separation

Peter Prade-2
In reply to this post by Joshua Jensen
Joshua Jensen:
> Especially when having non-programmers create or modify these data
> files, not requiring commas is a boon.  The most common mistake is to
> simply forget to put the commas in, and then the scripts have errors.
> Some missing commas are incredibly hard to track down, depending on
> circumstances.

This is absolutely true. I made the same observation.

Joshua Jensen:
> Certainly there is an ambiguity when dealing with bracketed [key] values
> preceded by an identifier.

Peter Hill:
> And for an identifier preceding a string, constructor, or parenthesised
> expression. Ie,
>     {f "xyzzy"} is {f("xyzzy")} or {f, "xyzzy"}?
>     {f(5+6)} is {f(5+6)} or {f, (5+6)}?

Yep, those two problems were probably what kept the Lua authors from further
simplifying the table constructor syntax.

They already listened to our input and made it possible to use a "," instead
of a ";" in a mixed constructor
(since Lua 4.1work4 you can also use {1,2,x=3} instead of {1,2;x=3}).
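
A quick sketch of that equivalence, assuming Lua 5.1 or later (for the `#` operator); the variable names are illustrative:

```lua
-- Both separators are accepted between fields, and they can be mixed.
local a = { 1, 2, x = 3 }
local b = { 1, 2; x = 3 }

print(a[1] == b[1], a[2] == b[2], a.x == b.x)
```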

See also: http://lua-users.org/wiki/TableConstructors

Joshua Jensen reports:
> Certainly there is an ambiguity when dealing with bracketed [key] values
> preceded by an identifier.  For the stuff we use Lua's data description
> facilities for, this is a rarity (actually, it hasn't happened yet).

I guess this is true.
In the cases where you want the easier syntax because non-programmers use
it to enter data, you don't have these conflicts. And in the case where you
have the ambiguity, it is probably in a table constructor entered by a
programmer, who won't mind adding separators.

But the real problem here is when the programmer is used to omitting the
separators in table constructors and then is entering function values in one
of his tables for the first time - this might create hard-to-find bugs
(at least harder to find than a missing comma in a table constructor ;-).

Cheers,
Peter Prade



RE: [Suggestion] Not requiring commas for table entry separation

Joshua Jensen
> Joshua Jensen:
> > Certainly there is an ambiguity when dealing with bracketed [key] 
> > values preceded by an identifier.
> 
> Peter Hill:
> > And for an identifier preceding a string, constructor, or 
> > parenthesised expression. Ie,
> >     {f "xyzzy"} is {f("xyzzy")} or {f, "xyzzy"}?
> >     {f(5+6)} is {f(5+6)} or {f, (5+6)}?
> 
> yep, those 2 problems were probably what kept the lua authors 
> from further simplifying the table constructors syntax.

Arguably, maybe whitespace should make a difference.  I had even
forgotten about the {f "xyzzy"} syntax translating to {f("xyzzy")}.

In any case, there are ambiguities, but in resolving the ambiguities,
the syntax becomes more complex for the simple situations.  I can't tell
you the number of times I've had to help somebody find their missing
comma.  Commenting out huge portions of the Lua data file is often
necessary to isolate it.  Likewise, if:

f = 5

and you punched in the data:

{ f "xyzzy" }

_I_ would have expected (as would the non-programmer) the final result
to be:

{ 5 "xyzzy" }

And that makes it even more obvious why the commas are required (as
unfortunate as that may be for the simple data description case).  I use
Lua as a replacement for languages like XML.  Its overall cleaner
syntax and more powerful description capabilities (IMHO) make it way
easier to describe the data.

> But the real problem here is when the programmer is used to 
> omitting the separators in table constructors and then is 
> entering function values in one of his tables for the first 
> time - this might create hard to find bugs. (at least harder 
> to find than a missing comma in a table constructor ;-)

Agreed, although a dump of the table contents to a text file would
probably make it pretty obvious what entries were wrong.  Still, I would
fully expect one of the programmer/non-programmer types to do the {
globalValue "This is a string" } syntax when describing their data.  So,
catch-22.  The syntactic sugar for { function "argument" } makes a few
things easier to read (although I typically see scripts from the Lua
authors like { function"argument" }).  However, the simplicity of not
requiring commas is also really nice.
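
A rough sketch of the kind of table dump mentioned above, assuming Lua 5.1 or later. The helper `dump`, the function `scale`, and the sample data are all hypothetical; writing the result to a file is left out:

```lua
-- Hypothetical helper: list every entry of a table, one per line, with its
-- type, so that an accidental function call stands out in the output.
local function dump(t, indent, lines)
  indent = indent or ""
  lines = lines or {}
  -- Sort keys by their string form so the output is stable between runs.
  local keys = {}
  for k in pairs(t) do keys[#keys + 1] = k end
  table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
  for _, k in ipairs(keys) do
    local v = t[k]
    if type(v) == "table" then
      lines[#lines + 1] = indent .. tostring(k) .. ":"
      dump(v, indent .. "  ", lines)
    else
      lines[#lines + 1] = indent .. tostring(k) .. ": "
        .. tostring(v) .. " (" .. type(v) .. ")"
    end
  end
  return lines
end

-- Sample data where a value was meant to sit next to a string but was
-- instead called with it.
local function scale(s) return "scaled " .. s end
local data = { scale "big", size = 10 }

local text = table.concat(dump(data), "\n")
print(text)
```

In the dump, the first entry shows up as a single string rather than a function followed by a string, which is the kind of mismatch a reader could spot.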

Thanks,
Josh




Re: [Suggestion] Not requiring commas for table entry separation

RLak
In reply to this post by Joshua Jensen
> Hmmm... could this not be solved by having two different types of table
> constructors? One has the mandatory comma-separated syntax, the other
> one uses spaces, but doesn't support all types of indexes.

> Like, say:

> t = << a = << foo = 1 2 3 >> b = "hello" >>

That makes me want to read the inner table as foo = {1, 2, 3}

I think it's easier to tell people to use commas :)

R.






Re: [Suggestion] Not requiring commas for table entry separation

Björn De Meyer
In reply to this post by Peter Prade-2
Hmmm... could this not be solved by having two different types of table
constructors? One has the mandatory comma-separated syntax, the other
one uses spaces, but doesn't support all types of indexes.

Like, say:

t = << a = << foo = 1 2 3 >> b = "hello" >> 

Yeah, I know the << >> signs are not so pretty, but we use
[[ and ]] for strings as well, so...

-- 
"No one knows true heroes, for they speak not of their greatness." -- 
Daniel Remar.
Björn De Meyer 


RE: [Suggestion] Not requiring commas for table entry separation

Joshua Jensen
In reply to this post by RLak
> I think it's easier to tell people to use commas :)

If only that were true... <sigh>...

-Josh




Re: [Suggestion] Not requiring commas for table entry separation

Peter Hill-2
In reply to this post by Joshua Jensen
Joshua Jensen:
> Likewise, if:
>     f = 5
>
> and you punched in the data:
>     { f "xyzzy" }
>
> _I_ would have expected (as would the non-programmer) the final result
> to be:
>     { 5 "xyzzy" }
>
> And that makes it even more obvious why the commas are required (as
> unfortunate as that may be for the simple data description case).

And
    {complex{1,2}}
to be
    {complex, {1,2}}

Alas, Lua is just too smart for its own good :-(.


The real problem lies in functions being first class objects, so that:
    x = f
and
    x = f (5)
are both valid. I.e., one can't tell if an argument will follow the function
or not.
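
A small sketch of that point, assuming unpatched Lua 5.1 or later; `f` is a hypothetical function and the variable names are illustrative:

```lua
-- Hypothetical function: doubles its argument.
local function f(n)
  return n * 2
end

local x = f        -- x holds the function value itself
local y = f (5)    -- y holds the result of calling f

-- Inside a constructor, the explicit comma is what keeps the function and
-- the parenthesised value apart.
local pair = { f, (5) }
local call = { f (5) }

print(type(x), y, #pair, #call)
```

Since `f` alone is already a complete expression, only the separator tells the parser that the following `(5)` is a new element and not an argument list.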

Of course this power is lost on the naive user, who'd like
    {5 6}
to be interpreted as
    {5,6}
rather than
    {5(6)}

It's not an easy problem :-(.

Cheers,
Peter Hill.