When is a string not a string?

Kenneth Lorber
(A related question was posed in a different manner back in 2007 but the discussion didn't go anywhere after noting a grammar conflict that IMHO is not relevant for useful use cases. [1])

When is a string not a string?  When it's a string literal.

Here's a trivial example adapted from a different context[2]:

> function string:alike(other) return self:lower() == other:lower() end
> s = "WomBat"
> s:alike"wombat"
true
> "wombat":alike(s)
stdin:1: unexpected symbol near '"wombat"'

Since according to the manual there is one metatable for strings and the string library sets it, we're missing the opportunity to write:
 io.write("This is the value: '%d'\n":format(x))

There are other useful cases, but this is the one I think is best (that is, useful and clear), and it should be enough for people to tell me this is a bad idea :-)

Opinions please: Why is letting a string literal be more like a string a good or bad idea?

Thanks,
Ken


[1] http://lua-users.org/lists/lua-l/2007-05/msg00059.html
[2] http://lua-users.org/lists/lua-l/2007-03/msg00493.html

Re: When is a string not a string?

Dirk Laurie-2
On Wed, 23 Jan 2019 at 17:03, Kenneth Lorber <[hidden email]> wrote:

>
> (A related question was posed in a different manner back in 2007 but the discussion didn't go anywhere after noting a grammar conflict that IMHO is not relevant for useful use cases. [1])
>
> When is a string not a string?  When it's string literal.
>
> Here's a trivial example adapted from a different context[2]:
>
> > function string:alike(other) return self:lower() == other:lower() end
> > s = "WomBat"
> > s:alike"wombat"
> true
> > "wombat":alike(s)
> stdin:1: unexpected symbol near '"wombat"'

This is a syntax error. You need parentheses.

> ("wombat"):alike(s)
true
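
The same fix applies to the `format` case from the original post. A minimal sketch, assuming a stock Lua interpreter; the value of `x` and the helper name `compile` are illustrative and not from the thread:

```lua
-- Illustrative value; not from the original post.
local x = 42

-- A parenthesized literal is a valid prefix expression, so the method call parses.
local msg = ("This is the value: '%d'\n"):format(x)
io.write(msg)

-- The bare literal is rejected by the parser. Compile it from a string so that
-- this file itself stays loadable (loadstring in Lua 5.1, load in 5.2 and later).
local compile = loadstring or load
local chunk, err = compile([[return "This is the value: '%d'\n":format(1)]])
-- chunk is nil here and err holds the syntax error message.
```

The only cost of the working form is the extra pair of parentheses around the literal.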


Re: When is a string not a string?

dyngeccetor8
In reply to this post by Kenneth Lorber
On 1/23/19 6:03 PM, Kenneth Lorber wrote:

> [...]
>
>> function string:alike(other) return self:lower() == other:lower() end
>> s = "WomBat"
>> s:alike"wombat"
> true
>> "wombat":alike(s)
> stdin:1: unexpected symbol near '"wombat"'
>
> Since according to the manual there is one metatable for strings and the string library sets it, we're missing the opportunity to write:
>  io.write("This is the value: '%d'\n":format(x))
>
> There are other useful cases, but this is the one I think is best (that is, useful and clear) should be enough for people to tell me this is a bad idea :-)
>
> Opinions please: Why is letting a string literal be more like a string a good or bad idea?
>
> Thanks,
> Ken
>
> [...]

Personally I don't like it.


We already have syntactic sugar that allows omitting the parentheses
when the sole argument is a string literal or a table constructor.

It looks like it was added on the Python hype. But I think it just adds
obscurity:

  * > print 'Hello' .. ', ' .. 'World'

    does not do what it looks like (unlike BASIC languages).

  * > print 0

    does not work, unlike [[ print '0' ]].

So a speech in a mentoring tone is already needed: "Look lad, there
is a special rule in the language grammar that allows skipping the
parentheses of a function call when the sole argument is a string
literal or a table constructor. Not a number literal, a boolean, or nil!"
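
The binding rule described above can be observed without relying on `print`. A small sketch, assuming a hypothetical helper `shout` that stands in for `print` so the result can be inspected:

```lua
-- Hypothetical helper standing in for print; it returns a value we can check.
local function shout(s) return s:upper() end

-- Parsed as (shout 'Hello') .. ', ' .. 'World': only the first literal is the argument.
local a = shout 'Hello' .. ', ' .. 'World'

-- Explicit parentheses pass the whole concatenation as the argument.
local b = shout('Hello' .. ', ' .. 'World')

-- A number literal cannot be a bare argument, so this source does not compile.
local compile = loadstring or load
local number_call = compile("print 0")
```

With `print` itself the first form fails at run time, because `print` returns nothing and the concatenation then operates on nil.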


With your addition, new obscure cases will appear:

  * > 'Hello' .. ', ' .. 'World':print()

    does not do what it looks like.

  * > 0:print()

    does not work, unlike [[ '0':print() ]].

And almost the same explanatory speech will be needed.


-- Martin


Re: When is a string not a string?

Жаботинский Евгений
In reply to this post by Kenneth Lorber
"Kenneth Lorber" <[hidden email]>:

> [...]
>
>>  function string:alike(other) return self:lower() == other:lower() end
>>  s = "WomBat"
>>  s:alike"wombat"
>
> true
>>  "wombat":alike(s)
>
> stdin:1: unexpected symbol near '"wombat"'
>
> Since according to the manual there is one metatable for strings and the string library sets it, we're missing the opportunity to write:
>  io.write("This is the value: '%d'\n":format(x))
>
> There are other useful cases, but this is the one I think is best (that is, useful and clear) should be enough for people to tell me this is a bad idea :-)
>
> Opinions please: Why is letting a string literal be more like a string a good or bad idea?
>
> Thanks,
> Ken
>
> [1] http://lua-users.org/lists/lua-l/2007-05/msg00059.html
> [2] http://lua-users.org/lists/lua-l/2007-03/msg00493.html


My opinion is that we have the wrong syntax sugar.

When calling functions, we generally pass *multiple* values as parameters to a *single* callable.
Therefore, I think most would expect "as little as possible" as the value of what to call, and some sort of delimiter after the values to pass to it.


* >  print "Hello".."World"
* >  print s1..s2

As I understand it, both lines look like they do the same thing, yet the first does something else and the second does not compile.
The only delimiter visible here is the end of the line. Or at least I see a string literal as a single value, as opposed to text between delimiters.
Therefore, printing the result of the concatenation is what one expects.
Instead, it calls print on the first string and then concatenates the result with the second, which, at best, currently fails.
Plus, I can't think of a good use for this feature. The obvious deprecated-Python-like `print "Message."` cannot even be used with format!


* >  s = ">".." %d":format(v)
* >  s = s.." %d":format(v)
* >  s = s..f:format(v)

To me, the above lines look like they definitely do the same kind of thing.
The last one does exactly what it looks like, while the others do not compile as of yet.
And I'm pretty sure very few people, if any, would expect the last line to call format on the result of the concatenation.
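
A short sketch of how the three forms behave today, assuming illustrative values for `s`, `f` and `v` (none of them come from the thread):

```lua
-- Illustrative values chosen so that the two groupings give different results.
local s, f, v = "[%s]", " %d", 7

-- The method call binds to f alone; s is concatenated afterwards, unformatted.
local r1 = s .. f:format(v)

-- The same with a literal in place of f needs parentheses in current Lua.
local r2 = s .. (" %d"):format(v)

-- Formatting the concatenation itself requires explicit grouping.
local r3 = (s .. f):format("x", v)
```

Here `r1` and `r2` keep the `%s` of `s` untouched, while `r3` fills it in, which is the distinction the parentheses make visible.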

In all languages I can think of, you have to use parentheses whenever you need a method or a field of something that isn't a single value or a field.
At the same time, you generally don't need them for a single value. Some examples:

* >  // C++:
* >  string s = "Hello";
* >  int l = s.size();
* >  int p = (s + "World").find("oW");
* >  //int t = "Hello".size() -- Makes no sense, since a string literal is not a string object, but a const char array that has no methods.
* >  const char *q = "Hello!" + 2; // Points to the third letter. This shows that string literals are just normal values, I guess?
* >  struct S a[10];
* >  a[1].b.c
* >  (a + 1)->b.c // Same as above
* >  a->b == a[0].b // true

* >  # Python:
* >  sys.stderr.write('0x{:04X}: {:s}\n'.format(offset, msg))  # 0xBEEF: Found it!
* >  s = s+' {:02X}'  # Add hex representation of a byte to s
* >  print("Numbers: " + ", ".join(a))  # Numbers: 4, 8, 15, 16, 23, 42
* >  (delimiter + " ").join(['First', 'Second', 'Third'])  # First? Second? Third
* >  (42).to_bytes(2, 'little')  # b'*\x00'
* >  # 42.to_bytes(2, 'little')  -- Does not work, unlike 'a'.upper(), probably because dot could also be a decimal point.


So, to sum up this overly long explanation, I think allowing the parentheses around string literals to be omitted for method calls on them would be nice.
Since Lua uses a colon to denote method calls, that could even be allowed for numbers, in case someone gives them a useful metatable.
On the other hand, calling functions on strings without parentheses looks ambiguous and should be deprecated IMO.
Basically, I think that string literals should be treated the same way as variables holding strings.


Note, however, that calling functions with a single table as the argument without extra parentheses is a bit different:

* >  table.concat{a, b, c}
* >  fun{"Yes!", what, 2, why=42}

The curly braces here look quite a bit like parentheses. They visually delimit the arguments.
It actually looks more like passing several values, as opposed to a single table, so it's much easier not to expect any more args to follow.
I personally see this syntax more like Python's *varargs plus **kwargs usage.
This can definitely be useful, since Lua does not support named arguments or **kwargs natively.
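
A minimal sketch of that varargs-plus-keywords reading, assuming a hypothetical function `describe` that takes its positional values from the array part of the table and an optional `sep` field by name:

```lua
-- Hypothetical function: positional values come from the array part,
-- options are looked up by name in the same table.
local function describe(args)
  local sep = args.sep or ", "
  return table.concat(args, sep)
end

-- Looks like a call with three positional arguments.
local r1 = describe{"a", "b", "c"}

-- Looks like three positional arguments plus one named argument.
local r2 = describe{"a", "b", "c", sep = "-"}
```

The function still receives exactly one table; the braces only make the call read like an argument list.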


Thanks for reading.
-- Evgeniy Zhabotinskiy


Re: When is a string not a string?

Soni "They/Them" L.


On 2019-01-24 4:50 a.m., Жаботинский Евгений wrote:

> "Kenneth Lorber" <[hidden email]>:
> [...]

Custom literals:

local foo = u128"10000000000000000000000000000000000000"
local bar = re"[A-Za-z_][A-Za-z0-9_]*"




Re: When is a string not a string?

Dirk Laurie-2
And they can have their own metatables.

Who needs Rebol?


On Thu, 24 Jan 2019 at 14:01, Soni "They/Them" L.
<[hidden email]> wrote:
>
> Custom literals:
>
> local foo = u128"10000000000000000000000000000000000000"
> local bar = re"[A-Za-z_][A-Za-z0-9_]*"


Re: When is a string not a string?

Жаботинский Евгений
In reply to this post by Soni "They/Them" L.
"Soni "They/Them" L." <[hidden email]>:
> Custom literals:
>
> local foo = u128"10000000000000000000000000000000000000"
> local bar = re"[A-Za-z_][A-Za-z0-9_]*"

Hm, yes. That is useful.
Also, in these examples the function call actually looks like a literal prefix.
If such convention is followed, it really is obvious what the code does.

Fun fact:

> "%02X":format(42)
> -- Error
> function s(v) return v end
> s"%02X":format(42)
> -- "2A"

So… I guess it does not make much sense to force use of parentheses for method calls only around "primitive" string literals.

-- Evgeniy Zhabotinskiy


Re: When is a string not a string?

Soni "They/Them" L.


On 2019-01-24 12:32 p.m., Жаботинский Евгений wrote:

> "Soni "They/Them" L." <[hidden email]>:
>> Custom literals:
>>
>> local foo = u128"10000000000000000000000000000000000000"
>> local bar = re"[A-Za-z_][A-Za-z0-9_]*"
> Hm, yes. That is useful.
> Also, in these examples the function call actually looks like a literal prefix.
> If such convention is followed, it really is obvious what the code does.
>
> Fun fact:
>
>> "%02X":format(42)
>> -- Error
>> function s(v) return v end
>> s"%02X":format(42)
>> -- "2A"
> So… I guess it does not make much sense to force use of parentheses for method calls only around "primitive" string literals.
>
> -- Evgeniy Zhabotinskiy
>

also give this a try:

function x(x) print("xing") return x end
x(x)
("foo"):gsub(".*", print)
x
"bar":gsub(".*", print)


Re: When is a string not a string?

Dirk Laurie-2
In reply to this post by Жаботинский Евгений
On Thu, 24 Jan 2019 at 16:32, Жаботинский Евгений
<[hidden email]> wrote:

> So… I guess it does not make much sense to force use of parentheses for method calls only around "primitive" string literals.

There is nothing string-specific about parentheses around a literal.
We also see it with table literals.

The reason why they are required is not "forcing" anything for some
capricious, avoidable reason. It is part of the price that we pay for
Lua normally not requiring semicolons. It is a much lower price, IMHO,
than Python's white-space sensitivity.

Every attempt to reduce the need for parentheses introduces a possible
ambiguity, whose resolution might after all require a semicolon.


Re: When is a string not a string?

Egor Skriptunoff-2
In reply to this post by Жаботинский Евгений
On Thu, Jan 24, 2019 at 9:51 AM Жаботинский Евгений wrote:

> I think allowing to omit parentheses around string literals for method calls on them would be nice.

+1

It would be useful:
   "%02X":format(n)
   s = " ":rep(10-#s)..s
   n = n*10 + digit:byte() - "0":byte()
   n = n*16 + "0123456789ABCDEF":find(hex_digit:upper()) - 1
   x = "<d":unpack("<I4I4":pack(n1,n2))
 
I'd also like indexing literal tables:
   {...}[i]
   RGB = {red=0xFF0000, green=0x00FF00}[color_name]
   if not {["Lua 5.2"]=1, ["Lua 5.3"]=1}[_VERSION] then error"Not supported" end
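
For comparison, a sketch of how the table cases are written in current Lua, with the parentheses in place. The value of `color_name` and the helper `second` are illustrative assumptions, not taken from the post:

```lua
local color_name = "green"  -- illustrative input

-- Indexing a table constructor works today once it is parenthesized.
local RGB = ({red = 0xFF0000, green = 0x00FF00})[color_name]

-- Picking one value out of a vararg list the same way.
local function second(...) return ({...})[2] end
local picked = second("a", "b", "c")
```

The proposal would allow the same expressions with the outer parentheses dropped.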



Re: When is a string not a string?

Gabriel Bertilson
In reply to this post by Жаботинский Евгений
Actually, parentheses are required around any type of literal, not
just a string literal, in order to call a method on it or to index it:

> ({ 'a', 'b', 'c', concat = table.concat }):concat("|")
a|b|c
> { 'a', 'b', 'c', concat = table.concat }:concat(", ")
stdin:1: unexpected symbol near '{'
> ({ 'a', 'b', 'c', concat = table.concat }).concat
function: 0xXXXXXXXXXXXX
> { 'a', 'b', 'c', concat = table.concat }.concat
stdin:1: unexpected symbol near '{'

> debug.setmetatable(print, { __index = { curry = function (self, arg) return function (...) return self(arg, ...) end end } })
> (function (a, b) return a + b end):curry(1)(2)
3
> function (a, b) return a + b end:curry(1)(2)
stdin:1: <name> expected near '('
> (function (a, b) return a + b end).curry
function: 0xXXXXXXXXXXXX
> function (a, b) return a + b end.curry
stdin:1: <name> expected near '('

> debug.setmetatable(2, { __index = math })
> (2):pow(3)
8.0
> 2:pow(3)
stdin:1: unexpected symbol near '2'
> (2).pow
function: 0xXXXXXXXXXXXX
> 2.pow
stdin:1: unexpected symbol near '2.'

If this restriction were removed from string literals, it would make
sense to remove it from table, number, and function literals as well,
though only string and table literals would be likely to be indexed
because indexing numbers and functions requires adding a metatable to
them with the debug library.

— Gabriel






Re: When is a string not a string?

Egor Skriptunoff-2
In reply to this post by Dirk Laurie-2


On Fri, Jan 25, 2019 at 1:32 AM Dirk Laurie <[hidden email]> wrote:

> Every attempt to reduce the need for parentheses introduces a possible
> ambiguity, whose resolution might after all require a semicolon.

You are wrong here.
Removing parentheses from the following expressions will NOT add any new ambiguity in Lua syntax.
   ("str"):foo()
   ({a=42})[foo]
As you can see, these expressions are already a source of ambiguity because they start with "(".
You can't make them more ambiguous than they already are :-)
 
 

Re: When is a string not a string?

Жаботинский Евгений
"Egor Skriptunoff" <[hidden email]>:

> On Fri, Jan 25, 2019 at 1:32 AM Dirk Laurie <[hidden email]> wrote:
>> Every attempt to reduce the need for parentheses introduces a possible
>> ambiguity, whose resolution might after all require a semicolon.
>
> You are wrong here.
> Removing parentheses from the following expressions will NOT add any new ambiguity in Lua syntax.
>    ("str"):foo()
>    ({a=42})[foo]
> As you can see, these expressions are already a source of ambiguity due to they are starting with `(`.
> You can't make them more ambiguous than they already are :-)

That's an interesting rake! It's not even literal-specific, by the way:

> print("Starting work...")
> (workload or default):start()

This innocent-looking code will throw "attempt to call a nil value" when trying to call the result of print().
I think `_=` or `;` is better added *before the second line*, as that's the one causing the problem.
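
A sketch of the semicolon variant of that fix, assuming a hypothetical `note` function in place of `print` and a made-up `workload` object so the effect can be checked. The exact failure of the unfixed code depends on the Lua version: it is assumed here that 5.1 rejects it at compile time as ambiguous syntax, while later versions compile it and fail at run time.

```lua
local log = {}
local function note(msg) log[#log + 1] = msg end   -- hypothetical stand-in for print
local workload = { start = function(self) note("started") end }
local default = workload

-- The leading semicolon ends the previous statement, so the parenthesized
-- expression starts a new one instead of being taken as call arguments.
note("Starting work...")
;(workload or default):start()

-- The unfixed shape, compiled from a string so that this file stays loadable.
local compile = loadstring or load
local broken = compile('local t = {}\nlocal function f() end\nf("x")\n(t):g()')
-- Either compilation fails (broken is nil) or the chunk errors when run.
local ok = broken ~= nil and pcall(broken)
```

Placing the semicolon at the start of the second line keeps the guard next to the statement that needs it.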

Allowing the parentheses around literals to be omitted would only make it possible to step on the very
same rake without an extra pair of parentheses. Having said that, having to type out slightly longer and
uglier code might be just enough of a deterrent, so there may be some sense in forcing the use of parentheses.

Also, note that this rake can be stepped on *only when a statement has no left side*, that is, when
there is no assignment. That only makes sense if a function is called for its side effects only.
String literals have no methods with side effects by default, and IMO the chance that someone adds some
and starts using them *just because parentheses became optional* is pretty negligible.
A switch statement can also be implemented by constructing a table of functions defined in-place,
immediately indexing it and calling the result, but again, I don't think many would start doing that just
because the extra parentheses are no longer mandatory. All in all, I cannot come up with a good
example of this kind that would scream "That's why you cannot omit parentheses!"
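
The switch-like construct mentioned above can be sketched in current Lua as follows. The operation names and operands are illustrative assumptions:

```lua
local op, a, b = "add", 2, 3  -- illustrative inputs

-- A table of functions defined in place, indexed immediately, then called.
-- The parentheses around the constructor are what the grammar requires today.
local result = ({
  add = function(x, y) return x + y end,
  sub = function(x, y) return x - y end,
})[op](a, b)
```

Used as a bare statement rather than on the right side of an assignment, the same expression would begin with `(` and so could be read as call arguments to the preceding statement.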

When a literal or a parenthesized expression is *not* placed at the start of a statement, the above kind
of ambiguity is not possible at all. If allowing the parentheses to be omitted added any ambiguity at all, it
would likely have nothing to do with semicolons. In fact, most of the possible cases should already be
covered by "custom literals":

> function L(v) return v end
> print(L"%08X":format(address))
> f = L{io.stdout, io.stderr, logfile, devnull}[out_stream]

These can be used both with and without extra parentheses, and the results will be exactly the same.
The only differences between those L-prefixed and bare literals are:

1. Prefixed literals can be indexed (with `.`, `:` or `[]`) and called directly; bare ones cannot.
   (Trying to index or call a bare literal always causes a compilation error.)
2. Bare literals can be used as the sole argument of a function call; prefixed ones cannot.
   (A prefixed literal in this role would be interpreted as starting a new statement instead.)

Making the parentheses around bare literals optional would remove the compile error and allow
bare literals to be used anywhere prefixed ones currently can. The only possible problem is that
if the prefix was the only hint to Lua that a new statement is starting, as per the comment on (2), then
using a bare literal instead would cause the same ambiguity that I described several paragraphs ago.
Again, I doubt that anyone who would otherwise prefix normal string or table literals like that
would opt not to do so for the sole reason of eliminating the extra letter.



I've learned quite a bit about the dark corners of Lua while participating in this thread, and my current
opinion is that allowing the parentheses around string and table literals to be omitted should not cause any
real problems. It would allow for slightly nicer-looking code in some common (and not so common) use cases:

> function string:boxed()
>   local t = "-":rep(#self + 2)
>   return "+%s+\n| %s |\n+%s+":format(t, self, t)
> end
> print('Some text in a box!':boxed())

However, it would not really give anything, except for eliminating a pair or two of parentheses from
time to time. No new use cases, no eliminated problems, just pure aesthetics. Whether it is worth
the effort and the potential risk that something somewhere goes wrong somehow, I don't know.

-- Evgeniy Zhabotinskiy