... as an "expand list" unary postfix operator

Duncan Cross
Hi List,

Back in June, somewhere in the middle of a thread that threw around
various ideas about the next version of Lua, I made a syntax
suggestion that I'd like to bring up again. I'm planning to get a
patch implemented to see how it feels to use and also to see how much
code change would be required - I made a start on this, but I'm still
not all that familiar with Lua's internals, so it might take a while.

My proposal is that if '...' appears just before a comma (in either
the list of arguments in a function call, a separator between two
array elements in a table constructor, or the right hand side of a
multiple variable assignment) or a semicolon (in the case of a
separator between two array elements only) then it changes the meaning
of that comma or semicolon. Instead of truncating or expanding the
list of values that result from evaluating the expression that
preceded it, the full list is preserved.

Here is an example of how I would expect Lua code to look and act with
such a change:

-----
function multivalues()
    return 1, 2, 3
end

function printvalues(a,b,c,d,e,f)
    print('"' .. tostring(a) .. ',' .. tostring(b) .. ',' .. tostring(c) .. ','
     .. tostring(d) .. ',' .. tostring(e) .. ',' .. tostring(f) .. '"')
end

-- Function calls
printvalues(multivalues(), 4, 5, 6)    --> "1,4,5,6,nil,nil"
printvalues(multivalues()..., 4, 5, 6) --> "1,2,3,4,5,6"

-- Array constructors
ar = {multivalues(), 4, 5, 6}     --> {1,4,5,6}
ar2 = {multivalues()..., 4, 5, 6} --> {1,2,3,4,5,6}

-- RHS of variable assignments
p,q,r,s,t,u = multivalues(), 4, 5, 6    --> p=1, q=4, r=5, s=6, t=nil, u=nil
p,q,r,s,t,u = multivalues()..., 4, 5, 6 --> p=1, q=2, r=3, s=4, t=5,   u=6
-----
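
For comparison, the proposed behaviour can be approximated in unpatched Lua by packing each list into a table and unpacking the concatenation. This is only a sketch: `pack` and `expand` are hypothetical helper names, not part of the proposal or of any patch, and the `table.unpack or unpack` line is an assumption made so that it runs on Lua 5.1 as well as later versions.

```lua
-- Sketch only: emulates `multivalues()..., 4, 5, 6` in stock Lua.
-- `pack` and `expand` are hypothetical helpers, not part of the proposal.
local unpack = table.unpack or unpack  -- Lua 5.1 has a global unpack

local function multivalues()
    return 1, 2, 3
end

-- Capture all values, recording the count so that nils survive.
local function pack(...)
    return {n = select('#', ...), ...}
end

-- Concatenate any number of packed lists and return every value.
local function expand(...)
    local out, n = {}, 0
    for i = 1, select('#', ...) do
        local list = select(i, ...)
        for j = 1, list.n do
            n = n + 1
            out[n] = list[j]
        end
    end
    return unpack(out, 1, n)
end

print(expand(pack(multivalues()), pack(4, 5, 6)))  --> 1 2 3 4 5 6 (tab-separated)

local ar2 = {expand(pack(multivalues()), pack(4, 5, 6))}
print(#ar2)  --> 6
```

The extra tables and calls are exactly the overhead that a dedicated syntax would avoid.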

If '...' appears after an expression in any other context, I think it
should be ignored, rather than causing an error. This is mainly so
that people can use it on the last element of a list if they wish -
even though the full expansion would still happen without it.

When I originally suggested this a few people said that it looks odd,
particularly in the case of ... ... - I don't really have any argument
for that, it *is* a bit odd. But hopefully, not too odd to be given
consideration. To me, using ... in this way has a kind of symmetrical
elegance, as it already has a meaning of "accommodate a variable
number of values" - but in this case it would mean, accommodate a
variable number of *output* values, rather than input.

A previous idea with a similar intention was to change the meaning of
the semicolon inside an array constructor. See "Extend table
constructor syntax to allow multiple expansion of multireturn
functions" in the LuaPowerPatches page of the Lua Users' Wiki for a
patch that implements this. I think my proposal has two advantages
over it:
- It does not change the meaning of existing valid code.
- It would work not only for array elements in table constructors, but
also the parameters of a function call and the right-hand side of a
multiple variable assignment.

So, what do people think? Is it something you'd ever use? Is it just
too ugly and weird?

-Duncan

Re: ... as an "expand list" unary postfix operator

Pierre LeMoine
I've missed having this a number of times recently; I'd certainly like
it and use it.

> When I originally suggested this a few people said that it looks odd,
> particularly in the case of ... ... - I don't really have any argument
> for that, it *is* a bit odd.

If ... seems weird, it shouldn't be too hard to change it to something
else, right? Maybe a ~, as in ...~ or multivalues()~?

Re: ... as an "expand list" unary postfix operator

steve donovan
In reply to this post by Duncan Cross
On Tue, Aug 11, 2009 at 3:18 PM, Duncan Cross<[hidden email]> wrote:
> So, what do people think? Is it something you'd ever use? Is it just
> too ugly and weird?

I think one meaning for ... is enough ;)  Have you considered it as a
function, say 'explode'?  That would be more visually distinctive.
However, it ends up as yet another rule that people have to learn.

steve d.

Re: ... as an "expand list" unary postfix operator

Duncan Cross
On Wed, Aug 12, 2009 at 2:11 PM, steve donovan<[hidden email]> wrote:
> On Tue, Aug 11, 2009 at 3:18 PM, Duncan Cross<[hidden email]> wrote:
>> So, what do people think? Is it something you'd ever use? Is it just
>> too ugly and weird?
>
> I think one meaning for ... is enough ;)  Have you considered it as a
> function, say 'explode'?  That would be more visually distinctive.

You mean, have something that looks like a function call but is
actually not? So these would work:

printvalues(explode(multivalues()), 4, 5, 6)
ar2 = {explode(multivalues()), 4, 5, 6}
p,q,r,s,t,u = explode(multivalues()), 4, 5, 6

If not, and you mean an actual function that simulates it somehow, I'm
not sure how it would work.

> However, it ends up as yet another rule that people have to learn.

That's a pessimistic way of looking at it :) It'd be another "tool"
people can choose to learn about and use, probably if and when they
come across a problem that it would help with.

-Duncan

Re: ... as an "expand list" unary postfix operator

Peter Cawley
On Wed, Aug 12, 2009 at 2:30 PM, Duncan Cross<[hidden email]> wrote:

> On Wed, Aug 12, 2009 at 2:11 PM, steve donovan<[hidden email]> wrote:
>> On Tue, Aug 11, 2009 at 3:18 PM, Duncan Cross<[hidden email]> wrote:
>>> So, what do people think? Is it something you'd ever use? Is it just
>>> too ugly and weird?
>>
>> I think one meaning for ... is enough ;)  Have you considered it as a
>> function, say 'explode'?  That would be more visually distinctive.
>
> You mean, have something that looks like a function call but is
> actually not? So these would work:
>
If something looks like a function call, it should be a function call.
As this cannot be implemented as a function call, it should not look
like one.

Re: ... as an "expand list" unary postfix operator

Duncan Cross
On Wed, Aug 12, 2009 at 2:34 PM, Peter Cawley<[hidden email]> wrote:

> On Wed, Aug 12, 2009 at 2:30 PM, Duncan Cross<[hidden email]> wrote:
>> On Wed, Aug 12, 2009 at 2:11 PM, steve donovan<[hidden email]> wrote:
>>> On Tue, Aug 11, 2009 at 3:18 PM, Duncan Cross<[hidden email]> wrote:
>>>> So, what do people think? Is it something you'd ever use? Is it just
>>>> too ugly and weird?
>>>
>>> I think one meaning for ... is enough ;)  Have you considered it as a
>>> function, say 'explode'?  That would be more visually distinctive.
>>
>> You mean, have something that looks like a function call but is
>> actually not? So these would work:
>>
> If something looks like a function call, it should be a function call.
> As this cannot be implemented as a function call, it should not look
> like one.
>

I agree - I just wanted to fully understand what steve was suggesting
first; he might have meant a real function that does the same thing
somehow.

-Duncan

Re: ... as an "expand list" unary postfix operator

steve donovan
On Wed, Aug 12, 2009 at 3:39 PM, Duncan Cross<[hidden email]> wrote:
>> If something looks like a function call, it should be a function call.
>> As this cannot be implemented as a function call, it should not look
>> like one.
>>
>
> I agree - I just wanted to fully understand what steve was suggesting
> first, he might have meant a real function that does the same thing
> somehow.

Fair enough, it is a pseudo-function.  I suppose I was thinking of t =
{explode(multiple()),10,20}; it is a pseudo-function because although
normally it means function(...) return ... end, it would then have
special semantics inside a table constructor.

Actually, the usual rule is confusing for beginners and experts alike
- that {multiple(),10} behaves differently from {10,multiple()}, and
that you have to use () for the second case to only insert the first
value returned.
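
The usual rule can be seen in unpatched Lua with a short example. The function name `multiple` follows the post above; the variable names are just for illustration.

```lua
local function multiple()
    return 1, 2, 3
end

local a = {multiple(), 10}    -- not last: truncated to one value -> {1, 10}
local b = {10, multiple()}    -- last: all values kept -> {10, 1, 2, 3}
local c = {10, (multiple())}  -- parenthesised: one value -> {10, 1}

print(#a, #b, #c)  --> 2 4 2 (tab-separated)
```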

steve d.

Re: ... as an "expand list" unary postfix operator

Doug Rogers
In reply to this post by steve donovan
steve donovan wrote:
> On Tue, Aug 11, 2009 at 3:18 PM, Duncan Cross<[hidden email]> wrote:
>> So, what do people think? Is it something you'd ever use? Is it just
>> too ugly and weird?
> I think one meaning for ... is enough ;)

Steve, I can certainly see the argument against multiple meanings for
the same tokens, though Lua (like most languages) has plenty of those
already. And the reasoning is solid for truncation of return values as a
default behavior in the contexts in which it currently happens. But the
fact that we get many surprised queries about this behavior indicates
that users expect to be able to capture all return values in the early
part of an array initialization, etc. While I don't see it as a big
problem, it's still something that I could see myself using occasionally.

And I think { get_items()..., "no more" } is a clever solution. Being an
ellipsis, the ... token has a relatively clear meaning in this context.
It would take me a while to grok the implications in the parser and VM.
But that's for the OP to do.

So I encourage you, Duncan, to try it out and share any results. Even if
it doesn't work out, you (and we) will learn from it.

Doug

PS - I've always liked the phrase "elegant over clever" when considering
an engineering/programming idea. I noticed that I used 'clever' above
for something that Duncan labeled 'elegant'. I'm on the fence about this
particular idea, but if its implementation is clean, I'd be happy to
swing my mind towards 'elegant'!

______________________________________________________________________________________
The information contained in this email transmission may contain proprietary and business
sensitive information.  If you are not the intended recipient, you are hereby notified that
any review, dissemination, distribution or duplication of this communication is strictly
prohibited.  Unauthorized interception of this e-mail is a violation of law.  If you are not
the intended recipient, please contact the sender by reply email and immediately destroy all
copies of the original message.

Any technical data and/or information provided with or in this email may be subject to U.S.
export controls law.  Export, diversion or disclosure contrary to U.S. law is prohibited.  
Such technical data or information is not to be exported from the U.S. or given to any foreign
person in the U.S. without prior written authorization of Elbit Systems of America and the
appropriate U.S. Government agency.

Re: ... as an "expand list" unary postfix operator

Doug Currie

On Aug 12, 2009, at 10:14 AM, Doug Rogers wrote:
>
> And I think { get_items()..., "no more" } is a clever solution.  
> Being an ellipsis, the ... token has a relatively clear meaning in  
> this context.

I agree. It looks clear to me, too. Being unsupported syntax now means  
there's little risk in adding it: existing code won't break.

> So I encourage you, Duncan, to try it out and share any results.  
> Even if it doesn't work out, you (and we) will learn from it.

Ditto.

e


Re: ... as an "expand list" unary postfix operator

Peter Cawley
On Wed, Aug 12, 2009 at 10:17 PM, Doug Currie<[hidden email]> wrote:

>
> On Aug 12, 2009, at 10:14 AM, Doug Rogers wrote:
>>
>> And I think { get_items()..., "no more" } is a clever solution. Being an
>> ellipsis, the ... token has a relatively clear meaning in this context.
>
> I agree. It looks clear to me, too. Being unsupported syntax now means
> there's little risk in adding it: existing code won't break.
>
>> So I encourage you, Duncan, to try it out and share any results. Even if
>> it doesn't work out, you (and we) will learn from it.
>
To get the ball rolling with some code, I've attached a diff which
allows "..." after an expression in an expression list and after a
list value in a table constructor, along with a file which has a few
tests / examples in it. My knowledge of the Lua source code is not
overly great, so the patch may be rather rough in places.

This patch is implemented in two parts. Firstly, there is the table
constructor part, which involves a major change to the SETLIST opcode.
Instead of the SETLIST specifying where to insert the values, each
Table structure remembers where SETLIST should insert values. This
conveniently does away with the SETLIST with c == 0 edge condition,
which can simplify the bytecode verifier. The downsides are an extra
int field in every table, and lack of bytecode compatibility.
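
As a rough mental model of this first part (not the patch's actual C code), the idea of a table that remembers its own insertion point can be sketched in plain Lua. The names `new_array`, `setlist`, and the `next` field are invented for this sketch; the real change lives in the VM and the Table structure.

```lua
-- Model only: a "table" that remembers where the next batch of values goes.
local function new_array()
    return {items = {}, next = 1}
end

-- Append a batch of values at the remembered position, then advance it.
local function setlist(arr, ...)
    for i = 1, select('#', ...) do
        arr.items[arr.next] = (select(i, ...))
        arr.next = arr.next + 1
    end
end

local function get_items()
    return 'a', 'b', 'c'
end

-- Models { get_items()..., "no more" }: each batch can have any length,
-- because the caller never has to state a starting index.
local arr = new_array()
setlist(arr, get_items())
setlist(arr, 'no more')
print(arr.next - 1, arr.items[4])  --> 4 and "no more" (tab-separated)
```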

The second part is for expanding expressions in expression lists. In
terms of pseudo code, this transforms expression lists like "e1, e2
..., e3, ..." into "e1, detuple(tuple(e2), e3, ...)", though tuple and
detuple are VM opcodes rather than functions. The TUPLE opcode takes 2
or more stack values, copies them to a new tuple data array area of a
lua_State, and replaces them with a single value which points to the
new position of the values in the tuple array. The DETUPLE opcode
expands zero or more tuples back into the stack.
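
The transformation can be mimicked with ordinary functions to see what the two opcodes are meant to achieve. In the patch TUPLE and DETUPLE are VM opcodes that work on the stack; the `tuple` and `detuple` functions and the `Tuple` marker below are stand-ins invented for this sketch, and it makes no claim about how the patch stores tuples.

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 has a global unpack

local Tuple = {}  -- marker metatable so detuple can recognise tuples

-- Stand-in for TUPLE: collapse any number of values into one value.
local function tuple(...)
    return setmetatable({n = select('#', ...), ...}, Tuple)
end

-- Stand-in for DETUPLE: expand every tuple argument, pass the rest through.
local function detuple(...)
    local out, n = {}, 0
    for i = 1, select('#', ...) do
        local v = select(i, ...)
        if getmetatable(v) == Tuple then
            for j = 1, v.n do
                n = n + 1
                out[n] = v[j]
            end
        else
            n = n + 1
            out[n] = v
        end
    end
    return unpack(out, 1, n)
end

local function e2()
    return 2, 3
end

-- Models the expression list `1, e2()..., 4, ...`
local function demo(...)
    return 1, detuple(tuple(e2()), 4, ...)
end

print(demo(5, 6))  --> 1 2 3 4 5 6 (tab-separated)
```

In this model an empty tuple expands to no values at all, and a one-element tuple expands to exactly one value.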

Attachments: expvar.diff (24K), tuples.lua (3K)

Re: ... as an "expand list" unary postfix operator

steve donovan
In reply to this post by Doug Currie
On Wed, Aug 12, 2009 at 11:17 PM, Doug Currie<[hidden email]> wrote:
>> So I encourage you, Duncan, to try it out and share any results. Even if
>> it doesn't work out, you (and we) will learn from it.
>
> Ditto.

Thinking about this, it is really a cool little feature, precisely
because there's no existing code that gets broken. We get new idioms,
like concatenating two lists {unpack(t1)...,unpack(t2)}, etc (although
like {unpack(t)} this is not recommended for big tables).

Now, would this also apply to function argument lists, which is the
other place where multiple returns are not discarded if at the end?

steve d.

Re: ... as an "expand list" unary postfix operator

Peter Cawley
On Thu, Aug 13, 2009 at 1:00 PM, steve donovan<[hidden email]> wrote:
> Now, would this also apply to function argument lists, which is the
> other place where multiple returns are not discarded if at the end?

According to the BNF description of Lua, the arguments in a function
call are an expression list, so IMO it should apply here too (and my
patch does so).

E:\CPP\2K8\lua-5.1.4-proto>Debug\lua-5.1.4-proto.exe
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
> t = {4, 5, 6}
> print(unpack(t) ..., 7)
4       5       6       7
> print(unpack{4, unpack{5, 6} ..., 7} ..., 8)
4       5       6       7       8

Also note that attached is a revised patch.

Attachments: expvar.diff (26K), tuples.lua (3K)

Re: ... as an "expand list" unary postfix operator

Duncan Cross
On Thu, Aug 13, 2009 at 1:19 PM, Peter Cawley<[hidden email]> wrote:
> On Thu, Aug 13, 2009 at 1:00 PM, steve donovan<[hidden email]> wrote:
>> Now, would this also apply to function argument lists, which is the
>> other place where multiple returns are not discarded if at the end?
>
> According to the BNF description of Lua, the arguments in a function
> call are an expression list, so IMO it should apply here too (and my
> patch does so).

It is part of my original proposal as well - I think it is an
important factor, that this should work consistently anywhere it makes
sense. In fact Peter's tuples.lua example makes me realise that in my
original post I'd actually missed out another case - in the list of
*return* values from a function, e.g.:

-----
function multivalues()
  return 1,2,3
end

function multivalues2()
  return multivalues()..., 4
end
-----
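
Without the patch, the same result needs a helper, because only the last expression in a return list expands. A minimal sketch, assuming a hypothetical `append` helper (not something from the proposal or the patch):

```lua
local unpack = table.unpack or unpack  -- Lua 5.1 has a global unpack

local function multivalues()
    return 1, 2, 3
end

-- Hypothetical helper: returns its trailing arguments followed by `last`.
local function append(last, ...)
    local n = select('#', ...)
    local t = {...}
    t[n + 1] = last
    return unpack(t, 1, n + 1)
end

local function multivalues2()
    -- multivalues() is the final argument here, so it expands in full
    return append(4, multivalues())
end

print(multivalues2())  --> 1 2 3 4 (tab-separated)
```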

Thank you for creating your patch so quickly, Peter, it is much
appreciated - I was still in the early stages with mine. I've tried it
out here and it works with everything I can think to throw at it. I
can see it's not a trivial change - it does require new opcodes, etc.
- but it doesn't look too drastic to me.

-Duncan

Re: ... as an "expand list" unary postfix operator

Duncan Cross
In reply to this post by Peter Cawley
On Thu, Aug 13, 2009 at 1:19 PM, Peter Cawley<[hidden email]> wrote:

> On Thu, Aug 13, 2009 at 1:00 PM, steve donovan<[hidden email]> wrote:
>> Now, would this also apply to function argument lists, which is the
>> other place where multiple returns are not discarded if at the end?
>
> According to the BNF description of Lua, the arguments in a function
> call are an expression list, so IMO it should apply here too (and my
> patch does so).
>
> E:\CPP\2K8\lua-5.1.4-proto>Debug\lua-5.1.4-proto.exe
> Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
>> t = {4, 5, 6}
>> print(unpack(t) ..., 7)
> 4       5       6       7
>> print(unpack{4, unpack{5, 6} ..., 7} ..., 8)
> 4       5       6       7       8
>
> Also note that attached is a revised patch.
>

Thanks again for this Peter, I've experimented with it a bit more and
found some bugs - both relating to trying the "Return duplicated
arguments" test with different numbers of arguments:
+ Zero arguments ends up with a single "nil" value (it looks like
detuple is expanding an empty list to one nil value?)
+ One argument ends up with some weird value that Lua represents as ":
0000000" instead (a tuple that never got detupled?)

I have only tried the revised patch, so I don't know if these are
present in the original one as well.

Cheers,
-Duncan

Re: ... as an "expand list" unary postfix operator

Peter Cawley
On Thu, Aug 13, 2009 at 4:21 PM, Duncan Cross<[hidden email]> wrote:
>
> Thanks again for this Peter, I've experimented with it a bit more and
> found some bugs - both relating to trying the "Return duplicated
> arguments" test with different numbers of arguments:
> + Zero arguments ends up with a single "nil" value (it looks like
> detuple is expanding an empty list to one nil value?)
> + One argument ends up with some weird value that Lua represents as ":
> 0000000" instead (a tuple that never got detupled?)

Should now be fixed.

Attachments: expvar.diff (27K), tuples.lua (4K)

Re: ... as an "expand list" unary postfix operator

steve donovan
On Thu, Aug 13, 2009 at 6:09 PM, Peter Cawley<[hidden email]> wrote:
> Should now be fixed.

The patch applied fine and built with the MS cl compiler. None of my code
has fallen over so far.

The {unpack(t1)...,unpack(t2)} idiom is a good deal faster than the
manual method of creating a table which is the concatenation of t1 and
t2.

For t1 and t2 both having 3 elements, this method takes 1.6 sec vs 3.5 sec
for the manual loop.

For t1 and t2 both having 10 elements, we get 2.4 vs 7.2 sec.

It's possible that someone might be able to squeeze a little extra bit
of speed out of the normal code (pulling in an explicit table size
setter would probably help) so here is the unscientific code behind
the numbers:

steve d.

-----
N = 1e6

function concatn1 (t1,t2)
    for K = 1,N do
        local t = {}
        for i = 1,#t1 do
            t[i] = t1[i]
        end
        local k = #t1 + 1
        for i = 1,#t2 do
            t[k] = t2[i]
            k = k + 1
        end
    end
end

function concatn2 (t1,t2)
    for K = 1,N do
        local t = {unpack(t1)...,unpack(t2)}
    end
end


t1 = {10,20,30,1,2,3,11,12,13,14}
t2 = {40,50,60,4,5,6,41,51,61,62}

t = os.clock()
concatn1(t1,t2)
print('elapsed',os.clock()-t)

t = os.clock()
concatn2(t1,t2)
print('elapsed',os.clock()-t)
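
The manual side of this comparison runs on an unpatched interpreter. A variant that returns the result and times one function at a time might look like the following; `concat` and `timed` are names made up for this sketch, the repetition count is arbitrary, and no timings are claimed for it.

```lua
-- Manual concatenation in stock Lua: copy t1, then t2 after it.
local function concat(t1, t2)
    local t, n1 = {}, #t1
    for i = 1, n1 do
        t[i] = t1[i]
    end
    for i = 1, #t2 do
        t[n1 + i] = t2[i]
    end
    return t
end

-- Run f(...) `reps` times and return the elapsed CPU time in seconds.
local function timed(reps, f, ...)
    local start = os.clock()
    for _ = 1, reps do
        f(...)
    end
    return os.clock() - start
end

local t1 = {10, 20, 30}
local t2 = {40, 50, 60}
print('elapsed', timed(1e5, concat, t1, t2))
```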

Re: ... as an "expand list" unary postfix operator

Doug Rogers
In reply to this post by Peter Cawley
Peter Cawley wrote:
> On Thu, Aug 13, 2009 at 4:21 PM, Duncan Cross<[hidden email]> wrote:
>  
>> Thanks again for this Peter...
> Should now be fixed.
>  

Great stuff, Peter, and clearly a lot of work. Thank you!

Your replacement of SETLIST by TUPLE and DETUPLE was a nice insight into
how to achieve the goal. And as I scanned the diff, I noticed quite a
few places where there were more leading '-' than '+' characters; that's
always a good sign that you're on the right track. Of course the final
product contained more '+' lines, especially when it came time to tweak
the VM.

FYI, I put it through the tests at:

  http://www.inf.puc-rio.br/~roberto/lua/lua5.1-tests.tar.gz

It passed completely.

I had a little trouble with the expected strings for the interpreter
tests (-i) in main.lua. Running "lua -i < prepfile > outfile" included
echoes of the text from the prepfile, just as if they were typed by
hand. I played around with how I was launching the test (shell prompt,
shell script, emacs compile) but couldn't make it go away. I'm sure
there's some tty setting that I need to make for it to work (I'm using
Ubuntu 8.04). Any tips for this?

Since the code was clearly working and it seemed like there were only a
few '-i' tests, I inspected each false negative by hand and commented
out the checkout() calls one at a time.

Props to you, Peter.

Doug



Re: ... as an "expand list" unary postfix operator

Roberto Ierusalimschy
> I had a little trouble with the expected strings for the interpreter  
> tests (-i) in main.lua. Running "lua -i < prepfile > outfile" included  
> echoes of the text from the prepfile, just as if they were typed by  
> hand. I played around with how I was launching the test (shell prompt,  
> shell script, emacs compile) but couldn't make it go away. I'm sure  
> there's some tty setting that I need to make for it to work (I'm using  
> Ubuntu 8.04). Any tips for this?

Might it have something to do with the readline library? IIRC, I had
this problem before and it was related to that library. (Something
like it echoes only if you define LUA_USE_READLINE, or vice-versa...)

-- Roberto

Re: ... as an "expand list" unary postfix operator

Peter Cawley
In reply to this post by Doug Rogers
On Thu, Aug 13, 2009 at 8:08 PM, Doug Rogers<[hidden email]> wrote:
> Your replacement of SETLIST by TUPLE and DETUPLE was a nice insight into how
> to achieve the goal. And as I scanned the diff, I noticed quite a few places
> where there were more leading '-' than '+' characters; that's always a good
> sign that you're on the right track. Of course the final product contained
> more '+' lines, especially when it came time to tweak the VM.

SETLIST is totally separate from DE/TUPLE. The changing of SETLIST is
the reason for most of the removed lines; the new SETLIST simplifies
the bytecode format, and allows simple and efficient expansion of
expressions in table constructors, with the cost being increased
memory usage from each table being increased by sizeof(int). DE/TUPLE
are the reason for most of the added lines, and are for the very
separate issue of expanding expressions within an expression list. The
TUPLE instruction complicates writing a strong bytecode verifier, as
such a verifier would need to guarantee that a tuple isn't used for
anything other than a DETUPLE instruction. As the current verifier is
considered broken and is probably being removed in 5.2, I don't
consider breaking the bytecode verifier further to be too big of a
crime.

Re: ... as an "expand list" unary postfix operator

Doug Rogers
In reply to this post by Roberto Ierusalimschy
Roberto Ierusalimschy wrote:
>> ... Running "lua -i < prepfile > outfile" included  
>> echoes of the text from the prepfile, ... Any tips for this?
> May it have something to do with the readline library?

Good call. Recompiling with 'make ansi' fixed it.

I've just spent too many minutes searching the web for a way to make
GNU's Readline library disable echoing. I thought there might be an
environment variable or something I could put in my .inputrc that would
allow readline to detect whether its output is a tty. No luck so far.

Thanks for the answer,

Doug

