Suggestion for new pattern item like %bxy


Tony Papadimitriou
I would like to suggest the addition of a new pattern item which will behave just like the %bxy one but will return the match without the xy part.
(This should be a simple, low overhead addition.)
 
For example, for
s = '1(2)3'
 
%b() now returns ‘(2)’, and the new proposed pattern item should return just ‘2’, i.e., the () removed, just as if :sub(2,-2) was applied to the result of the %bxy match.
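
A minimal sketch of the current behaviour and the workaround described above, using only the standard string library (the variable names are illustrative):

```lua
local s = '1(2)3'
local whole = s:match('%b()')    -- balanced match, delimiters included
local inner = whole:sub(2, -2)   -- drop the first and last character
print(whole, inner)              --> (2)     2
```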
 
Why?
 
I use %bxy a lot; it is very useful in many situations.  However, I find that practically all of the time (OK, maybe 99%) I’m only interested in what’s inside the delimiters.
So, for now, I always end up with a :sub(2,-2) at the end.
 
Why not use :sub(2,-2) then?
 
The problem [apart from flooding my code with a whole bunch of :sub(2,-2)] is that I also want to use this in a gsub (for example), and then it’s not as simple.
Yes, I know I can use a function() for gsub.
 
One example: Strip one level of matching parens with a gsub so that
 
‘1(2(3)4)5’ becomes ‘12(3)45’
 
This is a simple example and it can also be tackled with something like
 
s = s:gsub('%b()',function(s) return s:sub(2,-2) end)
but the pattern can be much more involved, to the point where it cannot easily be emulated in another way.
It’s also possible that in that same string I want to keep one match but not the other.
 
Example: ‘1(2)3(4)5’ should become ‘1(2)345’
 
The previous :gsub would no longer work, and a more involved solution is needed.  And it gets worse as the pattern gets more complex.
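
One possible way to handle this particular input with today's patterns is to capture both balanced groups and strip only the second one in a replacement function. This is a sketch tailored to the example string, assuming a kept group followed by a group to strip; the capture names are illustrative.

```lua
local s = '1(2)3(4)5'

-- keep:    first balanced group, returned unchanged
-- between: whatever separates the two groups
-- strip:   second balanced group, returned without its delimiters
local result = s:gsub('(%b())(.-)(%b())', function(keep, between, strip)
  return keep .. between .. strip:sub(2, -2)
end)

print(result)   --> 1(2)345
```

The extra captures and the function are the overhead the proposal is trying to avoid.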
 
Does anyone else see any merit (or problem, other than the usual minimalism objections) in this proposal?
 
Thanks.

Re: Suggestion for new pattern item like %bxy

Jerome Vuarand
2017-02-15 0:32 GMT+00:00 Tony Papadimitriou <[hidden email]>:
> I would like to suggest the addition of a new pattern item which will behave
> just like the %bxy one but will return the match without the xy part.
> (This should be a simple, low overhead addition.)

As a default response to that kind of suggestion, if it's that
simple and low overhead, you should definitely implement it yourself
and use that in your own projects. You can modify Lua, and if you want
to use prebuilt Lua binaries you can wrap string module functions in
your Lua code.
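
The wrapping approach could look roughly like the sketch below. The name `string.binner` and its parameters are hypothetical, chosen only for illustration; it covers plain matching and does not extend `gsub` patterns.

```lua
-- Hypothetical helper, not part of the standard string library.
-- `pair` holds the two delimiter characters, e.g. '()' or '[]'.
-- Returns the text inside the first balanced match, or nil if there is none.
function string.binner(s, pair, init)
  assert(#pair == 2, 'pair must be exactly two characters')
  local m = s:match('%b' .. pair, init)
  return m and m:sub(2, -2)
end

print(('1(2)3'):binner('()'))     --> 2
print(('a[b[c]]'):binner('[]'))   --> b[c]
```

Because string values index the `string` table through their metatable, the helper is available with method syntax.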

> [...]
>
> Does anyone else see any merit (or problem, other than the usual minimalism
> objections) in this proposal?

There's merit, I too use the inside more often than the whole.
Currently there is only one capturing pattern, and there isn't a
single pattern that does matching and capture at the same time, which
is what you're proposing to introduce. So it's a whole new class of
patterns that should be added. It's usually not a big deal to go from
n to n+1, except when n is 0 or 1. I'm not saying it shouldn't be
done, but we probably need more than a couple users saying they'd like
that feature to offset the cost.


Re: Suggestion for new pattern item like %bxy

Dirk Laurie-2
In reply to this post by Tony Papadimitriou
2017-02-15 2:32 GMT+02:00 Tony Papadimitriou <[hidden email]>:

> I would like to suggest the addition of a new pattern item which will
> behave just like the %bxy one but will return the match without
> the xy part.
...
> Does anyone else see any merit (or problem, other than the usual
> minimalism objections) in this proposal?

Every pattern can be looked at in two ways.

Yang:  What does it provide under gmatch?
Yin: What does it leave behind under gsub?

Let's say your pattern will be called %B. The first question to ask is:
"Is there anything %B can do that %b cannot do, or vice versa?"
and one must look at it from both perspectives.

Under Yang it is a tie. It is just as easy to re-concatenate the delimiters
to the result of %B as it is to :sub them away from the result of %b.

Under Yin %b wins. It is easy to put two delimiters back in but you will
need to repeat the whole exercise in order to take empty balanced pairs
of them out.

I.e. if we have only one of %b and %B, then it must be %b. This kind
of reasoning may well have been part of the design of how %b operates.
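
The two perspectives can be illustrated with the existing %b item. This is only a sketch of how I read the argument; the variable names are illustrative.

```lua
local s = '1(2(3)4)5'

-- Yang: what gmatch provides. Removing the delimiters is a single :sub call.
local found = {}
for m in s:gmatch('%b()') do
  found[#found + 1] = m:sub(2, -2)
end
print(found[1])   --> 2(3)4

-- Yin: what gsub leaves behind. %b consumes the delimiters too, so the
-- whole balanced group disappears in one pass.
local rest = s:gsub('%b()', '')
print(rest)       --> 15
```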

Now to your exact question. It is specious to admit only problems "other
than the usual minimalism objections". It is like asking "Apart from your
usual rant that we only have a single garage and one income, would
it not be nice if our family had a second car?"

Minimalism objections cannot be brushed aside like that. Let me
summarize them.

1. It uses up syntax space.
2. It demands more of implementors of compatible libraries.
3. It makes the description of patterns a little longer and a little
harder to follow at first reading.

The question to ask, when considering the addition of a new feature,
is not "Is there any merit in it?" (of course there is, if one can exhibit
even one case where it saves effort) but "Does it offer enough to
offset the usual minimalist objections?". In other words, does it make
real progress or is it a case of feature creep?


Re: Suggestion for new pattern item like %bxy

Egor Skriptunoff-2

Dirk Laurie wrote:

> > I would like to suggest the addition of a new pattern item which will
> > behave just like the %bxy one but will return the match without
> > the xy part.
> ...
> > Does anyone else see any merit (or problem, other than the usual
> > minimalism objections) in this proposal?
>
> Every pattern can be looked at in two ways.
>
> Yang:  What does it provide under gmatch?
> Yin: What does it leave behind under gsub?
>
> Let's say your pattern will be called %B. The first question to ask is:
> "Is there anything %B can do that %b cannot do, or vice versa?"
> and one must look at it from both perspectives.
>
> Under Yang it is a tie. It is just as easy to re-concatenate the delimiters
> to the result of %B as it is to :sub them away from the result of %b.
>
> Under Yin %b wins. It is easy to put two delimiters back in but you will
> need to repeat the whole exercise in order to take empty balanced pairs
> of them out.
>
> I.e. if we have only one of %b and %B, then it must be %b.


I disagree.
I'd prefer %B to be implemented instead of %b.

The delimiters are hardcoded into the pattern, so the delimiters in the
result string do not carry any useful information.
Delimiters are usually part of the container, not part of the payload.

In real life you usually throw away all the packing materials
of the things you have bought, don't you?

In your reasoning about Yang and Yin you implicitly assume that
both options (whether the result should include delimiters or not) are
equally likely to be useful to the user.
You have completely forgotten that in most situations in practice
you spend your time clearing those delimiters away from the result.


Re: Suggestion for new pattern item like %bxy

Dirk Laurie-2
2017-02-15 12:35 GMT+02:00 Egor Skriptunoff <[hidden email]>:

> Delimiters are usually part of the container, not part of the payload.
>
> In real life you usually throw away all the packing materials
> of the things you have bought, don't you?
>
> In your reasoning about Yang and Yin you implicitly assume that
> both options (whether the result should include delimiters or not) are
> equally likely to be useful to the user.
> You have completely forgotten that in most situations in practice
> you spend your time clearing those delimiters away from the result.

Please don't call your own favourite use case "usually" and
"most situations in practice".

Mine is more typically a case where the source string is being parsed,
e.g. I wish to go from

y = a * ( b + c * ( d - e )) + f

to

t1 = d - e
t2 = b + c * t1
y = a * t2 + f
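
For illustration only, here is one way the transformation above could be sketched with standard patterns. It is not necessarily how Dirk does it: it repeatedly replaces an innermost parenthesised group with a fresh temporary, and the function name `linearize` is hypothetical.

```lua
-- Hypothetical helper: returns a list of lines, temporaries first.
local function linearize(line)
  local out, n = {}, 0
  local count
  repeat
    -- Match one innermost group: parentheses with no parentheses inside.
    line, count = line:gsub('%(%s*([^()]-)%s*%)', function(inner)
      n = n + 1
      out[#out + 1] = 't' .. n .. ' = ' .. inner
      return 't' .. n
    end, 1)
  until count == 0
  out[#out + 1] = line
  return out
end

local out = linearize('y = a * ( b + c * ( d - e )) + f')
print(table.concat(out, '\n'))
--> t1 = d - e
--> t2 = b + c * t1
--> y = a * t2 + f
```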


Re: Suggestion for new pattern item like %bxy

Roberto Ierusalimschy
> 2017-02-15 12:35 GMT+02:00 Egor Skriptunoff <[hidden email]>:
>
> > Delimiters are usually part of the container, not part of the payload.
> >
> > In real life you usually throw away all the packing materials
> > of the things you have bought, don't you?

As Jerome already pointed out, %b and %B are not on the same footing. You
guys are actually comparing '%B' with '(%b)'. Again repeating Jerome,
a '%B' would be a new class of beast that matches *and* captures,
something that does not currently exist in Lua.
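
A small sketch of what is available today: an explicit capture around %b returns the whole balanced match, and position captures can locate the inside without a pattern item that captures by itself. The variable names are illustrative.

```lua
local s = '1(2)3'

-- '(%b())' captures the balanced match, delimiters included.
local whole = s:match('(%b())')

-- '()' is a position capture: `from` is where the match starts and
-- `to` is the position just after it ends.
local from, to = s:match('()%b()()')
local inner = s:sub(from + 1, to - 2)

print(whole, inner)   --> (2)     2
```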

-- Roberto


Re: Suggestion for new pattern item like %bxy

Soni "They/Them" L.


On 15/02/17 12:47 PM, Roberto Ierusalimschy wrote:

>> 2017-02-15 12:35 GMT+02:00 Egor Skriptunoff <[hidden email]>:
>>
>>> Delimiters are usually part of the container, not part of the payload.
>>>
>>> In real life you usually throw away all the packing materials
>>> of the things you have bought, don't you?
> As Jerome already pointed out, %b and %B are not on the same footing. You
> guys are actually comparing '%B' with '(%b)'. Again repeating Jerome,
> a '%B' would be a new class of beast that matches *and* captures,
> something that does not currently exist in Lua.
>
> -- Roberto
>
I read this as "you should use LPeg".

To OP: You should use LPeg.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.



Re: Suggestion for new pattern item like %bxy

Luiz Henrique de Figueiredo
In reply to this post by Dirk Laurie-2
Dirk Laurie said:

> Mine is more typically a case where the source string is being parsed,
> e.g. I wish to go from
>
> y = a * ( b + c * ( d - e )) + f
>
> to
>
> t1 = d - e
> t2 = b + c * t1
> y = a * t2 + f

Here is a program that I've used to generate linear code for arithmetic
expressions. It is a recent incarnation of the one described in the
first journal paper on Lua <http://www.lua.org/spe.html> some 20 years ago!
Start reading at "Another unusual facility provided by fallbacks is the
reuse of Lua's parser".
--lhf

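-- Metatable shared by every symbolic variable; its arithmetic
-- metamethods are installed further down.
local MT={}
-- Maps variable names and expression strings to their symbolic variables.
local V={}
-- Counter used to name the temporaries t1, t2, ...
local N=0

-- Create a symbolic variable and publish it as a global, so that
-- ordinary Lua expressions can refer to it by name.
local function var(name)
 local t={name=name}
 V[name]=t
 _G[name]=t
 return setmetatable(t,MT)
end

-- Printable form of an operand: a variable's name, a number as is,
-- or 0 when the operand is missing.
local function S(a)
 if type(a)=="table" then return a.name else return a or 0 end
end

-- Handler for every arithmetic event: print "tN=op(a,b)" the first time
-- a subexpression is seen and reuse the same temporary afterwards.
local function arithfb(a,b,op)
 local i=op .. "(" .. S(a) .. "," .. S(b) .. ")"
 if V[i]==nil then N=N+1; V[i]=var("t"..N,N); print(V[i].name ..'='..i) end
 return V[i]
end

-- Install one metamethod per arithmetic event.
local t={"add", "sub", "mul", "div", "unm", "pow"}
for i,v in next,t do
 MT["__"..v]=function (a,b) return arithfb(a,b,v) end
end

-- Declare every alphanumeric word in s as a variable.
local function vars(s)
 for x in string.gmatch(s,"(%w+)") do var(x) end
end

vars"x,y"
-- Evaluating these expressions prints the linear code as a side effect.
return 2/3*x +(x^2-y^2)/(3*(x^2+y^2)), 2/3*y-2*(x*y)/(3*(x^2+y^2))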