is this behavior correct?


Xavier Wang
local s = "abcdef"
print(string.match(s, "(^abc)(def$)"))

shows "nil"

but:
local s = "abcdef"
print(string.match(s, "^(abc)(def)$"))

shows "abc def"

is this behavior correct?

Re: is this behavior correct?

Peter Cawley
On Mon, Jun 13, 2011 at 12:23 PM, Xavier Wang <[hidden email]> wrote:
> local s = "abcdef"
> print(string.match(s, "(^abc)(def$)"))
> shows "nil"
> but:
> local s = "abcdef"
> print(string.match(s, "^(abc)(def)$"))
> shows "abc def"
> is this behavior correct?

According to the manual, yes:

A '^' at the beginning of a pattern anchors the match at the beginning
of the subject string. A '$' at the end of a pattern anchors the match
at the end of the subject string. At other positions, '^' and '$' have
no special meaning and represent themselves.
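
A small Lua sketch of what the quoted manual passage implies for the two patterns in question. The subject string "^abcdef$" is made up for illustration; it only shows that the first pattern does match once the subject really contains a literal '^' and '$'.

```lua
-- Anchors written inside the captures: '^' and '$' are ordinary characters
-- here, because neither is the first or last character of the pattern.
local inner = "(^abc)(def$)"
local miss = string.match("abcdef", inner)        -- nil: subject has no literal '^'
local a1, b1 = string.match("^abcdef$", inner)    -- matches the literal '^' and '$'

-- Anchors at the very start and end of the pattern: these act as real anchors.
local a2, b2 = string.match("abcdef", "^(abc)(def)$")

print(miss, a1, b1, a2, b2)
```

So the captures in the first pattern include the '^' and '$' characters themselves, while the second pattern captures just "abc" and "def".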


Re: is this behavior correct?

steve donovan
On Mon, Jun 13, 2011 at 1:26 PM, Peter Cawley <[hidden email]> wrote:
> at the end of the subject string. At other positions, '^' and '$' have
> no special meaning and represent themselves.

Except inside sets like [^abc], which matches any character that is not 'a',
'b' or 'c'.  It's good practice to put % in front of ^ and $ whenever they
are meant literally rather than as 'magic' characters.

steve d.
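
A minimal sketch of the two points above: '^' negating a set, and '%' escaping '^' and '$' so they are matched literally. The subject strings are invented for illustration.

```lua
-- '^' as the first character of a set negates the set.
local not_abc = string.match("abcd", "[^abc]")      -- first char that is not a, b or c

-- '%' escapes a magic character so that it is matched literally.
local caret = string.match("2^10", "%d+%^%d+")      -- literal '^' between digits
local price = string.match("cost: $5", "%$(%d+)")   -- literal '$', digits captured

print(not_abc, caret, price)
```

Escaping with % is harmless even where the character would have been literal anyway, which is why it works as a blanket habit.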


Re: is this behavior correct?

eugeny gladkih
In reply to this post by Peter Cawley
On 13.06.2011 15:26, Peter Cawley wrote:

> On Mon, Jun 13, 2011 at 12:23 PM, Xavier Wang<[hidden email]>  wrote:
>> local s = "abcdef"
>> print(string.match(s, "(^abc)(def$)"))
>> shows "nil"
>> but:
>> local s = "abcdef"
>> print(string.match(s, "^(abc)(def)$"))
>> shows "abc def"
>> is this behavior correct?
>
> According to the manual, yes:
>
> A '^' at the beginning of a pattern anchors the match at the beginning
> of the subject string. A '$' at the end of a pattern anchors the match
> at the end of the subject string. At other positions, '^' and '$' have
> no special meaning and represent themselves.
>

that's not standard regexp interpretation.

[~]>perl -e "print 'abcdef' =~ /^(abc)(def)$/"
abcdef
 >perl -e "print 'abcdef' =~ /(^abc)(def$)/"
abcdef


Re: is this behavior correct?

Javier Guerra Giraldez
2011/6/13 john gladkih 599133195 <[hidden email]>:
> that's not standard regexp interpretation.

Lua patterns are not regexp

--
Javier


Re: is this behavior correct?

Dirk Laurie
On Mon, Jun 13, 2011 at 08:59:10PM +0200, Javier Guerra Giraldez wrote:
> 2011/6/13 john gladkih 599133195 <[hidden email]>:
> > that's not standard regexp interpretation.
>
> Lua patterns are not regexp
>
:-)))

Dirk

/ \(\*\|\*\*\)\([^\* ]\|[^\*]\( [^\*]\)\+\)\+\1/


Re: is this behavior correct?

Philippe Lhoste
In reply to this post by Javier Guerra Giraldez
On 13/06/2011 20:59, Javier Guerra Giraldez wrote:
> 2011/6/13 john gladkih 599133195<[hidden email]>:
>> that's not standard regexp interpretation.
>
> Lua patterns are not regexp

Certainly not a standard one, at least...
A standard RE implementation, even if not as complete as PCRE, would probably be as big as
the full Lua code itself (including its pattern implementation...). And an RE engine isn't hard
to plug in; you can find several implementations around.

PS.: Any opinion on the two RE implementations by Google? One is small, the other is quite
complete, no?

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --



Re: is this behavior correct?

Ben Kelly
On June 14, 2011 08:46:20 AM Philippe Lhoste wrote:
> On 13/06/2011 20:59, Javier Guerra Giraldez wrote:
> > 2011/6/13 john gladkih 599133195<[hidden email]>:
> >> that's not standard regexp interpretation.
> >
> > Lua patterns are not regexp
>
> Certainly not a standard one, at least...
>
Not *any* one. Lua patterns are missing some features required to act as
regular expressions (a grouping operator () that quantifiers can apply to, and
the alternation operator |, specifically), while having some additional
features (%b) that let them describe nonregular languages!

This results in the surprising and annoying situation that with Lua patterns,
you can describe some languages that you cannot describe with regular
expressions, but at the same time there are many regular languages that you
cannot describe with Lua patterns.
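
A rough Lua illustration of both halves of that claim. The helper match_any is hypothetical (not part of the string library) and is only one possible workaround for the missing alternation; it tries the patterns in list order rather than finding the leftmost match in the subject.

```lua
-- %b() matches a balanced pair of parentheses, something a true regular
-- expression cannot describe.
local balanced = string.match("f(a(b)c) tail", "%b()")

-- A quantifier after a capture does not repeat the group: the '+' that
-- follows ')' is treated as a literal character.
local repeated = string.match("abab", "^(ab)+$")
local literal  = string.match("ab+", "^(ab)+$")

-- No alternation operator: one workaround is to try several patterns in turn.
-- (match_any is a made-up helper for this sketch.)
local function match_any(s, patterns)
  for _, p in ipairs(patterns) do
    local m = string.match(s, p)
    if m then return m end
  end
  return nil
end

local pet = match_any("hot dog", { "cat", "dog" })

print(balanced, repeated, literal, pet)
```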

> A standard RE implementation, even if not as complete as PCRE, would
> probably be as big as full Lua code itself (including its pattern
> implementation...). And a RE engine isn't hard to plug in, you can find
> several implementations around.
>
Every time I hear this ("a full RE implementation would be as big as Lua
itself") it always seems to equate "full RE implementation" with PCRE, or
POSIX EREs, or something else that starts with regular expressions and then
adds a huge number of extra features. How big would something be that *just*
adds () and |?

        Ben Kelly



Re: is this behavior correct?

Dirk Laurie
On Sun, Jun 19, 2011 at 04:28:01PM +0200, Ben Kelly wrote:
> This results in the surprising and annoying situation that with lua patterns,
> you can describe some languages that you cannot describe with regular
> expressions, but at the same time there are many regular languages that you
> cannot describe with lua patterns.
It neither surprises nor annoys me that Lua has found elegant and
readable ways of allowing you to do what in other languages requires
methods that are neither.

Lua patterns give a quick-and-easy solution to 99% of the stuff a
casual user needs, and because the mistake of overworking the
backslash is avoided, one can read a Lua pattern instead of
laboriously having to decode it.

For those who need more power inside Lua, there is always LPEG, which
makes regular expressions look tired and obsolete.
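
For a taste of what that looks like, here is a hedged sketch that assumes the LPeg module is installed and loadable as "lpeg"; if it is not, the example simply does nothing. It redoes the thread's "abc"/"def" match and adds the alternation that Lua patterns lack. The variable names are made up for illustration.

```lua
-- Assumption: LPeg is installed. The pcall guard skips the example otherwise.
local ok, lpeg = pcall(require, "lpeg")
local first, second, pet

if ok then
  local P, C = lpeg.P, lpeg.C

  -- '*' is sequence; -P(1) succeeds only when no character is left,
  -- so it plays the role of the trailing '$' anchor.
  local two_parts = C(P"abc") * C(P"def") * -P(1)
  first, second = lpeg.match(two_parts, "abcdef")

  -- '+' is ordered choice, i.e. alternation.
  local animal = C(P"cat" + P"dog")
  pet = lpeg.match(animal, "dog")

  print(first, second, pet)
end
```

Note that lpeg.match is anchored at the start of the subject by default, so no equivalent of a leading '^' is needed.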

Regular expressions are like stout: great for those who've learned to
love it, but repugnant to others.  And there is always an Irish pub
around the corner called "Perl".

Dirk  


Re: is this behavior correct?

Philippe Lhoste
On 20/06/2011 11:46, Dirk Laurie wrote:
> Regular expressions are like stout: [...]

I am probably too much of a geek (and I don't like beer...), as I first thought you had made a typo and
intended to write stdout... (interpolated on the fly to "as a means to debug"!).

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --



Re: is this behavior correct?

steve donovan
On Mon, Jun 20, 2011 at 1:07 PM, Philippe Lhoste <[hidden email]> wrote:
> made a typo and intended to write stdout... (interpolated on the fly to "as
> a means to debug"!).

Full regexps are not a means to debug ;)

Despite my Irish roots, I'm not keen either on stout or regexps...

steve d.


Re: is this behavior correct?

David Kastrup
steve donovan <[hidden email]> writes:

> On Mon, Jun 20, 2011 at 1:07 PM, Philippe Lhoste <[hidden email]> wrote:
>> made a typo and intended to write stdout... (interpolated on the fly to "as
>> a means to debug"!).
>
> Full regexps are not a means to debug ;)
>
> Despite my Irish roots, I'm not keen either on stout or regexps...

St Patrick drove off the snakes from Ireland, and he probably would have
taken to regexps as well, had they been around at that time.

--
David Kastrup



Re: is this behavior correct?

Lorenzo Donati-2
On 20/06/2011 14.17, David Kastrup wrote:

> steve donovan<[hidden email]>  writes:
>
>> On Mon, Jun 20, 2011 at 1:07 PM, Philippe Lhoste<[hidden email]>  wrote:
>>> made a typo and intended to write stdout... (interpolated on the fly to "as
>>> a means to debug"!).
>>
>> Full regexps are not a means to debug ;)
>>
>> Despite my Irish roots, I'm not keen either on stout or regexps...
>
> St Patrick drove off the snakes from Ireland, and he probably would have
> taken to regexps as well, had they been around at that time.
>
Oh well, nowadays St. Patrick would probably have had a bad time, since
PCRE-like regexps are way too common (more than snakes, anyway - at
least in an urban context :-)


Re: is this behavior correct?

Lorenzo Donati-2
In reply to this post by Philippe Lhoste
On 20/06/2011 13.07, Philippe Lhoste wrote:
> On 20/06/2011 11:46, Dirk Laurie wrote:
>> Regular expressions are like stout: [...]
>
> I am probably too geek (and I don't like beer...), as I first thought
> you made a typo and intended to write stdout... (interpolated on the fly
> to "as a means to debug"!).
>

Although I like beer, I thought of a typo too! :-)


Re: is this behavior correct?

Lorenzo Donati-2
In reply to this post by Dirk Laurie
On 20/06/2011 11.46, Dirk Laurie wrote:

> On Sun, Jun 19, 2011 at 04:28:01PM +0200, Ben Kelly wrote:
>> This results in the surprising and annoying situation that with lua patterns,
>> you can describe some languages that you cannot describe with regular
>> expressions, but at the same time there are many regular languages that you
>> cannot describe with lua patterns.
> It neither surprises nor annoys me that Lua has found elegant and
> readable ways of allowing you to do what in other languages requires
> methods that are neither.
>
> Lua patterns give a quick-and-easy solution to 99% of the stuff a
> casual user needs, and the fact that the mistake of overworking the
> backslash is avoided, means that one can read a Lua pattern instead
> of laboriously having to decode it.
>
> For those who need more power inside Lua, there is always LPEG, which
> makes regular expressions look tired and obsolete.
>

Yes, I appreciate the expressive power of Lpeg (although I don't
understand its syntax very much). The problem is (but maybe it's just
me) its steep learning curve and the lack of tutorials.

Moreover, PCRE-like regexps are extremely common, so knowing how to use
them is likely to be very "reusable" knowledge (ranging from other
programming languages to utility programs), whereas Lpeg is a "Lua-only"
tool (so the effort to learn it must be really counterbalanced by the
benefits one can achieve).

Nevertheless, I admit regexps are somewhat ugly to write and *really*
ugly to read (especially when you didn't write them in the first place)!


> Regular expressions are like stout: great for those who've learned to
> love it, but repugnant to others.  And there is always an Irish pub
> around the corner called "Perl".
>
> Dirk
>
>
>



Re: is this behavior correct?

Philippe Lhoste
On 20/06/2011 15:15, Lorenzo Donati wrote:
> Moreover PCRE-like regexps are extremely common, so knowing how to use them is likely to
> be a very "reusable" knowledge (ranging from other programming languages to utility
> programs), whereas Lpeg is a "Lua-only" tool (so the effort to learn it must be really
> counterbalanced by the benefits one can achieve).

Note that the concept itself, i.e. Peg, is implemented in several languages, although with
differing syntaxes, of course, depending on the (lack of) capability to have DSLs for example.

In Java (and Scala), for example, you have the Parboiled library.

So half of the effort is to learn the concept, and the other half is to learn the given
implementation, but the first half is reusable... :-)

Besides, you can do with Pegs everything you can do in regexes (with a slightly more
verbose syntax, which can be seen as an advantage or not...), but also much more! So it is
worth the investment in time.

--
Philippe Lhoste
--  (near) Paris -- France
--  http://Phi.Lho.free.fr
--  --  --  --  --  --  --  --  --  --  --  --  --  --



Re: is this behavior correct?

Lorenzo Donati-2
On 20/06/2011 15.24, Philippe Lhoste wrote:

> On 20/06/2011 15:15, Lorenzo Donati wrote:
>> Moreover PCRE-like regexps are extremely common, so knowing how to use
>> them is likely to
>> be a very "reusable" knowledge (ranging from other programming
>> languages to utility
>> programs), whereas Lpeg is a "Lua-only" tool (so the effort to learn
>> it must be really
>> counterbalanced by the benefits one can achieve).
>
> Note that the concept itself, ie. Peg, is implemented in several
> languages, although with differing syntaxes, of course, depending on the
> (lack of) capability to have DSLs for example.
>
> In Java (and Scala), for example, you have the Parboiled library.
>
> So half of the effort is to learn the concept, and the other half is to
> learn the given implementation, but the first half is reusable... :-)
>
> Beside, you can do with Pegs everything you can do in regexes (with a
> slightly more verbose syntax, which can be seen as an advantage or
> not...), but also much more! So it is worth the investment in time.
>
Oh yes, I'm aware of that, thanks!

That's why I find it unfortunate that I lack the time to learn
them right now.