builtin pattern matching

Jim
On 5/10/19, Sean Conner <[hidden email]> wrote:
> Well, MS-DOS and AmigaOS spring to mind.

I am pretty sure lots of Lua code runs on DOS and AmigaOS these days,
hence still supporting those is crucial indeed. :D

> I never liked regex. Ever. I find them exceedingly hard to
> understand as they devolve into line noise in my opinion.  Not only do I
> find them hard to read, but you can't combine them (unlike LPEG, which you
> can easily build up piecemeal and combine---very nice).

That may be true, but coders coming from Perl, Ruby (these two even make regex
literals part of the language syntax) et al. are used to pattern matching with
regex, so it would be nice if Lua provided some decent built-in form of regex
pattern matching (at least as a library; spending extra syntax on regex is not
necessary IMO, but maybe it is even possible via a metatable).

Maybe it would also be useful to just have built-in simple shell-style pattern
matching (*, ?, [...]) for simple patterns, though this can already be done
with Lua's patterns. It is frequently used for file path matching via glob()
and friends, and has to be provided by any POSIX-conforming libc.
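
To illustrate the remark that shell-style matching can already be expressed with Lua's patterns, here is a minimal sketch that translates a glob into an anchored Lua pattern. The helper name `glob_to_pattern` is hypothetical, and the translation is deliberately simplified: `*` and `?` are allowed to match `/`, backslash escapes are ignored, and bracket sets are assumed to be well formed.

```lua
-- Hypothetical helper: translate a shell-style glob into an anchored Lua pattern.
-- Assumptions of this sketch: '*' and '?' may match '/', backslash escapes are
-- not handled, and bracket sets are well formed (a leading '!' negates).
local function glob_to_pattern(glob)
  local out = { "^" }
  local i = 1
  while i <= #glob do
    local c = glob:sub(i, i)
    if c == "*" then
      out[#out + 1] = ".*"
    elseif c == "?" then
      out[#out + 1] = "."
    elseif c == "[" then
      local negated = glob:sub(i + 1, i + 1) == "!"
      local first = i + (negated and 2 or 1)        -- first character of the set body
      local close = glob:find("]", first + 1, true) -- plain search for the closing ']'
      if close then
        -- '%' is the escape character inside Lua sets, so double it.
        local body = glob:sub(first, close - 1):gsub("%%", "%%%%")
        out[#out + 1] = "[" .. (negated and "^" or "") .. body .. "]"
        i = close
      else
        out[#out + 1] = "%["                        -- unterminated set: literal '['
      end
    else
      -- Escape every non-alphanumeric character so it is matched literally.
      out[#out + 1] = (c:gsub("%W", "%%%0"))
    end
    i = i + 1
  end
  out[#out + 1] = "$"
  return table.concat(out)
end

print(glob_to_pattern("*.lua"))                          --> ^.*%.lua$
print(("main.lua"):match(glob_to_pattern("*.lua")))      --> main.lua
print(("main.luac"):match(glob_to_pattern("*.lua")))     --> nil
```

A real replacement for glob() would also have to walk directories and treat path separators specially; this sketch covers only the string-matching part.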

All the user asked for back then was the addition of an OR operator '|' to Lua's
built-in patterns (which saves one from writing an if-then-else chain over all
possible pattern alternatives).
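
For reference, the if-then-else chain mentioned above can be folded into a small loop today. The following is only a sketch; `match_any` is a hypothetical helper, not part of Lua's string library, and it offers alternation between whole patterns only, not inside a larger pattern.

```lua
local unpack = table.unpack or unpack  -- Lua 5.2+ or Lua 5.1

-- Hypothetical helper: try each pattern in order and return the captures
-- (or the whole match) of the first one that succeeds, else nil.
local function match_any(s, patterns)
  for _, p in ipairs(patterns) do
    local caps = { s:match(p) }
    if caps[1] ~= nil then
      return unpack(caps)
    end
  end
  return nil
end

local size_patterns = { "^(%d+)%s*kB$", "^(%d+)%s*KiB$" }
print(match_any("512 KiB", size_patterns))  --> 512
print(match_any("512 MB", size_patterns))   --> nil
```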

Re: builtin pattern matching

Dirk Laurie-2
On Sat, 11 May 2019 at 10:32, Jim <[hidden email]> wrote:

> All the user asked for back then was the addition of an OR operator '|' to Lua's
> built-in patterns (which saves one from writing an if-then-else chain over all
> possible pattern alternatives).

It's not as simple as that. The built-in string library matches
character classes, not strings. At each point, a decision is made
whether the current character matches the current pattern element. The
[set] and [^set] character classes, which are the closest we get to an
OR, each match one character.

The entire pattern-matching engine would have to be redesigned to
provide an OR for subpatterns of length greater than 1.
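
A short sketch of what this means in practice with the stock string library (the variable names are only for illustration): a set chooses between single characters, `|` is an ordinary character, and a quantifier after a capture does not repeat the group.

```lua
-- A set picks one character out of several alternatives.
local set_match = ("cat"):match("^[bc]at$")

-- '|' has no special meaning in Lua patterns; it is matched literally.
local bar_as_or = ("foo"):match("^(foo|bar)$")
local bar_literal = ("foo|bar"):match("^(foo|bar)$")

-- Quantifiers apply to single-character classes only, so the '+' after the
-- capture is a literal '+', not "one or more repetitions of (ab)".
local group_repeat = ("abab"):match("^(ab)+$")

print(set_match, bar_as_or, bar_literal, group_repeat)
--> cat   nil   foo|bar   nil
```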

Re: builtin pattern matching

Sam Putman
In reply to this post by Jim


>
> that may be true, but coders used to Perl, Ruby (these 2 even provide regex

What’s fantastic about these languages, and Python, Java, dozens of others:

They exist and you can use them!

Lua has lpeg, which is superior in a variety of ways.

If you want to get good at Lua programming, I heartily suggest learning it.

cheers,
-Sam.
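
For readers who have not used LPeg, here is a minimal sketch of how it expresses alternation and builds patterns up piecemeal. It assumes the third-party `lpeg` module is installed (it is not part of the standard Lua distribution); the pattern names are made up for this example.

```lua
-- Assumption: the third-party LPeg module is available; skip the demo otherwise.
local have_lpeg, lpeg = pcall(require, "lpeg")
local hit, miss = {}, {}

if have_lpeg then
  local P, R, C = lpeg.P, lpeg.R, lpeg.C

  -- Small patterns are ordinary values that can be combined.
  local keyword = P("foo") + P("bar")   -- '+' is ordered choice: "foo" OR "bar"
  local digits  = R("09")^1             -- one or more decimal digits
  -- '*' is sequence; -P(1) succeeds only at the end of the subject.
  local item    = C(keyword) * P("-") * C(digits) * -P(1)

  hit  = { item:match("bar-42") }       -- two captures: "bar", "42"
  miss = { item:match("baz-42") }       -- no match: a single nil
  print(hit[1], hit[2], miss[1])
else
  print("lpeg is not installed; skipping the demo")
end
```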

Re: builtin pattern matching

Jim
In reply to this post by Dirk Laurie-2
On 5/11/19, Dirk Laurie <[hidden email]> wrote:
> It's not as simple as that. The built-in string library matches
> character classes, not strings. At each point, a decision is made
> whether the current character matches the current pattern element. The
> [set] and [^set] character classes, which are the closest we get to an
> OR, each match one character.
>
> The entire pattern-matching engine would have to be redesigned to
> provide an OR for subpatterns of length greater than 1.

What about having a look at Squirrel's pattern matching implementation?

https://raw.githubusercontent.com/albertodemichelis/squirrel/master/sqstdlib/sqstdrex.cpp

It is written in C++, although in a procedural style that should not be too hard
to translate back to C. It seems to work for Unicode as well.
