LPEG design patterns


joy mondal
Hi,

In LPEG, you can pass a variable to your parser using Carg.

But if you build your grammar dynamically for each string or file, you could pre-create the state for each of your Cmt functions.

- Option 1: a stateless grammar that receives its state through Carg (of somewhat dubious clarity)

- Option 2: a separate grammar for each string, which is likely slower?
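For comparison, here is a minimal sketch of the two designs side by side. The word-counting task and every name in it (word, count_words_shared, count_words_fresh) are assumptions made for illustration; none of this code comes from the thread.

```lua
local lpeg = require "lpeg"
local P, R, C, Carg, Cmt = lpeg.P, lpeg.R, lpeg.C, lpeg.Carg, lpeg.Cmt

local word = C(R("az")^1)

-- Design 1: the grammar is built once and holds no state of its own.
-- The state table arrives as the first extra argument to match().
local counted = Cmt(word * Carg(1), function(subject, pos, w, state)
  state.count = state.count + 1
  return pos                       -- succeed without consuming more input
end)
local shared_grammar = (counted + P(1))^0

local function count_words_shared(text)
  local state = { count = 0 }
  shared_grammar:match(text, 1, state)  -- extra arguments follow the init position
  return state.count
end

-- Design 2: a new grammar is built for every input, so the match-time
-- function can keep its state in an upvalue instead of using Carg.
local function count_words_fresh(text)
  local count = 0
  local counted_here = Cmt(word, function(subject, pos, w)
    count = count + 1
    return pos
  end)
  local grammar = (counted_here + P(1))^0   -- rebuilt on every call
  grammar:match(text)
  return count
end
```

Both functions return the same count for the same input; the difference is only where the state lives and how often the pattern is constructed.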

Question 1: is option 2 an anti-pattern? Do we stick to option 1?

Another issue I have with Carg is that it ends up all over your rules:

local patt1 = P(Carg(1)* .................)

local patt2 = P(Carg(1)* .................)

I understand that it's possible to pass multiple arguments, one per piece of state:

local patt1 = P(Carg(1)* .................)

local patt2 = P(Carg(2)* .................)

local patt3 = P(Carg(3)* .................)

But it doesn't make things any clearer, and now you need to remember which number corresponds to which variable.

My question is one of confidence and design: how large does your parser have to get before this way of arranging things stops making sense?

best wishes,

Joy

Re: LPEG design patterns

William Ahern
On Wed, May 01, 2019 at 10:37:35PM +0400, joy mondal wrote:

>  Hi,
>
> In LPEG , you can pass a variable to your parser using Carg.
>
> BUT if you build your grammar dynamically for EACH string / file you could
> pre-create state for each of your Cmt functions.
>
> - One design is stateless ( which has some dubious clarity )
>
> - One design creates a separate grammar for each string , which is likely
> slower ?

I also assume it would be slower. But if you're facing a dilemma it seems
like it'd be an easy thing to verify experimentally.
 

> Question 1 :: is option 2 an anti-pattern ? do we stick to option 1 ?
>
> Another issue I feel you have with using Carg is that all your rules have
> Carg  all over the place.
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(1)* .................)
>
> I understand that it's possible to pass multiple arguments with multiple
> states :
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(2)* .................)
>
> local patt3 = P(Carg(3)* .................)
>
> But it doesn't make things any clearer, and now you need to remember which
> number corresponds to which variable.

You could also just pass a table with named fields. Positional arguments are
merely an obvious choice from an implementation perspective--function
parameters in Lua are positional and lpeg.match is a function. Positional
argument captures aren't a prescription for grammar composition, just an
artifact of the language.
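
A minimal sketch of the named-fields approach, assuming a toy "$name" macro expander. The field names (macros, used) and the pattern names are hypothetical, chosen only for this example.

```lua
local lpeg = require "lpeg"
local P, R, C, Carg, Cmt, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Carg, lpeg.Cmt, lpeg.Ct

-- The only extra argument is a table; rules pick fields from it by name.
local STATE = Carg(1)

local name = C(R("az", "AZ")^1)

-- Matches "$name" only when name is a key of state.macros, and produces
-- the expansion as its capture. Unknown names make the rule fail.
local macro = Cmt(P"$" * name * STATE, function(subject, pos, id, state)
  local expansion = state.macros[id]
  if expansion == nil then return false end
  state.used[id] = true
  return pos, expansion
end)

-- Collect expansions and untouched characters, then join them.
local expand = Ct((macro + C(P(1)))^0) / table.concat

local state = { macros = { user = "joy", lang = "Lua" }, used = {} }
local out = expand:match("hi $user, welcome to $lang; $unknown stays", 1, state)
```

Adding another piece of state later means adding a field to the table, not renumbering Carg indices across the grammar.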


Re: LPEG design patterns

Sean Conner
It was thus said that the Great joy mondal once stated:

>  Hi,
>
> In LPEG , you can pass a variable to your parser using Carg.
>
> BUT if you build your grammar dynamically for EACH string / file you could
> pre-create state for each of your Cmt functions.
>
> - One design is stateless ( which has some dubious clarity )
>
> - One design creates a separate grammar for each string , which is likely
> slower ?

  You missed one---you could always use global variables and avoid Carg() or
building a separate grammar entirely.

> Question 1 :: is option 2 an anti-pattern ? do we stick to option 1 ?

  *I* think so, but that's me.  I can't speak for others.

> Another issue I feel you have with using Carg is that all your rules have
> Carg  all over the place.
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(1)* .................)

  It depends upon what you are doing.  I just finished a formatting program
(implements a form of OrgMode for my own blogging needs [1]) and I counted
only 15 instances of Carg() in 93 (if I counted correctly) named LPEG rules,
half of them in one rule:

local style = Cmt(P"//" * Carg(1),stack "i")
            + Cmt(P"/"  * Carg(1),stack "em")
            + Cmt(P"**" * Carg(1),stack "b")
            + Cmt(P"*"  * Carg(1),stack "strong")
            + Cmt(P"+"  * Carg(1),stack "del")
            + Cmt(P"="  * Carg(1),stack "code")
            + Cmt(P"~~" * Carg(1),stack "tt")
            + Cmt(P"~"  * Carg(1),stack "kbd")
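
The stack helper itself is not shown in the post. The sketch below is a guess at how such a helper could work, assuming the state table has a stack field and that each marker toggles between an opening and a closing tag. It is not the implementation in format.lua, and the line pattern and the output format are likewise assumptions.

```lua
local lpeg = require "lpeg"
local P, C, Carg, Cmt, Ct = lpeg.P, lpeg.C, lpeg.Carg, lpeg.Cmt, lpeg.Ct

-- HYPOTHETICAL helper: returns a match-time function for one tag.
-- If the tag is on top of state.stack it is closed, otherwise opened.
local function stack(tag)
  return function(subject, pos, state)
    local open = state.stack
    if open[#open] == tag then
      open[#open] = nil
      return pos, "</" .. tag .. ">"
    end
    open[#open + 1] = tag
    return pos, "<" .. tag .. ">"
  end
end

-- A subset of the rule from the post; longer markers must come first
-- because the choice is ordered.
local style = Cmt(P"//" * Carg(1), stack "i")
            + Cmt(P"/"  * Carg(1), stack "em")
            + Cmt(P"**" * Carg(1), stack "b")
            + Cmt(P"*"  * Carg(1), stack "strong")

local line = Ct((style + C(P(1)))^0) / table.concat

local html = line:match("a *bold* and //italic// word", 1, { stack = {} })
```

With this hypothetical helper, html is "a <strong>bold</strong> and <i>italic</i> word". Each call to match gets its own state table, so nothing leaks between inputs.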

> I understand that it's possible to pass multiple arguments with multiple
> states :
>
> local patt1 = P(Carg(1)* .................)
>
> local patt2 = P(Carg(2)* .................)
>
> local patt3 = P(Carg(3)* .................)
>
> But it doesn't make things any clearer, and now you need to remember which
> number corresponds to which variable.

  Easy enough to solve:

        local MACROS = 1
        local STACK  = 2
        local QUOTE  = 3

        local patt1 = P(Carg(MACROS) * ... )
        local patt2 = P(Carg(STACK)  * ... )
        local patt3 = P(Carg(QUOTE)  * ... )
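
A runnable sketch of the named-index idea, reduced to two extra arguments. The rules (macro, quoted, doc) and the shape of the two argument tables are assumptions for illustration only.

```lua
local lpeg = require "lpeg"
local P, R, C, Carg, Ct = lpeg.P, lpeg.R, lpeg.C, lpeg.Carg, lpeg.Ct

-- Names for the positions of the extra arguments given to match().
local MACROS = 1
local QUOTE  = 2

local name = C(R("az")^1)

-- "$name" is replaced from the first extra argument; unknown names are kept.
local macro = P"$" * name * Carg(MACROS) / function(id, macros)
  return macros[id] or ("$" .. id)
end

-- A double-quoted run is wrapped with the delimiters in the second argument.
local quoted = P'"' * C((1 - P'"')^0) * P'"' * Carg(QUOTE) / function(text, q)
  return q.open .. text .. q.close
end

local doc = Ct((macro + quoted + C(P(1)))^0) / table.concat

local out = doc:match('$who said "hi"', 1,
                      { who = "Sean" },                  -- MACROS
                      { open = "<q>", close = "</q>" })  -- QUOTE
```

The constants document intent inside the grammar, but the caller still has to pass the arguments in the matching order.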

Or pass a table---that's what I'm doing:

        local state =
        {
          email_all = false,
          stack     = {},
          quote     = {},
          abbr      = {},
        }
       
> My question is that of confidence and design - how large does your parser
> have to get before this method of arranging stops making sense ?

  Hard to say.  The parser I wrote is over 700 lines long and the method
doesn't seem all that onerous to me.  I prefer the Carg() method over the
others because 1) there's no global data to mess up and 2) I only pay the
compilation overhead once.

  -spc (You will not believe the number of times I got the "body may accept
        empty string" error while writing this program ... )

[1] https://github.com/spc476/mod_blog/blob/master/Lua/format.lua

        LPEG code starts at line 150.

        A sample input file is here:

        https://github.com/spc476/mod_blog/blob/master/NOTES/testmsg


Re: LPEG design patterns

Roberto Ierusalimschy
>   You missed one---you could always use global variables and avoid Carg() or
> building a separate grammar entirely.

You can also use closures, avoiding global variables. The functions
in the pattern use external local variables, and some helper
functions allow you to change the values of these variables.

do
  local X

  function helperX (newx) X = newx end

  -- your pattern goes here, using 'X'
  ...
end
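
A filled-in version of this outline might look like the sketch below. The names set_limit and number, and the "number no larger than a limit" rule, are assumptions made for the example.

```lua
local lpeg = require "lpeg"
local R, C, Cmt = lpeg.R, lpeg.C, lpeg.Cmt

local set_limit, number
do
  local limit = 0            -- plays the role of 'X'; visible only in this block

  -- helper that changes the hidden variable from outside the block
  function set_limit(n) limit = n end

  -- matches a run of digits only if its value does not exceed 'limit'
  number = Cmt(C(R"09"^1), function(subject, pos, digits)
    local n = tonumber(digits)
    if n <= limit then return pos, n end
    return false
  end)
end

set_limit(100)
local small = number:match("42")    -- within the limit
local large = number:match("420")   -- exceeds the limit, so the match fails
```

The pattern is compiled once and needs no Carg, but the hidden variable is shared by every use of the pattern, so it has to be set before each match that depends on it.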

-- Roberto


Re: LPEG design patterns

joy mondal
Hi,

Thanks William, Sean and Roberto.

I was worried about overusing just one variable (Carg(1)) to hold all state information, without moving to a more object-oriented design using setmetatable.

Seems like nobody does that.

For example :

local Patt = P('ABC')

Cmt (Patt,parent.ABC)

where parent is an object which holds state information.

Rather than what we do today:

local Patt = Carg(1) * P('ABC')

Cmt (Patt,ABC) -- ABC is stateless
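
One way the object-oriented variant could be sketched is shown below, assuming a hypothetical Parser class. Passing parent.ABC directly would hand Cmt the unbound function, so this sketch wraps the method in a closure that supplies self, which also means the grammar is built once per object.

```lua
local lpeg = require "lpeg"
local P, C, Cmt = lpeg.P, lpeg.C, lpeg.Cmt

local Parser = {}
Parser.__index = Parser

-- Match-time handler written as a method; the state lives in 'self'.
function Parser:ABC(subject, pos, text)
  self.seen = self.seen + 1
  return pos
end

function Parser.new()
  local self = setmetatable({ seen = 0 }, Parser)
  -- Bind the method to this object so Cmt can call it like a plain function.
  local function on_abc(...) return self:ABC(...) end
  self.grammar = (Cmt(C(P"ABC"), on_abc) + P(1))^0
  return self
end

function Parser:parse(text)
  self.grammar:match(text)
  return self.seen
end

local parser = Parser.new()
local count = parser:parse("ABC-xABC")
```

This keeps Carg out of the rules at the price of paying the grammar construction cost for every Parser object, which is the trade-off discussed earlier in the thread.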

Cheers !

On Thu, May 2, 2019 at 5:02 PM Roberto Ierusalimschy <[hidden email]> wrote:
> >   You missed one---you could always use global variables and avoid Carg() or
> > building a separate grammar entirely.
>
> You can also use closures, avoiding global variables. The functions
> in the pattern use external local variables, and some helper
> functions allow you to change the values of these variables.
>
> do
>   local X
>
>   function helperX (newx) X = newx end
>
>   -- your pattern goes here, using 'X'
>   ...
> end
>
> -- Roberto