Lpeg Cg question

Andrew Starks
```lua


local lpeg = require'lpeg'
local C = lpeg.C
local P = lpeg.P
local Cg = lpeg.Cg


local find = function(str)
    return (P(1) - str)^0 * C(str)
end

local cg_test =  Cg(find("string") * find("cool"))

local test = "my string is not cool."

print(cg_test:match(test))
--> string\tcool

print(select("#", cg_test:match(test)))
--> 2

--Same as...
print((find("string") * find("cool")):match(test))
--> string\tcool

print(select("#", (find("string") * find("cool")):match(test)))
--> 2

```

I'm expecting...

```lua

print(cg_test:match(test))
--> stringcool
print(select("#", cg_test:match(test)))
--> 1
```

...because the manual[1] says:

     An anonymous group serves to join values from several captures
into a single capture.



What am I missing?

--Andrew

[1]: http://www.inf.puc-rio.br/~roberto/lpeg/#cap-g


Re: Lpeg Cg question

Pierre-Yves Gérardy
On Tue, May 13, 2014 at 10:16 PM, Andrew Starks <[hidden email]> wrote:

> ...because the manual[1] says:
>
>      An anonymous group serves to join values from several captures
> into a single capture.
>
>
>
> What am I missing?

It concerns LPeg's own representation of captures. Groups are
flattened when returned.
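
A minimal sketch of that flattening, reusing the `find` helper and test string from the original post. The anonymous group does not change how many values `match` returns, so a single joined string has to be asked for explicitly. The function capture used here is only one assumed way to do that; it is not something proposed in this thread.

```lua
local lpeg = require "lpeg"
local C, Cg, P = lpeg.C, lpeg.Cg, lpeg.P

-- Helper from the original post: skip ahead to `str`, then capture it.
local function find(str)
  return (P(1) - str)^0 * C(str)
end

local test = "my string is not cool."
local both = find("string") * find("cool")

-- The anonymous group is flattened on return: still two values.
print(select("#", Cg(both):match(test)))   --> 2

-- Joining has to be requested explicitly, e.g. with a function capture.
local joined = both / function(a, b) return a .. b end
print(joined:match(test))                  --> stringcool
```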

For example, in a folding capture, each group is treated as a single
capture, and the callback receives all of its sub-captures as arguments.
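
The difference can be sketched with two folds over the same input, assuming an LPeg version that provides `lpeg.Cf`. The callback `step` and the bracket formatting are hypothetical, chosen only to make visible how many values arrive per call.

```lua
local lpeg = require "lpeg"
local C, Cc, Cf, Cg = lpeg.C, lpeg.Cc, lpeg.Cf, lpeg.Cg

-- Fold callback: wraps whatever one fold step delivers in brackets.
local function step(acc, a, b)
  return acc .. "[" .. a .. (b or "") .. "]"
end

-- Cc("") supplies the initial accumulator in both patterns.
local grouped   = Cf(Cc("") * Cg(C(1) * C(1))^1, step)
local ungrouped = Cf(Cc("") * (C(1) * C(1))^1, step)

print(grouped:match("abcd"))    --> [ab][cd]
print(ungrouped:match("abcd"))  --> [a][b][c][d]
```

With the group, each call to `step` receives both characters of a pair; without it, every character is a separate fold step.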

There are also some inconsistencies in how other captures handle groups:

    (Cg(C(1) * C(1)) * Cg(C(1) * C(1)) / 2):match"abcd"    --> b
    (Cg(C(1) * C(1)) * Cg(C(1) * C(1)) / "%2"):match"abcd" --> c

`/2` takes the second production of the capture stream, whereas
`/"%2"` takes the value of the second group, truncated to one value.

—Pierre-Yves


Lpeg Cg question

Andrew Starks


On Tuesday, May 13, 2014, Pierre-Yves Gérardy <pygy79@...> wrote:

> On Tue, May 13, 2014 at 10:16 PM, Andrew Starks <[hidden email]> wrote:
>
>> ...because the manual[1] says:
>>
>>      An anonymous group serves to join values from several captures
>> into a single capture.
>>
>> What am I missing?
>
> It concerns LPeg's own representation of captures. Groups are
> flattened when returned.
>
> For example, in a folding capture, groups are treated as a single
> capture, and the callback receives all sub-captures as arguments.
>
> There are also some inconsistencies in how other captures handle groups:
>
>     (Cg(C(1) * C(1)) * Cg(C(1) * C(1)) / 2):match"abcd"    --> b
>     (Cg(C(1) * C(1)) * Cg(C(1) * C(1)) / "%2"):match"abcd" --> c
>
> `/2` takes the second production of the capture stream, whereas
> `/"%2"` takes the value of the second group, truncated to one value.


I don't have my mind around this, but I'm trying... 

Is it something like: "these are still individual captures, but for the purposes of numbering, they're grouped, which makes it possible to specify a subset of captures from a set."

The first time I heard of "formal grammars" or a parsing expression grammar was after I read a bit about LPeg. So, from that perspective:

The text led me to believe that Cg was defined verbosely as "concatenates captures into a single value."

An example in the LPeg doc or a clarifying disclaimer would have been appreciated, but your answer is great too!

Re: Lpeg Cg question

Dirk Laurie-2
2014-05-14 2:40 GMT+02:00 Andrew Starks <[hidden email]>:

> I don't have my mind around this, but I'm trying...
>
> Is it something like: "these are still individual captures, but for the
> purposes of numbering, they're grouped, which makes it possible to specify a
> subset of captures from a set."

### Group and back captures

These patterns are useless by themselves. They are always used as
part of a bigger pattern.

The first version of this document said: "The remaining capture
functions: `Cg` and `Cb`, are definitely too advanced for this primer."
That is still basically true, but one application is within reach:
the Group-Back combination allows values to be stored and retrieved.

__`Cg(p,name), p:Cg(name)`__ (Group)
~   Collects all the captures `p` made into a single entity, and gives
    that entity a name for future reference. Does not actually add any
    captured values to the big pattern.

__`Cb(name)`__   (Back)
~   Retrieves the entity with the given name, and supplies
    its captures to the big pattern.
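
A small sketch of that store-and-retrieve use. The pattern and the group name `first` are made up for illustration and do not come from the quoted document.

```lua
local lpeg = require "lpeg"
local C, Cb, Cg, P, R = lpeg.C, lpeg.Cb, lpeg.Cg, lpeg.P, lpeg.R

local word = C(R("az")^1)

-- The named group stores "hello" but adds no value of its own;
-- Cb("first") later supplies the stored value.
local swap = Cg(word, "first") * P(" ") * word * Cb("first")

print(swap:match("hello world"))  --> world\thello
```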

> An example in the LPeg doc or a clarifying disclaimer would have been
> appreciated, but your answer is great too!

The above quote is from
<https://github.com/dlaurie/lua-notes/blob/master/lpeg-brief.txt>
which explains LPeg from the point of view of a newbie who has just
reached the point where things start to make sense.


Re: Lpeg Cg question

Andrew Starks


On Tuesday, May 13, 2014, Dirk Laurie <[hidden email]> wrote:

> 2014-05-14 2:40 GMT+02:00 Andrew Starks <[hidden email]>:
>
>> I don't have my mind around this, but I'm trying...
>>
>> Is it something like: "these are still individual captures, but for the
>> purposes of numbering, they're grouped, which makes it possible to specify a
>> subset of captures from a set."
>
> ### Group and back captures
>
> These patterns are useless by themselves. They are always used as
> part of a bigger pattern.
>
> The first version of this document said: "The remaining capture
> functions: `Cg` and `Cb`, are definitely too advanced for this primer."
> That is still basically true, but one application is within reach:
> the Group-Back combination allows values to be stored and retrieved.
>
> __`Cg(p,name), p:Cg(name)`__ (Group)
> ~   Collects all the captures `p` made into a single entity, and gives
>     that entity a name for future reference. Does not actually add any
>     captured values to the big pattern.
>
> __`Cb(name)`__   (Back)
> ~   Retrieves the entity with the given name, and supplies
>     its captures to the big pattern.
>
>> An example in the LPeg doc or a clarifying disclaimer would have been
>> appreciated, but your answer is great too!
>
> The above quote is from
> <https://github.com/dlaurie/lua-notes/blob/master/lpeg-brief.txt>
> which explains LPeg from the point of view of a newbie who has just
> reached the point where things start to make sense.


I read through your document. I like the idea of a promise to refrain from editing, once you're smarter. :)

I use Cg with Ct and Cb every now and again. I was speaking more to the anonymous version, the meaning of "grouping the values of multiple captures into a single capture", and what it's good for.

-Andrew



Re: Lpeg Cg question

Chris Emerson
In reply to this post by Andrew Starks
On Tue, May 13, 2014 at 07:40:13PM -0500, Andrew Starks wrote:
> Is it something like: "these are still individual captures, but for the
> purposes of numbering, they're grouped, which makes it possible to
> specify a subset of captures from a set."
>
> The first time I heard of "formal grammars" or a parsing expression grammar
> was after I read a bit about LPeg. So, from that perspective:
>
> The text led me to believe that Cg was defined verbosely as "concatenates
> captures into a single value."

I guess it's more a single capture with multiple values.

> An example in the LPeg doc or a clarifying disclaimer would have been
> appreciated, but your answer is great too!

There is one example at http://www.inf.puc-rio.br/~roberto/lpeg/ ; see the
"Name-value lists" example, where two captures inside a Cg end up being
passed as separate parameters to rawset().
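
A sketch along the lines of that manual example, adapted here rather than copied: whitespace is matched with a plain character set instead of the locale classes, and `lpeg.Cf` is assumed to be available in the installed LPeg version.

```lua
local lpeg = require "lpeg"
local C, Cf, Cg, Ct, R, S = lpeg.C, lpeg.Cf, lpeg.Cg, lpeg.Ct, lpeg.R, lpeg.S

local space = S(" \t")^0
local name  = C(R("az", "AZ")^1) * space
local sep   = S(",;") * space

-- Each pair is one group carrying two values: the key and the value.
local pair  = Cg(name * "=" * space * name) * sep^-1

-- Ct("") yields the empty table used as the initial accumulator;
-- rawset(t, k, v) returns t, which becomes the next accumulator.
local list  = Cf(Ct("") * pair^0, rawset)

local t = list:match("a=b, c = hi; next = pi")
print(t.a, t.c, t.next)  --> b\thi\tpi
```

Without the `Cg`, the fold would call `rawset` once per name with a single extra argument instead of a key and a value together.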

Chris


Re: Lpeg Cg question

Andrew Starks
On Wed, May 14, 2014 at 8:44 AM, Chris Emerson
<[hidden email]> wrote:

> On Tue, May 13, 2014 at 07:40:13PM -0500, Andrew Starks wrote:
>> Is it something like: "these are still individual captures, but for the
>> purposes of numbering, they're grouped, which makes it possible to
>> specify a subset of captures from a set."
>>
>> The first time I heard of "formal grammars" or a parsing expression grammar
>> was after I read a bit about LPeg. So, from that perspective:
>>
>> The text led me to believe that Cg was defined verbosely as "concatenates
>> captures into a single value."
>
> I guess it's more a single capture with multiple values.
>
>> An example in the LPeg doc or a clarifying disclaimer would have been
>> appreciated, but your answer is great too!
>
> There is one example at http://www.inf.puc-rio.br/~roberto/lpeg/ ; see the
> "Name-value lists" example, where two captures inside a Cg end up being
> passed as separate parameters to rawset().
>
> Chris
>

Thank you! That is perfect and I missed that. I think as I was
scanning, the use case of Ct and Cg(patt, "name") stepped in to make
me conclude that this example wasn't getting at the anonymous use of
it. But there it is.

I am in your debt, and Roberto's, of course.

--Andrew