LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Daurnimator
Trying to use Peter Odding's lxsh
Issue has been filed for a while over there:
https://github.com/xolox/lua-lxsh/issues/5
But this should probably be fixed in LPEG itself?


The below works fine with lpeg 0.10:

> lpeg = require"lpeg"
> D = lpeg.R'09'
> BB = lpeg.B(-D, 1)
stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Daurnimator
On 30 May 2015 at 11:31, Daurnimator <[hidden email]> wrote:

> Trying to use Peter Odding's lxsh
> Issue has been filed for a while over there:
> https://github.com/xolox/lua-lxsh/issues/5
> But this should probably be fixed in LPEG itself?
>
>
> The below works fine with lpeg 0.10:
>
>> lpeg = require"lpeg"
>> D = lpeg.R'09'
>> BB = lpeg.B(-D, 1)
> stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)

Today I finally got around to looking at this again.
I was working on updating peter odding's lxsh module for lua 5.3
https://github.com/daurnimator/lua-lxsh
And found that in lpeg 1.0 some of the tests fail (they passed in lpeg 0.10).

To get things working, I had to force a check for the length via a lookahead.
See https://github.com/daurnimator/lua-lxsh/commit/b620081cb8644e577f7f642d482087e3a0d366ba

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Roberto Ierusalimschy
> On 30 May 2015 at 11:31, Daurnimator <[hidden email]> wrote:
> > Trying to use Peter Odding's lxsh
> > Issue has been filed for a while over there:
> > https://github.com/xolox/lua-lxsh/issues/5
> > But this should probably be fixed in LPEG itself?
> >
> >
> > The below works fine with lpeg 0.10:
> >
> >> lpeg = require"lpeg"
> >> D = lpeg.R'09'
> >> BB = lpeg.B(-D, 1)
> > stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)
>
> Today I finally got around to looking at this again.
> I was working on updating peter odding's lxsh module for lua 5.3
> https://github.com/daurnimator/lua-lxsh
> And found that in lpeg 1.0 some of the tests fail (they passed in lpeg 0.10).
>
> To get things working, I had to force a check for the length via a lookahead.
> See https://github.com/daurnimator/lua-lxsh/commit/b620081cb8644e577f7f642d482087e3a0d366ba

I am not sure what this message is about. Is it trying to say that
the above piece of code does not work with LPeg 1.0?

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Daurnimator
On 13 April 2016 at 04:27, Roberto Ierusalimschy <[hidden email]> wrote:

>> On 30 May 2015 at 11:31, Daurnimator <[hidden email]> wrote:
>> > Trying to use Peter Odding's lxsh
>> > Issue has been filed for a while over there:
>> > https://github.com/xolox/lua-lxsh/issues/5
>> > But this should probably be fixed in LPEG itself?
>> >
>> >
>> > The below works fine with lpeg 0.10:
>> >
>> >> lpeg = require"lpeg"
>> >> D = lpeg.R'09'
>> >> BB = lpeg.B(-D, 1)
>> > stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)
>>
>> Today I finally got around to looking at this again.
>> I was working on updating peter odding's lxsh module for lua 5.3
>> https://github.com/daurnimator/lua-lxsh
>> And found that in lpeg 1.0 some of the tests fail (they passed in lpeg 0.10).
>>
>> To get things working, I had to force a check for the length via a lookahead.
>> See https://github.com/daurnimator/lua-lxsh/commit/b620081cb8644e577f7f642d482087e3a0d366ba
>
> I am not sure what this message is about. Is it trying to say that
> the above piece of code does not work with LPeg 1.0?
>
> -- Roberto
>

Consider this piece of code:

```
local lpeg = require 'lpeg'

local D = lpeg.R'09'
local I = lpeg.R('AZ', 'az', '\127\255') + '_'
local SOS = lpeg.P(function(s, i) return i == 1 end) -- start of string
local EOS = -lpeg.P(1) -- end of string
local B = -(I + D) -- word boundary

-- Transform a string with keywords into an LPeg pattern that matches a keyword
-- followed by a word boundary.

local patt = lpeg.P("foo")+lpeg.P("bar")+lpeg.P("qux")
local BB = lpeg.B(B) + SOS -- starting boundary
local AB = B + EOS -- ending boundary

patt = BB * patt * AB

patt = lpeg.P{ patt + 1 * lpeg.V(1) }

print(patt:match(" foo"))
```

It works in lpeg 0.10. but not in lpeg 1.0
To get it to work in lpeg 1.0 I applied this change:

-local B = -(I + D) -- word boundary
+local B = #lpeg.P(1) + -(I + D) -- word boundary

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Roberto Ierusalimschy
> Consider this piece of code:
>
> ```
> local lpeg = require 'lpeg'
>
> local D = lpeg.R'09'
> local I = lpeg.R('AZ', 'az', '\127\255') + '_'
> local SOS = lpeg.P(function(s, i) return i == 1 end) -- start of string
> local EOS = -lpeg.P(1) -- end of string
> local B = -(I + D) -- word boundary
>
> -- Transform a string with keywords into an LPeg pattern that matches a keyword
> -- followed by a word boundary.
>
> local patt = lpeg.P("foo")+lpeg.P("bar")+lpeg.P("qux")
> local BB = lpeg.B(B) + SOS -- starting boundary
> local AB = B + EOS -- ending boundary
>
> patt = BB * patt * AB
>
> patt = lpeg.P{ patt + 1 * lpeg.V(1) }
>
> print(patt:match(" foo"))
> ```
>
> It works in lpeg 0.10. but not in lpeg 1.0
> To get it to work in lpeg 1.0 I applied this change:
>
> -local B = -(I + D) -- word boundary
> +local B = #lpeg.P(1) + -(I + D) -- word boundary

The "look behind" (lpeg.B) function changed from 0.10 to 0.11.

In 0.10, it received an explicit number to "go back" before the match,
and the default was one. In 0.11 and later versions, it uses the
"pattern length" as the length to go back.  For instance,
lpeg.B("123") will try to match "123" three positions
behind the current one,
and lpeg.B("1") will go back one position.
In your code, BB goes back the size of B, which is zero.
Probably what you want is this:

- BB = lpeg.B(-(I + D)) + SOS -- starting boundary
+ BB = -peg.B(I + D) + SOS -- starting boundary

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Roberto Ierusalimschy
This whole story goes back to these messages:

  https://github.com/xolox/lua-lxsh/issues/5             (the problem)
  http://lua-users.org/lists/lua-l/2015-07/msg00356.html   (the "fix")

The issue is that I "fixed" the wrong problem, because of a bad
error message. The original problem was the one that appeared in the
OP:

> lpeg = require"lpeg"
> D = lpeg.R'09'
> BB = lpeg.B(-D, 1)
stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)

But of course 'B' has a fixed length, and so the "obvious fix" for that
message was to allow fixed lengths equal to zero. But look behind with
zero lengths may be correct from a formal point of view, but they are
useless (as it means "go back zero positions") and probably not doing
what the programmer expects. The correct fix should be to rephrase the
error message, not to allow look behind with zero-length patterns.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Duncan Cross
In reply to this post by Daurnimator
On Fri, Apr 15, 2016 at 12:48 AM, Daurnimator <[hidden email]> wrote:
>
> Consider this piece of code:
>
> ```
> local lpeg = require 'lpeg'
>
> local D = lpeg.R'09'
> local I = lpeg.R('AZ', 'az', '\127\255') + '_'
> local SOS = lpeg.P(function(s, i) return i == 1 end) -- start of string

This is completely tangential to the topic at hand, but wouldn't this
work just as well for SOS?

local SOS = -lpeg.B(1)

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Daurnimator
In reply to this post by Roberto Ierusalimschy
On 16 April 2016 at 02:53, Roberto Ierusalimschy <[hidden email]> wrote:

> This whole story goes back to these messages:
>
>   https://github.com/xolox/lua-lxsh/issues/5             (the problem)
>   http://lua-users.org/lists/lua-l/2015-07/msg00356.html   (the "fix")
>
> The issue is that I "fixed" the wrong problem, because of a bad
> error message. The original problem was the one that appeared in the
> OP:
>
>> lpeg = require"lpeg"
>> D = lpeg.R'09'
>> BB = lpeg.B(-D, 1)
> stdin:1: bad argument #1 to 'B' (pattern may not have fixed length)
>
> But of course 'B' has a fixed length, and so the "obvious fix" for that
> message was to allow fixed lengths equal to zero. But look behind with
> zero lengths may be correct from a formal point of view, but they are
> useless (as it means "go back zero positions") and probably not doing
> what the programmer expects. The correct fix should be to rephrase the
> error message, not to allow look behind with zero-length patterns.
>
> -- Roberto
>

btw, I just had this lpeg.B issue come up in a totally separate context.
I was updating lunamark to work with 5.3, and wrote this:
https://github.com/jgm/lunamark/pull/14/commits/51756bfca858795a423f4ced9e7c164a35f1638b

Interestingly, they were passing a fixed-length pattern of length 1 to
lpeg.B, but passing a second argument of 2.
i.e. checking which single character was 2 before the current position.

Reply | Threaded
Open this post in threaded view
|

Re: LPEG > 0.10 regression: 'B' (pattern may not have fixed length)

Roberto Ierusalimschy
In reply to this post by Duncan Cross
> > local SOS = lpeg.P(function(s, i) return i == 1 end) -- start of string
>
> This is completely tangential to the topic at hand, but wouldn't this
> work just as well for SOS?
>
> local SOS = -lpeg.B(1)

Sure.

-- Roberto