Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Sean Conner

  So I'm participating in this years National Novel Generation Month [1] and
I finished my entry.  While generating the output, I had to use Lua 5.1
because Lua 5.3 would crash---hard.  

[spc]lucy:~/writings/nanogenmo/2018>lua run.lua TheEmeraldCityOfOz.txt >/tmp/output
Segmentation fault (core dumped)
[spc]lucy:~/writings/nanogenmo/2018>lua-51 run.lua TheEmeraldCityOfOz.txt >/tmp/output
[spc]lucy:~/writings/nanogenmo/2018>

  I'm not sure what's going on.  From the core dump:

(gdb) where
#0  0x00bfb98c in memcpy () from /lib/tls/libc.so.6
#1  0x0805b7ff in luaL_addlstring ()
#2  0x00113487 in substcap () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#3  0x00112c47 in pushcapture () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#4  0x0011374f in getcaptures () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#5  0x00116ef4 in lp_match () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#6  0x0804ef6a in luaD_precall ()
#7  0x08058986 in luaV_execute ()
#8  0x0804f27d in luaD_call ()
#9  0x0804f2be in luaD_callnoyield ()
#10 0x0804d0eb in lua_callk ()
#11 0x00112d31 in pushcapture () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#12 0x001131be in addonestring () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#13 0x00113495 in substcap () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#14 0x00112c47 in pushcapture () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#15 0x0011374f in getcaptures () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#16 0x00116ef4 in lp_match () from /home/spc/.luarocks/lib/lua/5.3/lpeg.so
#17 0x0804ef6a in luaD_precall ()
#18 0x080589e1 in luaV_execute ()
#19 0x0804f27d in luaD_call ()
#20 0x0804f2be in luaD_callnoyield ()
#21 0x0804d146 in f_call ()
#22 0x0804e8ac in luaD_rawrunprotected ()
#23 0x0804f6ec in luaD_pcall ()
#24 0x0804d1a8 in lua_pcallk ()
#25 0x0804b0e4 in docall ()
#26 0x0804baba in pmain ()
#27 0x0804ef6a in luaD_precall ()
#28 0x0804f264 in luaD_call ()
#29 0x0804f2be in luaD_callnoyield ()
#30 0x0804d146 in f_call ()
#31 0x0804e8ac in luaD_rawrunprotected ()
#32 0x0804f6ec in luaD_pcall ()
#33 0x0804d1a8 in lua_pcallk ()
#34 0x0804bb91 in main ()

System specifics:  x86 Linux (so 32-bit).  I haven't tried it on any other
system yet.  But the code is available:

        https://github.com/spc476/NaNoGenMo-2018

You may need to modify run.lua to comment out two lines of code:

diff --git a/run.lua b/run.lua
index 2f939c4..1cfa81c 100644
--- a/run.lua
+++ b/run.lua
@@ -22,7 +22,7 @@
 -- luacheck: ignore 611
 
 local lpeg   = require "lpeg"
-local wrap   = require "org.conman.string".wrap
+--local wrap   = require "org.conman.string".wrap
 local valley = require "valley"
 local chef   = require "chef"
 
@@ -79,5 +79,5 @@ f:close()
 
 text = normalize:match(text)
 text = story:match(text)
-text = dowrap:match(text)
+--text = dowrap:match(text)
 print(text)

to get it to run (or not, in my case).

And as I said, Lua 5.1 works with no issues.  Oh, I just noticed---my Lua
5.1 is pulling in LPeg 0.12, not LPeg 1.0.1.  Hmm ...

  -spc (So it may be an LPeg issue ... )

[1] https://github.com/NaNoGenMo/2018

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Andrew Gierth
>>>>> "Sean" == Sean Conner <[hidden email]> writes:

 Sean> So I'm participating in this years National Novel Generation
 Sean> Month [1] and I finished my entry. While generating the output, I
 Sean> had to use Lua 5.1 because Lua 5.3 would crash---hard.

Looks like lpeg is at fault: substcap is passing a bad length to
luaL_addlstring.

Here in substcap:

403         while (!isclosecap(cs->cap)) {  /* traverse nested captures */
404           const char *next = cs->cap->s;
405           luaL_addlstring(b, curr, next - curr);  /* add text up to capture */

"next" is 0x0 while "curr" is a valid pointer.

If it helps, here is the text:

(gdb) print curr
$5 = 0x80128317f "in. By that time you may be more sensible.\""

 Sean> System specifics: x86 Linux (so 32-bit). I haven't tried it on
 Sean> any other system yet. But the code is available:

I tested on 64-bit freebsd and lua 5.3.5, it died with "not enough
memory for buffer allocation", but the cause looks to be the same as
your error.

--
Andrew.

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Andrew Gierth
>>>>> "Andrew" == Andrew Gierth <[hidden email]> writes:
>>>>> "Sean" == Sean Conner <[hidden email]> writes:

 Sean> So I'm participating in this years National Novel Generation
 Sean> Month [1] and I finished my entry. While generating the output, I
 Sean> had to use Lua 5.1 because Lua 5.3 would crash---hard.

 Andrew> Looks like lpeg is at fault: substcap is passing a bad length to
 Andrew> luaL_addlstring.

 Andrew> Here in substcap:

 Andrew> 403         while (!isclosecap(cs->cap)) {  /* traverse nested captures */
 Andrew> 404           const char *next = cs->cap->s;
 Andrew> 405           luaL_addlstring(b, curr, next - curr);  /* add text up to capture */

 Andrew> "next" is 0x0 while "curr" is a valid pointer.

The offending null value of cs->cap->s comes from here:

/*
** Add capture values returned by a dynamic capture to the capture list
** 'base', nested inside a group capture. 'fd' indexes the first capture
** value, 'n' is the number of values (at least 1).
*/
static void adddyncaptures (const char *s, Capture *base, int n, int fd) {
  int i;
  base[0].kind = Cgroup;  /* create group capture */
  base[0].siz = 0;
  base[0].idx = 0;  /* make it an anonymous group */
  for (i = 1; i <= n; i++) {  /* add runtime captures */
...

This code thinks it can create an anonymous group capture with no valid
.s pointer (to contain the results of a match-time capture). I _think_
the failure is normally hidden by the fact that base[0] is _usually_ a
reuse of the same Capture that was used for the match-time capture
itself - except that the array of Capture structures might get
reallocated between the two uses, without preserving the old value of
the (now unused) Capture.

Here is a small testcase (note that shortening the string eliminates the
error):

local lpeg = require 'lpeg'
local P,Cs,Cmt = lpeg.P, lpeg.Cs, lpeg.Cmt
local pat1 = P"a" / "b" + Cmt(P"c", function(_,p) return p,"d" end) + P(1)
local pat = Cs(pat1^1)
print(pat:match("abcdabcdabcdabcdabcdabcdabcdabc"))'

--
Andrew.

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Andrew Gierth
>>>>> "Andrew" == Andrew Gierth <[hidden email]> writes:

 Andrew> This code thinks it can create an anonymous group capture with
 Andrew> no valid .s pointer (to contain the results of a match-time
 Andrew> capture). I _think_ the failure is normally hidden by the fact
 Andrew> that base[0] is _usually_ a reuse of the same Capture that was
 Andrew> used for the match-time capture itself - except that the array
 Andrew> of Capture structures might get reallocated between the two
 Andrew> uses, without preserving the old value of the (now unused)
 Andrew> Capture.

Further analysis shows that my original idea was wrong; correct
substitution _requires_ that the value of base[0].s actually be
preserved rather than initialized in adddyncaptures. So the fix seems to
be to ensure that it is preserved across expansion of the captures
array:

--- lpvm.c.orig 2018-11-10 09:37:29 UTC
+++ lpvm.c
@@ -311,7 +311,7 @@ const char *match (lua_State *L, const c
           if (fr + n >= SHRT_MAX)
             luaL_error(L, "too many results in match-time capture");
           if ((captop += n + 2) >= capsize) {
-            capture = doublecap(L, capture, captop, n + 2, ptop);
+            capture = doublecap(L, capture, captop, n + 1, ptop);
             capsize = 2 * captop;
           }
           /* add new captures to 'capture' list */

By telling doublecap that one less Capture structure is "new", we make
it preserve one more of the existing values, which is the one we need.

This patch fixes both my testcase and the original report, as far as I
can tell.

--
Andrew.

Reply | Threaded
Open this post in threaded view
|

Is this an LPeg bug? (was Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me)

Sean Conner
It was thus said that the Great Andrew Gierth once stated:

> >>>>> "Andrew" == Andrew Gierth <[hidden email]> writes:
>
>  Andrew> This code thinks it can create an anonymous group capture with
>  Andrew> no valid .s pointer (to contain the results of a match-time
>  Andrew> capture). I _think_ the failure is normally hidden by the fact
>  Andrew> that base[0] is _usually_ a reuse of the same Capture that was
>  Andrew> used for the match-time capture itself - except that the array
>  Andrew> of Capture structures might get reallocated between the two
>  Andrew> uses, without preserving the old value of the (now unused)
>  Andrew> Capture.
>
> Further analysis shows that my original idea was wrong; correct
> substitution _requires_ that the value of base[0].s actually be
> preserved rather than initialized in adddyncaptures. So the fix seems to
> be to ensure that it is preserved across expansion of the captures
> array:
>
> --- lpvm.c.orig 2018-11-10 09:37:29 UTC
> +++ lpvm.c
> @@ -311,7 +311,7 @@ const char *match (lua_State *L, const c
>            if (fr + n >= SHRT_MAX)
>              luaL_error(L, "too many results in match-time capture");
>            if ((captop += n + 2) >= capsize) {
> -            capture = doublecap(L, capture, captop, n + 2, ptop);
> +            capture = doublecap(L, capture, captop, n + 1, ptop);
>              capsize = 2 * captop;
>            }
>            /* add new captures to 'capture' list */
>
> By telling doublecap that one less Capture structure is "new", we make
> it preserve one more of the existing values, which is the one we need.
>
> This patch fixes both my testcase and the original report, as far as I
> can tell.

  Just in case Roberto & Co. did not see this thread---is this an actual bug
in LPeg?

  -spc


Reply | Threaded
Open this post in threaded view
|

Re: Is this an LPeg bug? (was Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me)

Roberto Ierusalimschy
>   Just in case Roberto & Co. did not see this thread---is this an actual bug
> in LPeg?

It seems so.

-- Roberto

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Albert Chan
In reply to this post by Andrew Gierth

> Here is a small testcase (note that shortening the string eliminates the
> error):
>
> local lpeg = require 'lpeg'
> local P,Cs,Cmt = lpeg.P, lpeg.Cs, lpeg.Cmt
> local pat1 = P"a" / "b" + Cmt(P"c", function(_,p) return p,"d" end) + P(1)
> local pat = Cs(pat1^1)
> print(pat:match("abcdabcdabcdabcdabcdabcdabcdabc"))'
>
> --
> Andrew.

What is above test code supposed to show ?
I tried it on lpeg 1.0.1 without problem ...

OTTH, if string is longer than ('abcd'):rep(5510), it did *crash*
Your 1 char patch, (n + 2) to (n + 1) seems to fix it.

Thanks.

Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Andrew Gierth
>>>>> "Albert" == Albert Chan <[hidden email]> writes:

 >> Here is a small testcase (note that shortening the string eliminates the
 >> error):
 >>
 >> local lpeg = require 'lpeg'
 >> local P,Cs,Cmt = lpeg.P, lpeg.Cs, lpeg.Cmt
 >> local pat1 = P"a" / "b" + Cmt(P"c", function(_,p) return p,"d" end) + P(1)
 >> local pat = Cs(pat1^1)
 >> print(pat:match("abcdabcdabcdabcdabcdabcdabcdabc"))'

 Albert> What is above test code supposed to show ?
 Albert> I tried it on lpeg 1.0.1 without problem ...

With what output? on what platform? have you modified INITCAPSIZE?

When I run it (barring the stray ' at the end) I get the error "not
enough memory for buffer allocation" (because it's trying to allocate
nearly 2^64 bytes due to a bogus pointer value).

 Albert> OTTH, if string is longer than ('abcd'):rep(5510), it did
 Albert> *crash* Your 1 char patch, (n + 2) to (n + 1) seems to fix it.

The bug does depend on the state of newly-allocated memory (the contents
of a lua full userdata) so it may not be entirely predictable.

--
Andrew.

Reply | Threaded
Open this post in threaded view
|

Re: Is this an LPeg bug? (was Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me)

Sean Conner
In reply to this post by Roberto Ierusalimschy
It was thus said that the Great Roberto Ierusalimschy once stated:
> >   Just in case Roberto & Co. did not see this thread---is this an actual bug
> > in LPeg?
>
> It seems so.

  Is there any chance of getting this bug fixed in LPeg?

  -spc (Have another LPeg bug coming up ... )


Reply | Threaded
Open this post in threaded view
|

Re: Lua 5.3.4 and LPeg 1.0.1 dumped core on me

Roberto Ierusalimschy
In reply to this post by Andrew Gierth
>  Andrew> This code thinks it can create an anonymous group capture with
>  Andrew> no valid .s pointer (to contain the results of a match-time
>  Andrew> capture). I _think_ the failure is normally hidden by the fact
>  Andrew> that base[0] is _usually_ a reuse of the same Capture that was
>  Andrew> used for the match-time capture itself - except that the array
>  Andrew> of Capture structures might get reallocated between the two
>  Andrew> uses, without preserving the old value of the (now unused)
>  Andrew> Capture.
>
> Further analysis shows that my original idea was wrong; correct
> substitution _requires_ that the value of base[0].s actually be
> preserved rather than initialized in adddyncaptures. So the fix seems to
> be to ensure that it is preserved across expansion of the captures
> array:
>
> --- lpvm.c.orig 2018-11-10 09:37:29 UTC
> +++ lpvm.c
> @@ -311,7 +311,7 @@ const char *match (lua_State *L, const c
>            if (fr + n >= SHRT_MAX)
>              luaL_error(L, "too many results in match-time capture");
>            if ((captop += n + 2) >= capsize) {
> -            capture = doublecap(L, capture, captop, n + 2, ptop);
> +            capture = doublecap(L, capture, captop, n + 1, ptop);
>              capsize = 2 * captop;
>            }
>            /* add new captures to 'capture' list */
>
> By telling doublecap that one less Capture structure is "new", we make
> it preserve one more of the existing values, which is the one we need.
>
> This patch fixes both my testcase and the original report, as far as I
> can tell.

Many thanks for the detailed analysis.

Although your analysis and the fix seem logically correct, I think it
would be cleaner to preserve the initial open-group capture in the
whole process.  As it is, we are removing a capture and creating a new
one, and kind of "by chance" the original value of base[0].s is being
preserved. The same way you got it wrong in your first message, we
will get it wrong again in the future.

-- Roberto