Return value of function used by lpeg.Cmt

Matthias Kluwe
Hi!

The description of lpeg.Cmt on
http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#matchtime describes
the behaviour when the called function returns a number, false, nil,
or no value. If I understand the example "Lua's long strings"
correctly, it can return true which does what I would expect: the
match is successful.

If this is correct, I'd suggest adding this clarification to the manual.

BTW: If I had to process a rather _large_ string, lpeg.Cmt would be
the right tool, because the other kinds of captures are held in
memory until lpeg.match returns, right?

Regards,
Matthias

Re: Return value of function used by lpeg.Cmt

Roberto Ierusalimschy
> The description of lpeg.Cmt on
> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#matchtime describes
> the behaviour when the called function returns a number, false, nil,
> or no value. If I understand the example "Lua's long strings"
> correctly, it can return true which does what I would expect: the
> match is successful.
>
> If this is correct, I'd suggest adding this clarification to the manual.

It is correct. The new manual will say that.
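
The return-value rules confirmed above can be illustrated with a minimal sketch. It assumes LPeg is installed and loadable as `lpeg`; the pattern names (`digits`, `accept`, `reject`, `skip_one`) are made up for illustration and do not come from the manual.

```lua
local lpeg = require 'lpeg'

-- One or more digits; the match-time function then decides the outcome.
local digits = lpeg.R'09'^1

-- Returning true: the match succeeds and the position stays where the
-- digits ended (equivalent to returning pos).
local accept = lpeg.Cmt(digits, function(subject, pos) return true end)

-- Returning false (or nil, or nothing): the match fails.
local reject = lpeg.Cmt(digits, function(subject, pos) return false end)

-- Returning a number: the match succeeds and that number becomes the
-- new current position (here, one character past the digits).
local skip_one = lpeg.Cmt(digits, function(subject, pos) return pos + 1 end)

print(lpeg.match(accept, '123abc'))   -- 4
print(lpeg.match(reject, '123abc'))   -- nil
print(lpeg.match(skip_one, '123abc')) -- 5
```

Since none of these functions returns extra values, the Cmt produces no capture values and lpeg.match falls back to returning the end position.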


> BTW: If I had to process a rather _large_ string, lpeg.Cmt would be
> the right tool, because the other kinds of captures are held in
> memory until lpeg.match returns, right?

You lost me here. All kinds of captures must be kept in memory until
match returns, so that it can return them, no?

-- Roberto

Re: Return value of function used by lpeg.Cmt

Matthias Kluwe
Hi!

2010/2/22 Roberto Ierusalimschy <[hidden email]>:

>> The description of lpeg.Cmt on
>> http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html#matchtime describes
>> the behaviour when the called function returns a number, false, nil,
>> or no value. If I understand the example "Lua's long strings"
>> correctly, it can return true which does what I would expect: the
>> match is successful.
>>
>> If this is correct, I'd suggest adding this clarification to the manual.
>
> It is correct. The new manual will say that.

Thank you, that's great.

>> BTW: If I had to process a rather _large_ string, lpeg.Cmt would be
>> the right tool, because the other kinds of captures are held in
>> memory until lpeg.match returns, right?
>
> You lost me here. All kinds of captures must be kept in memory until
> match returns, so that it can return them, no?

Yes, obviously. I should have included a small (nonsense) example to
make it clear:

-- count lines

local lpeg = require 'lpeg'

local N = 0
local function fline()
    N = N + 1
    return true
end

local crlf  = lpeg.P'\n' + lpeg.P'\r\n'
local line  = -1 + lpeg.Cmt( ( 1 - crlf )^0, fline )
local lines = line * ( crlf * line )^0

lpeg.match( lines, io.open( arg[ 1 ] ):read( '*a' ) )
io.stdout:write( N, ' lines\n' )

Clearly, the data captured by lpeg.Cmt has to be in memory while it is
passed to fline(). But can I process a multi-million-line file this
way, or are all the millions of captures kept until lpeg.match() has
returned? I was hoping that the whole point of lpeg.Cmt is that I can
do that.

OK, I still read in the whole file at once, so I need at least that
much memory...

Regards,
Matthias

Re: Return value of function used by lpeg.Cmt

Roberto Ierusalimschy
> -- count lines
>
> local lpeg = require 'lpeg'
>
> local N = 0
> local function fline()
>     N = N + 1
>     return true
> end
>
> local crlf  = lpeg.P'\n' + lpeg.P'\r\n'
> local line  = -1 + lpeg.Cmt( ( 1 - crlf )^0, fline )
> local lines = line * ( crlf * line )^0
>
> lpeg.match( lines, io.open( arg[ 1 ] ):read( '*a' ) )
> io.stdout:write( N, ' lines\n' )
>
> Clearly, the data captured by lpeg.Cmt has to be in memory while it is
> passed to fline(). But can I process a multi-million-line file this
> way, or are all the millions of captures kept until lpeg.match() has
> returned? I was hoping that the whole point of lpeg.Cmt is that I can
> do that.

Cmt is processed in real time, so your reasoning is correct. Cmt
captures are discarded as soon as the function executes; only the
resulting captures created by Cmt (if any) are kept.
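
A small sketch may make the "real time" behaviour concrete. It assumes LPeg is loadable as `lpeg`; the helper `record` and the pattern names are hypothetical, chosen only for this illustration. Both patterns below fail as a whole, yet the Cmt function has already run during matching, while the ordinary function capture (`patt / f`) is never evaluated.

```lua
local lpeg = require 'lpeg'

local log = {}
-- Hypothetical helper: builds a callback that notes it was called.
local function record(tag)
  return function()
    log[#log + 1] = tag
    return true
  end
end

local word = lpeg.R'az'^1

-- Both patterns require a trailing '!' and therefore fail on 'abc'.
local with_cmt = lpeg.Cmt(word, record('cmt')) * lpeg.P'!'
local with_fn  = (word / record('fn')) * lpeg.P'!'

print(lpeg.match(with_cmt, 'abc'))  -- nil, but record('cmt') already ran
print(lpeg.match(with_fn, 'abc'))   -- nil, record('fn') was never called
print(#log, log[1])                 -- 1   cmt

-- Extra values returned by the function become the capture's values;
-- these are what remain after the function has executed.
local keep_len = lpeg.Cmt(lpeg.C(word), function(subject, pos, w)
  return true, #w
end)
print(lpeg.match(keep_len, 'abc'))  -- 3
```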


> Ok, I still read in the whole file at once, so my memory has to be
> that big, at least...

Yes, and the extra memory for all those captures may be cheaper than
all those function calls (assuming you have virtual memory...).
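
For comparison, here is one possible capture-based variant of the line counter, as a sketch only. It assumes LPeg is loadable as `lpeg`; `count_lines` is a hypothetical name, and collecting positions with lpeg.Cp inside lpeg.Ct is just one way to trade memory for function calls, not something prescribed in this thread. All captures stay in memory until the match ends, but no Lua function is called per line.

```lua
local lpeg = require 'lpeg'

local eol  = lpeg.P'\n' + lpeg.P'\r\n'
-- A line is either the end of input (not counted) or a position capture
-- followed by everything up to the next line break.
local line  = lpeg.P(-1) + lpeg.Cp() * (1 - eol)^0
-- Ct gathers one position per line into a table.
local lines = lpeg.Ct(line * (eol * line)^0)

-- Hypothetical helper: the number of lines is the number of captures.
local function count_lines(text)
  return #lpeg.match(lines, text)
end

print(count_lines('a\nb\n'))   -- 2
print(count_lines('a\r\nb'))   -- 2
```

Which version is faster or lighter on a given file would need to be measured.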

-- Roberto