string.gsub missing matches over chunk boundary perhaps


Fabian Hutchinson
I am calling LUA from Nginx to make changes to the response body based on regex substitution using gsub.

The body is XML. There are two types of actions: match a block of XML and remove it, or match a block and add something after it.

A sample of the LUA is shown below.

  ngx.arg[1] = string.gsub(ngx.arg[1], [[<Representation id="1" width.-</Representation>]], "")

  ngx.arg[1] = string.gsub(ngx.arg[1], [[(<AdaptationSet>.-</AdaptationSet>)]], [[%1
      <Metadata>
             xxxxxxxxx
      </Metadata>]])

The line to add data sometimes fails. A response with 16 blocks that match will only have 15 replacements, for example.

I wondered if it was related to the substitution being applied to individual chunks of the body rather than to the whole body or a stream (https://github.com/openresty/lua-nginx-module/issues/817). Does the failure happen when the string being matched spans two chunks? I changed the config to match just a closing XML tag, which is much smaller and sits on a single line. With this change it never fails.
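The chunk-boundary hypothesis can be reproduced outside Nginx. The following is a minimal sketch in plain Lua, assuming a made-up two-block body and an arbitrary split point (both hypothetical, not taken from the real response). It only shows that a match spanning two chunks is missed when each chunk is processed on its own.

```lua
-- Hypothetical body with two matching blocks (illustration only).
local body = "<AdaptationSet>a</AdaptationSet><AdaptationSet>b</AdaptationSet>"
local pattern = "(<AdaptationSet>.-</AdaptationSet>)"
local addition = "%1<Metadata>x</Metadata>"

-- Whole body in one call: gsub's second return value is the match count.
local _, whole_count = string.gsub(body, pattern, addition)

-- Same body split at an arbitrary point inside the second block,
-- then processed chunk by chunk as a body filter would see it.
local split = 45
local chunk1, chunk2 = body:sub(1, split), body:sub(split + 1)
local _, n1 = string.gsub(chunk1, pattern, addition)
local _, n2 = string.gsub(chunk2, pattern, addition)
local chunked_count = n1 + n2

-- whole_count is 2, chunked_count is 1: the block that straddles
-- the split is never seen in full by either call.
print(whole_count, chunked_count)
```

A short pattern such as a single closing tag can still straddle a boundary, but it is far less likely to, which would be consistent with that variant not failing in practice.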

I cannot understand why the same problem that affected the regex to add content to the body would not affect the regex that removes content even though the pattern being matched is larger and more likely to cross a chunk boundary.

Could it be the fact that it is removing body data and not adding to it?
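If chunking does turn out to be the cause, one common workaround is to buffer the whole body and run the substitution once on the final chunk. The following is a rough sketch in plain Lua under that assumption; `filter_chunk`, `add_metadata`, and the `ctx.parts` field are hypothetical names introduced for illustration, and the sample chunks are invented.

```lua
-- Hypothetical helper: collects chunks in ctx and returns the text to emit.
local function filter_chunk(ctx, chunk, eof, transform)
  ctx.parts = ctx.parts or {}
  if chunk and chunk ~= "" then
    ctx.parts[#ctx.parts + 1] = chunk
  end
  if not eof then
    return ""  -- emit nothing until the last chunk has arrived
  end
  local whole = table.concat(ctx.parts)
  ctx.parts = nil
  return transform(whole)
end

-- Hypothetical transform, modelled on the "add after block" substitution.
local function add_metadata(body)
  -- The outer parentheses drop gsub's second return value (the match count).
  return (string.gsub(body, "(<AdaptationSet>.-</AdaptationSet>)",
                      "%1<Metadata>x</Metadata>"))
end

-- Simulate two chunks that split one block in the middle.
local ctx = {}
local out1 = filter_chunk(ctx, "<AdaptationSet>a</Adapt", false, add_metadata)
local out2 = filter_chunk(ctx, "ationSet>", true, add_metadata)
print(out1 == "", out2)
```

Inside a body filter this would presumably be wired up as `ngx.arg[1] = filter_chunk(ngx.ctx, ngx.arg[1], ngx.arg[2], add_metadata)`, with `ngx.arg[2]` being the end-of-body flag. Because the body length changes, the Content-Length header would also need to be cleared in a header filter (`ngx.header.content_length = nil`). The trade-off is that the whole response is held in memory before anything is sent.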

Accidentally sent email previously not as plain text. Apologies.

Re: string.gsub missing matches over chunk boundary perhaps

szbnwer@gmail.com
hi there! :)

just a wild guess, but maybe u could check what happens when u
copypaste a working block to the end of the xml node list, and/or
check the target string in a hex editor for finding any invisible
chars (especially if the issue could repeat itself after a manual fix)

(((btw it is "Lua", cuz it isnt an abbreviation or whatever like, but
"moon" in Portuguese :D )))

good luck, have fun and all the bests to u! :)