string.gsub missing matches over chunk boundary perhaps

Fabian Hutchinson

I am calling Lua from Nginx to modify the response body using regex substitution with string.gsub.

The body is XML. There are two types of action: match a block of XML and remove it, or match a block and add something after it.

A sample of the Lua is shown below.

  ngx.arg[1] = string.gsub(ngx.arg[1], [[<Representation id="1" width.-</Representation>]], "")

  ngx.arg[1] = string.gsub(ngx.arg[1], [[(<AdaptationSet>.-</AdaptationSet>)]], [[%1
      <Metadata>
             xxxxxxxxx
      </Metadata>]])

The substitution that adds data sometimes misses a match. For example, a response with 16 matching blocks will end up with only 15 replacements.

I wondered if it was related to the substitution being run on individual chunks of the body rather than on the whole body or a stream (https://github.com/openresty/lua-nginx-module/issues/817). Does the failure happen when the string being matched spans two chunks? To test this, I changed the config to match just a closing XML tag, which is much smaller and sits on a single line. With this change it never fails.
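To illustrate what I suspect is happening, here is a minimal standalone Lua sketch (no Nginx involved). The sample body, the split position of 40, the short `<Metadata/>` replacement and the helper names `count_whole` and `count_chunked` are all made up for the illustration; it only shows that a pattern applied to each chunk separately cannot see a block that is cut in two.

```lua
-- Same pattern as the "add" substitution; the replacement is shortened.
local pattern = "(<AdaptationSet>.-</AdaptationSet>)"
local replacement = "%1<Metadata/>"

-- Hypothetical body with two matching blocks (32 characters each).
local body = "<AdaptationSet>a</AdaptationSet><AdaptationSet>b</AdaptationSet>"

-- Number of replacements when gsub sees the whole body at once.
local function count_whole(s)
  local _, n = string.gsub(s, pattern, replacement)
  return n
end

-- Number of replacements when the body is cut at split_at and
-- gsub is run on each piece separately, as a per-chunk filter would do.
local function count_chunked(s, split_at)
  local _, n1 = string.gsub(s:sub(1, split_at), pattern, replacement)
  local _, n2 = string.gsub(s:sub(split_at + 1), pattern, replacement)
  return n1 + n2
end

-- Position 40 falls inside the opening tag of the second block,
-- so neither piece contains that block in full.
print(count_whole(body), count_chunked(body, 40))  --> 2   1
```

If the real chunk boundaries land inside an AdaptationSet block in the same way, one missing replacement out of 16 is what I would expect to see.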

What I cannot understand is why the same problem that affects the substitution that adds content would not also affect the one that removes content, even though the pattern it matches is larger and therefore more likely to cross a chunk boundary.

Could it be because it is removing body data rather than adding to it?
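If chunking does turn out to be the cause, one workaround I am considering is to buffer the chunks and run the substitutions once on the complete body. Below is a rough sketch of that idea written as plain Lua so it can be run outside Nginx. The function name `filter_chunk` and the `ctx.parts` field are hypothetical: `ctx` stands in for `ngx.ctx`, `chunk` for `ngx.arg[1]`, `eof` for `ngx.arg[2]`, and the return value for what would be assigned back to `ngx.arg[1]`. The replacement text is shortened to `<Metadata/>`.

```lua
-- One call per body-filter invocation.
local function filter_chunk(ctx, chunk, eof)
  ctx.parts = ctx.parts or {}
  if chunk ~= "" then
    ctx.parts[#ctx.parts + 1] = chunk
  end
  if not eof then
    return ""  -- hold the data back until the last chunk arrives
  end

  -- Last chunk: join everything and substitute on the whole body.
  local body = table.concat(ctx.parts)
  ctx.parts = nil
  body = string.gsub(body, '<Representation id="1" width.-</Representation>', "")
  body = string.gsub(body, "(<AdaptationSet>.-</AdaptationSet>)", "%1<Metadata/>")
  return body
end

-- Usage: feed a hypothetical body in two pieces, split inside the second block.
local ctx = {}
local sample = "<AdaptationSet>a</AdaptationSet><AdaptationSet>b</AdaptationSet>"
local out = filter_chunk(ctx, sample:sub(1, 40), false)
         .. filter_chunk(ctx, sample:sub(41), true)
print(out)
```

I assume that in the real config the Content-Length header would also need clearing in a header filter (for example `ngx.header.content_length = nil`), since the body length changes, and that buffering the whole response costs memory for large bodies.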