[Proposal] Modified Anchor Behaviour for Global Patterns


Duane Leslie
I have just realised that I may never have posted this patch to the
list.  I have been using a modified behaviour for the start anchor '^'
in `gmatch` and `gsub`: I take it to mean that each match must begin
immediately after the preceding match.  I find this very useful for
ensuring that the pattern skips no unmatched characters.  It also lets
me stop iterating when the string reaches a section delimiter that
doesn't match the pattern.
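
To illustrate the intended semantics without the patch, here is a
rough emulation in stock Lua.  It is only a sketch: `anchored_gmatch`
is a hypothetical helper (not part of the patch or of Lua), it returns
whole matches rather than captures, and it assumes that a failed or
empty match ends the iteration.

```lua
-- Hypothetical helper: each match must start exactly where the
-- previous one ended.  Relies on string.find anchoring '^' at `init`.
local function anchored_gmatch(s, pattern)
  local pos = 1
  return function()
    local first, last = string.find(s, "^" .. pattern, pos)
    -- Stop on no match; also stop on an empty match (assumption),
    -- since it could never advance.
    if not first or last < first then return nil end
    pos = last + 1
    return string.sub(s, first, last)
  end
end

local input = "10,20,30;40,50"

-- Stock gmatch skips over the ';' and keeps going.
local stock = {}
for m in string.gmatch(input, "%d+,?") do stock[#stock + 1] = m end

-- The emulation stops at ';' because nothing matches there.
local anchored = {}
for m in anchored_gmatch(input, "%d+,?") do anchored[#anchored + 1] = m end

print(table.concat(stock, " "))     -- 10, 20, 30 40, 50
print(table.concat(anchored, " "))  -- 10, 20, 30
```

Here the ';' plays the role of the section delimiter: the anchored
iteration ends there instead of skipping ahead to the next match.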

This is a breaking change, but the old behaviour is still accessible.

For `gmatch` this changes the semantics only slightly.  Previously an
initial '^' was taken as a literal '^', so where the existing
behaviour is desired it is just a matter of escaping it as '%^', as
would be necessary anywhere else in a pattern.
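
As a small illustration, the escaped form below matches a literal
caret in stock Lua, and by the description above should be the
spelling to use under the patch as well (the sample string is made up
for the example).

```lua
-- '%^' always means a literal caret, wherever it appears in a pattern.
local found = {}
for m in string.gmatch("a^b ^c d", "%^%a") do
  found[#found + 1] = m
end

print(table.concat(found, " "))  -- ^b ^c
```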

For `gsub` the start anchor currently works, but it anchors against
the start of the string only, which prevents iteration over the
string.  The same behaviour can still be achieved by passing the 4th
argument to restrict the substitution to a single occurrence.  For
example, `string.gsub("hello", "^%a", string.upper)`, which uppercases
the first character of a string, would instead become
`string.gsub("hello", "^%a", string.upper, 1)`.
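
For reference, both spellings give the same result in stock Lua, so
code written with the explicit limit should behave the same with or
without the patch.  The comments show the unpatched results only; what
the three-argument call returns under the patch is not shown here.

```lua
-- Stock Lua: '^' anchors at the start, so at most one match occurs.
local r1, n1 = string.gsub("hello", "^%a", string.upper)

-- Explicit limit of one substitution via the 4th argument.
local r2, n2 = string.gsub("hello", "^%a", string.upper, 1)

print(r1, n1)  -- Hello   1
print(r2, n2)  -- Hello   1
```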

Anyway, I hope someone finds this useful, and it would be nice if this
made it into a future release.

Regards,

Duane.



Attachment: global_pattern_anchor.patch (1K)