Use atomic matching for complex non-backtracking

You can sometimes improve the performance of your regular expression by preventing parts of it from backtracking when you know that doing so might help. Item 38, “Avoid unnecessary backtracking,” covered several techniques for this, although it did not mention atomic matching (a feature added in v5.005).
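As a minimal sketch of the idea (the sample string and patterns are made up for illustration and are not taken from the full post), an atomic group `(?>...)` keeps whatever it matched and never gives any of it back. This shows the change in matching behavior rather than any measured speedup:

```perl
use v5.10;

my $string = 'aaab';

# Ordinary quantifier: a+ first grabs 'aaa', then gives one 'a' back
# so that the trailing 'ab' can match.
my $plain = $string =~ /a+ab/ ? 1 : 0;

# Atomic group: (?>a+) keeps every 'a' it matched, so no 'a' is left
# for the trailing 'ab' and the overall match fails.
my $atomic = $string =~ /(?>a+)ab/ ? 1 : 0;

say "plain:  ", $plain  ? 'match' : 'no match';
say "atomic: ", $atomic ? 'match' : 'no match';
```

The failing atomic match is the point: the regex engine gives up sooner because it has no saved positions inside the group to retry.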

Continue reading “Use atomic matching for complex non-backtracking”

Use alpha assertions for more understandable regexes

[This feature stabilizes in Perl v5.32]

Perl v5.28 adds more-readable, alternate spelled-out forms for some of its regular expression extended patterns. Then, to make those slightly less readable, there are very short initialisms for those. Although these might seem superfluous now, they give Perl the ability to define new syntax without relying on the limited number of ASCII symbols.
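A small sketch, using a made-up string, of the same positive lookahead written three ways: the symbolic form, the spelled-out form, and its initialism. The `no warnings` line is there because the feature is still experimental in v5.28 and v5.30:

```perl
use v5.28;
no warnings qw(experimental::alpha_assertions);

my $string = 'foobar';

# The traditional symbolic lookahead
my $symbolic = $string =~ /foo(?=bar)/ ? 1 : 0;

# The spelled-out form
my $spelled = $string =~ /foo(*positive_lookahead:bar)/ ? 1 : 0;

# The short initialism for the same assertion
my $short = $string =~ /foo(*pla:bar)/ ? 1 : 0;

say "symbolic=$symbolic spelled=$spelled short=$short";
```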

Continue reading “Use alpha assertions for more understandable regexes”

Perl v5.30 new features

Perl v5.29 is the development series leading up to the maintenance release v5.30 sometime in the middle of 2019. As it’s released—roughly monthly—you can get a peek at what’s coming up. You can track the progress by reading the perldelta documentation that comes with each Perl release (although you’ll need to select the development version you want to inspect).

Perl v5.30 lets you match more with the general quantifier

Does the {N,} really match infinite repetitions in a Perl regular expression? No, it never has. You’ve been limited to 32,766 repetitions. Perl v5.30 is about to double that for you. And, if you are one of the people who needed more, I’d like to hear your story.
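One way to see where the limit sits on your own perl is to try compiling patterns with different counts. This is only a probing sketch: the `quantifier_compiles` helper is a hypothetical name invented here, and the results depend on the perl version that runs it, so no expected output is shown:

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical helper: returns 1 if a{0,$count} compiles on this perl,
# 0 if the regex compiler rejects the quantifier.
sub quantifier_compiles {
    my ($count) = @_;
    my $regex = eval { qr/a{0,$count}/ };
    return defined $regex ? 1 : 0;
}

# Probe the values on either side of the old and new limits
for my $count ( 32_766, 32_767, 65_534, 65_535 ) {
    say "$count: ", quantifier_compiles($count) ? 'compiles' : 'rejected';
}
```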

Continue reading “Perl v5.30 lets you match more with the general quantifier”

Use @{^CAPTURE} to get a list of all the capture buffers

Perl v5.26 adds three new special variables related to captures. The @{^CAPTURE} is an array of all the capture buffers. %{^CAPTURE} is an alias for %+ and stores the actually-matched named capture labels as its keys. %{^CAPTURE_ALL} is an alias for %- and stores all the named capture labels and their matched (or not) values.
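Here is a short sketch with an invented input line and pattern. It copies the captures out of @{^CAPTURE} and %{^CAPTURE} right after a successful match, before another match can replace them:

```perl
use v5.26;

my $line = 'Buster:cat:12';

my( @fields, %named );
if( $line =~ /\A(?<name>\w+):(?<species>\w+):(\d+)\z/ ) {
    @fields = @{^CAPTURE};   # every capture buffer, in order
    %named  = %{^CAPTURE};   # only the named captures that matched
}

say join '|', @fields;
say join ',', map { join '=', $_, $named{$_} } sort keys %named;
```

Named captures are also numbered, so they show up in @{^CAPTURE} alongside the unnamed third group.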

Continue reading “Use @{^CAPTURE} to get a list of all the capture buffers”

Mix assignment and reference aliasing with declared_refs

Perl v5.26 adds the experimental declared_refs feature that expands on the experimental refaliasing feature from v5.22. As with all experimental features, this may change or disappear according to perlpolicy.
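A minimal sketch, with invented variable names, of declaring a lexical and aliasing it in one statement. Both features must be enabled, and their experimental warnings are turned off explicitly:

```perl
use v5.26;
use feature qw(refaliasing declared_refs);
no warnings qw(experimental::refaliasing experimental::declared_refs);

my @original = ( 1, 2, 3 );

# Declare @alias and make it another name for @original in one step
my \@alias = \@original;

# A change made through the alias shows up in the original array
push @alias, 4;

say "@original";
```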

Continue reading “Mix assignment and reference aliasing with declared_refs”

Use Unicode 10 in Perl v5.28

Perl v5.28 updates to Unicode 10. There are 8,518 new characters, 7,473 of which are in the CJK extension. There are 56 new emojis. And, the Bitcoin symbol, ₿. It adds a T. rex, 🦖, but we’re still waiting for a raptor. To Perl they are just characters like any other so you don’t need anything new to deal with them.
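As a quick check that nothing special is required, this sketch uses the core charnames module to look up two of the new characters by code point and by name. It assumes a perl whose Unicode data is version 10 or later:

```perl
use v5.28;
use charnames ();

# Code point to name: U+20BF is the Bitcoin symbol
my $bitcoin_name = charnames::viacode(0x20BF);

# Name to code point: the T. rex emoji is named T-REX
my $trex_code = charnames::vianame('T-REX');

say $bitcoin_name;
say sprintf 'U+%04X', $trex_code;
```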

Continue reading “Use Unicode 10 in Perl v5.28”

Find the new emojis in Perl’s Unicode support

Perl v5.26 updates itself to Unicode 9. That’s not normally exciting news but people have been pretty enthusiastic about the 72 new emojis that come with it. As far as Perl cares, they are just valid code points like all of the other ones.

Continue reading “Find the new emojis in Perl’s Unicode support”

Initialize array and hash variables with state

Perl v5.28 allows you to initialize array and hash variables that you declare with state. This is a feature that was a long time coming, and I’m quite happy that it has finally arrived.
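A small sketch of the new form, using an invented `next_color` subroutine. The state array gets its initial list once, the first time the subroutine runs, and keeps its value between calls:

```perl
use v5.28;

# Hypothetical example subroutine: cycles through a fixed list
sub next_color {
    state @colors = qw(red green blue);   # initialized only once
    state $index  = 0;
    return $colors[ $index++ % @colors ];
}

my @picked = map { next_color() } 1 .. 4;
say "@picked";
```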

Since v5.10 and up to v5.26 you could only initialize a state variable if it was a scalar. You could declare a hash or array variable but you couldn’t give it an initial value at the same time. You could do this:

Continue reading “Initialize array and hash variables with state”