The Effective Perler – Page 2 – Effective Perl Programming

Perl v5.32 new features

Perl v5.32 is out and it has some interesting new features. The previous major releases focussed more on finally removing deprecations and shoring up odd cases, and you still find a few of those in this release. Full details, as always, are in the perldelta.

Sawyer X just announced Perl 7 as a major version jump that relabels what is now v5.32. If your code is ready for v5.32, you should be mostly ready for Perl 7.

The new isa infix operator (“class instance”)
The streamzip program comes with IO::Compress::Base, so it comes with perl
Regex alpha assertions are stable
Script runs are no longer experimental
Alpha assertions are no longer experimental
Match Unicode names with a pattern

Mixed mode access to undef uses a temporary file

"0" .. "-1" is fixed
Modifiable contexts in constants now throw an exception
User-defined properties override Unicode ones of the same name
Use chained comparison operators
Disable indirect object notation

Use the infix class instance operator

Perl v5.32 adds Paul Evans’s infix isa operator, the “class instance operator”, as an experimental feature. It still has some issues to work out that prevent its use at the moment, but it looks promising. It subverts how UNIVERSAL::isa does its job and breaks that in the process. As an experimental feature, that’s fine, but you shouldn’t use this until that’s worked out.
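
As a rough illustration of the operator's syntax, here is a minimal sketch. The Animal and Dog classes and the is_animal helper are hypothetical names made up for this example, and the code assumes Perl v5.32 or later with the experimental warning switched off.

```perl
use v5.32;
use warnings;
use feature 'isa';
no warnings 'experimental::isa';    # the feature is experimental in v5.32

# Hypothetical classes, for illustration only
package Animal { sub new { my ($class) = @_; return bless {}, $class } }
package Dog    { our @ISA = ('Animal') }

# Returns 1 only when $thing is an object whose class inherits from Animal.
# Unlike calling $thing->isa('Animal'), the infix form does not die when
# $thing is undef or an unblessed reference.
sub is_animal {
    my ($thing) = @_;
    return ( $thing isa Animal ) ? 1 : 0;
}

say 'Dog object:      ', is_animal( Dog->new );
say 'undef:           ', is_animal(undef);
say 'plain string:    ', is_animal('Dog');
say 'array reference: ', is_animal( [] );
```

Only the first line should report 1, since the left-hand side has to be a blessed reference for the operator to return true.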

There’s no word on versions for can or does.

One of the delightful things to note about this addition is that it is one of the features whose development took place almost entirely through a GitHub issue and pull request. GitHub is now the primary repository for the Perl code, and has been since October 2019. This is a feature that I’ll want to use right away in new production code.

Perl 5.30 fixes single quoted qr'' with \N{}

The qr// operator allows you to compile a regular expression without applying it to anything. You get the pattern without the match, and you can reuse the pattern as often as you like. Before v5.30, it had an inconsistency with \N{} sequences, but that’s fixed now.
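
A small sketch of what the fix allows, assuming Perl v5.30 or later. The variable names and the sample string are made up for the example.

```perl
use v5.30;
use warnings;

# Single-quote delimiters turn off variable interpolation in the pattern.
# From v5.30, the regex compiler still resolves the \N{...} name itself.
my $e_acute = qr'\N{LATIN SMALL LETTER E WITH ACUTE}';

# "\x{E9}" is the code point for that character.
my $matched = ( "caf\x{E9}" =~ $e_acute ) ? 1 : 0;

# The compiled pattern is reusable against any number of strings.
my $plain = ( 'cafe' =~ $e_acute ) ? 1 : 0;

say "with accent:    $matched";
say "without accent: $plain";
```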

No more false postfix lexical declarations in v5.30

Before Perl v5.10 introduced state variables, people did various things to create persistent lexical variables for a subroutine. With v5.30, one of those constructs is now a fatal error.

Often you want a persistent variable to be scoped and private to a subroutine. But, once you leave that scope, normal lexical variables disappear because their reference count drops to zero. So, no persistence.
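
For comparison, here is a sketch of two ways to get a private, persistent variable that still work: a state variable (v5.10 and later) and the older closure-in-a-block idiom. The subroutine names are invented for the example.

```perl
use v5.10;
use strict;
use warnings;

# A state variable is initialized once and keeps its value between calls.
sub next_id {
    state $id = 0;
    return ++$id;
}

# The older idiom: a named subroutine closes over a lexical in a bare block.
# $count stays alive because next_count() still refers to it.
{
    my $count = 0;
    sub next_count { return ++$count }
}

# The construct that v5.30 rejects looked like this (left commented out):
#     my $n if 0;

say join ' ', map { next_id() } 1 .. 3;
say join ' ', map { next_count() } 1 .. 3;
```

Both counters keep counting across calls, and neither variable is visible outside its subroutine.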

Match only the same Unicode script

Earlier this year, this website was the target of some sort of attack in which a bot sent seemingly random data in its requests. The attack wasn’t that big of a deal since I easily blocked it with Cloudflare, but it was interesting. The apparently random data was actually a mix of Latin, Hangul, and Cyrillic. “Domain hacks with unusual Unicode characters” shows some of these exploits. Curiously, v5.28 added a regex feature that deals with this sort of nonsense.
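
The feature in question is the script run. A minimal sketch, assuming Perl v5.32 (where script runs are no longer experimental); the single_script helper and the sample strings are hypothetical.

```perl
use v5.32;
use warnings;

# Returns 1 when every word character in $text comes from one script.
# The \A and \z anchors stop the engine from settling for a shorter run.
sub single_script {
    my ($text) = @_;
    return ( $text =~ /\A(*script_run:\w+)\z/ ) ? 1 : 0;
}

my $latin = 'paypal';
my $mixed = "p\x{430}ypal";    # U+0430 is CYRILLIC SMALL LETTER A

say 'all Latin:          ', single_script($latin);
say 'Latin and Cyrillic: ', single_script($mixed);
```

The second string looks the same as the first in many fonts, but the script run rejects it because it mixes Latin and Cyrillic letters.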

Use atomic matching for complex non-backtracking

You can sometimes improve the performance of your regular expression by preventing parts of it from backtracking when you know that backtracking won’t help. Item 38, “Avoid unnecessary backtracking,” had many techniques for this, although it did not mention atomic matching (a feature added in v5.005).
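
A tiny sketch of the difference an atomic group makes. The sample string is contrived to show the behavior rather than any speed gain.

```perl
use v5.10;
use strict;
use warnings;

my $text = 'aaab';

# Ordinary group: a+ first grabs 'aaa', then gives one 'a' back so that
# the trailing 'ab' can match.
my $backtracking = ( $text =~ /a+ab/ ) ? 1 : 0;

# Atomic group: once (?>a+) has matched, it never gives characters back,
# so there is no 'a' left for the 'ab' that follows.
my $atomic = ( $text =~ /(?>a+)ab/ ) ? 1 : 0;

say "a+ab     matches: $backtracking";
say "(?>a+)ab matches: $atomic";
```

The atomic version fails here on purpose. In a real pattern you would use it where giving characters back could never produce a match, so the engine can fail faster.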

Use alpha assertions for more understandable regexes

[This feature stabilizes in Perl v5.32]

Perl v5.28 adds more-readable, alternate spelled-out forms for some of its regular expression extended patterns. Then, to make those slightly less readable, there are very short initialisms for those. Although these might seem superfluous now, they give Perl the ability to define new syntax without relying on the limited number of ASCII symbols.
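
As a quick sketch, here are three spellings of the same positive lookahead: the classic symbol form, the spelled-out alpha assertion, and its initialism. It assumes Perl v5.32, where these no longer warn as experimental; the agrees_on helper is a made-up name.

```perl
use v5.32;
use warnings;

my @spellings = (
    qr/foo(?=bar)/,                      # classic symbol form
    qr/foo(*positive_lookahead:bar)/,    # spelled-out alpha assertion
    qr/foo(*pla:bar)/,                   # short initialism
);

# Returns 1 if every spelling gives the same match result for $text.
sub agrees_on {
    my ($text) = @_;
    my %seen = map { ( $text =~ $_ ? 1 : 0 ) => 1 } @spellings;
    return ( keys %seen == 1 ) ? 1 : 0;
}

for my $text (qw( foobar foobaz )) {
    my @results = map { $text =~ $_ ? 'match' : 'no match' } @spellings;
    say "$text: @results";
}
```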

Perl v5.30 new features

Perl v5.29 is the development series leading up to the maintenance release v5.30 sometime in the middle of 2019. As each development version is released, roughly monthly, you can get a peek at what’s coming up. You can track the progress by reading the perldelta documentation that comes with each Perl release (although you’ll need to select the development version you want to inspect).

You can no longer declare lexical variables with false postfix conditionals
The generalized quantifier can match more
Delimiters must be graphemes
File::Glob::glob() will disappear
Some uses of an unescaped left brace “{” in a regex will be illegal
Upgrades to Unicode 12
Previously deprecated sysread()/syswrite() on :utf8 handles now fatal
qr'\N{name}' is now supported
There’s limited, experimental support for variable-width lookbehinds
Match Unicode property values with a wildcard

Perl v5.30 lets you match more with the general quantifier

Does the {N,} really match infinite repetitions in a Perl regular expression? No, it never has. You’ve been limited to 32,766 repetitions. Perl v5.30 is about to double that for you. And, if you are one of the people who needed more, I’d like to hear your story.
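
If you want to see where your own perl draws the line, a probe like this sketch can help. The quantifier_compiles helper is a hypothetical name, and the counts to try are only suggestions based on the limits mentioned above.

```perl
use v5.10;
use strict;
use warnings;

# Returns 1 if this perl accepts a{N,} for the given N, and 0 if the
# regex compiler rejects the quantifier as too large.
sub quantifier_compiles {
    my ($n) = @_;
    my $pattern = sprintf 'a{%d,}', $n;
    my $regex   = eval { qr/$pattern/ };
    return defined $regex ? 1 : 0;
}

# The results for the larger values depend on your perl version.
for my $n ( 100, 32_766, 32_767, 65_534, 65_535 ) {
    say "a{$n,} compiles: ", quantifier_compiles($n);
}
```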

Use @{^CAPTURE} to get a list of all the capture buffers

Perl v5.26 adds three new special variables related to captures. @{^CAPTURE} is an array of all the capture buffers. %{^CAPTURE} is an alias for %+ and stores the actually-matched named capture labels as its keys. %{^CAPTURE_ALL} is an alias for %- and stores all the named capture labels and their matched (or not) values.
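
A short sketch of the three variables after one successful match, assuming Perl v5.26 or later. The sample string, the capture labels, and the variable names are made up for the example.

```perl
use v5.26;
use warnings;

my $string = 'Buster Mimi Roscoe';
my ( @captures, %named, %all_named );

if ( $string =~ /(?<first>\w+)\s+(\w+)\s+(?<last>\w+)/ ) {
    @captures  = @{^CAPTURE};        # every capture buffer, in order
    %named     = %{^CAPTURE};        # same contents as %+
    %all_named = %{^CAPTURE_ALL};    # same contents as %-
}

say "All captures: @captures";
say "Named labels: ", join ' ', sort keys %named;

# The values in %{^CAPTURE_ALL} are array references, one slot per group
# that uses that label.
say "First value for 'last': $all_named{last}[0]";
```

Copying the variables inside the if block keeps the values safe from later matches that would overwrite them.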