Perl v5.22 adds fancy Unicode word boundaries

Perl v5.22’s regexes added four Unicode boundaries to go along with the vanilla “word” boundary, \b, that you’ve been using for years. These new assertions aren’t going to match perfectly with your expectations of human languages (the holy grail of natural language processing), but they do okay-ish. Although these appear in v5.22.0, as a late edition to the language they were partially broken in the initial release. They were fixed for v5.22.1. Continue reading “Perl v5.22 adds fancy Unicode word boundaries”

Lexical $_ and autoderef are gone in v5.24

Two features that I have previously discouraged are now gone from Perl. The lexical $_ and auto dereferencing.

The lexical $_ was a consequence of the way Perl wanted smart match to work. In a given-when, instead of aliasing $_ like foreach does, the block had an implicit my $_ = .... This interfered with the package version, as I wrote about in Use for() instead of given() and Perl v5.16 now sets proper magic on lexical $_. Continue reading “Lexical $_ and autoderef are gone in v5.24”

Postfix dereferencing is stable is v5.24

Perl’s dereferencing syntax might be, or even should be, responsible for people’s disgust at the language. In v5.20, Perl added the experimental postfix dereferencing syntax that made this analogous to method chaining. This is one of the most pleasing Perl features I’ve encountered in years. Continue reading “Postfix dereferencing is stable is v5.24”

No more -no_match_vars

The English module translates Perl’s cryptic variable names to English equivalents. For instance, $_ becomes $ARG. This means that the match variable $& becomes $MATCH. This also means that using the English module triggered the performance issue associated with the match variables $`, $&, and $' even if you didn’t use those variables yourself—the module used them for you. The Devel::NYTProf debugger had a sawampersand feature to tell you one of those variables appeared in the code. We covered this in Item 33. Watch out for the match variables. Continue reading “No more -no_match_vars”

Use subroutine signatures with anonymous subroutines

While working on my excellent number project, I created a subroutine that took a callback as an argument. When I dereferenced the callback I wanted to supply arguments. Since I was using Perl v5.22, I tried using a subroutine signature with it. Continue reading “Use subroutine signatures with anonymous subroutines”

Use /aa to get ASCII semantics in regexes, for reals this time

When Perl made regexes more Unicode aware, starting in v5.6, some of the character class definitions and match modifiers changed. What you expected to match \d, \s, or \w are more expanvise now (Know your character classes under different semantics). Most of us probably didn’t notice because the range of our inputs is limited. Continue reading “Use /aa to get ASCII semantics in regexes, for reals this time”