Use /aa to get ASCII semantics in regexes, for reals this time

When Perl made regexes more Unicode aware, starting in v5.6, some of the character class definitions and match modifiers changed. What you expect \d, \s, or \w to match is more expansive now (Know your character classes under different semantics). Most of us probably didn’t notice because the range of our inputs is limited. » Read more…
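As a rough illustration of the difference, the sketch below matches a few Thai digits and the Kelvin sign with and without the ASCII flags. The variable names are made up for this example, and it assumes Perl v5.14 or later, where /a and /aa are available.

```perl
use v5.14;

# Three Thai digits (U+0E51 to U+0E53), written as escapes to avoid
# source-encoding issues.
my $thai_digits = "\x{0E51}\x{0E52}\x{0E53}";

# Without /a, \d matches any Unicode decimal digit.
my $unicode_digits = $thai_digits =~ /\A\d+\z/  ? 1 : 0;

# With /a, \d is restricted to the ASCII digits 0-9.
my $ascii_digits   = $thai_digits =~ /\A\d+\z/a ? 1 : 0;

# KELVIN SIGN (U+212A) case-folds to "k". A single /a still allows that
# match under /i; doubling it to /aa forbids ASCII/non-ASCII matches.
my $kelvin    = "\x{212A}";
my $kelvin_a  = $kelvin =~ /\Ak\z/ai  ? 1 : 0;
my $kelvin_aa = $kelvin =~ /\Ak\z/aai ? 1 : 0;

say "Thai digits, default: $unicode_digits, with /a: $ascii_digits";
say "Kelvin sign, /ai: $kelvin_a, /aai: $kelvin_aa";
```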

Turn capture groups into cluster groups

Perl v5.22 adds the /n regex flag that turns all parenthesized groups in its scope into non-capturing groups. This can be handy when you want to capture almost nothing but still need to cluster many parts. You do less typing to get that. » Read more…
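Here is a small sketch of what that looks like, assuming Perl v5.22 or later. The date string and variable names are invented for the example. Named captures are the one kind of group that still captures under /n.

```perl
use v5.22;

my $date = '2015-06-01';

# Without /n, every parenthesized group captures.
my @with_captures = $date =~ /\A(\d+)-(\d+)-(\d+)\z/;

# With /n, the same parentheses only cluster, so $1 is not set.
my $clustered_ok      = $date =~ /\A(\d+)-(\d+)-(\d+)\z/n ? 1 : 0;
my $plain_has_capture = defined $1 ? 1 : 0;

# A named group still captures under /n.
my $month;
if ( $date =~ /\A(\d+)-(?<month>\d+)-(\d+)\z/n ) {
    $month = $+{month};
}

say "Captured without /n: @with_captures";
say "Month from the named capture under /n: $month";
```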

Make bitwise operators always use numeric context

Most Perl operators force their context on the values. For example, the numeric addition operator, +, forces its values to be numbers. To “add” strings, you use a separate operator, the string concatenation operator, . (which looks odd at the end of a sentence).

The bitwise operators, however, look at their values to determine their context. If either value has a numeric component, the bitwise operators do numeric things. If both values are strings, the bit operators become string operators. That’s certainly one of Perl’s warts, which I’ll fix at the end of this article with a new feature from v5.22. » Read more…
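The sketch below shows the value-dependent behavior and then the v5.22 bitwise feature, which was experimental in that release. It assumes Perl v5.22 or later, and the variable names are only for illustration.

```perl
use v5.22;

# Traditional behavior: the operands decide what | does.
my $both_numbers = 150   | 105;     # numeric OR
my $one_string   = '150' | 105;     # still numeric, one operand is a number
my $both_strings = '150' | '105';   # string OR, character by character

my ( $forced_numeric, $forced_string );
{
    # The bitwise feature is lexically scoped and was experimental in v5.22.
    use feature 'bitwise';
    no warnings 'experimental::bitwise';

    $forced_numeric = '150' |  '105';   # | is always numeric here
    $forced_string  = '150' |. '105';   # |. is always a string operator
}

say "numbers: $both_numbers, mixed: $one_string, strings: $both_strings";
say "with the feature: $forced_numeric and $forced_string";
```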

Perl v5.22 adds hexadecimal floating point literals

You can specify literal hexadecimal floating-point numbers in v5.22, just as you can in C99, Java, Ruby, and other languages. Perl, which uses doubles to store floating-point numbers, can represent a limited set of values. Up to now, you’ve had to specify those floating point numbers in decimal, hoping that a double could exactly represent that number. That hope, sometimes unfounded, is the basis for the common newbie question about floating point errors. » Read more…
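A minimal sketch of the literal syntax, assuming Perl v5.22 or later. The digits after 0x are the hexadecimal mantissa and the number after p is a power-of-two exponent. The variable names are invented for the example.

```perl
use v5.22;

# 0x1p-1 is 1 * 2**-1
my $half = 0x1p-1;

# 0x1.8p1 is (1 + 8/16) * 2**1
my $three = 0x1.8p1;

# The %a format shows which hexadecimal value a decimal literal really
# stored. The exact digits depend on how your perl was built.
my $tenth_in_hex = sprintf '%a', 0.1;

say "half: $half, three: $three";
say "0.1 is stored as $tenth_in_hex";
```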

Use Perl 5.22’s <<>> operator for safe command-line handling

We’ve had the three-argument open since Perl 5.6. This allows you to separate the way you want to interact with the file from the filename.

Old Perl requires you to include the mode and filename together, giving Perl the opportunity to interpret what you mean: » Read more…
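As a hedged sketch of the two ideas in this teaser, the example below writes a scratch file with a three-argument open and then reads it back through <<>>, which treats each element of @ARGV as a literal filename. It assumes Perl v5.22 or later. The temporary directory, the file name, and the variable names are all made up for the example.

```perl
use v5.22;
use File::Temp qw(tempdir);
use File::Spec;

# A scratch file so the example does not touch real data.
my $dir  = tempdir( CLEANUP => 1 );
my $file = File::Spec->catfile( $dir, 'input.txt' );

# Three-argument open: the mode is separate from the filename, so
# characters in the name are never read as a mode.
open my $out, '>', $file or die "Could not write $file: $!";
print {$out} "first\nsecond\n";
close $out or die "Could not close $file: $!";

# <<>> reads the files named in @ARGV using three-argument open semantics.
my @lines;
{
    local @ARGV = ($file);
    while ( defined( my $line = <<>> ) ) {
        chomp $line;
        push @lines, $line;
    }
}

say "Read: @lines";
```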

Perl 5.22 new features

The first Perl 5.22 release candidate is out and there are some new operators and many enhancements to regular expressions that look interesting, along with some improvements that don’t require any work from you. Some of the features are experimental, so be careful that you don’t create problems by overusing them until they settle down. » Read more…

Use v5.20 subroutine signatures

Subroutine signatures, in a rudimentary form, have shown up in Perl v5.20 as an experimental feature. After a few years of debate and a couple of competing implementations, we have something we can use. And, because it was such a contentious subject, it got the attention a new feature deserves. They don’t have all the features we want (notably type and value constraints), but Perl is in a good position to add those later. » Read more…
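For a sense of the syntax, here is a small sketch assuming Perl v5.20 or later, with the experimental warning turned off. The area subroutine is a hypothetical example, not something from the article.

```perl
use v5.20;
use feature 'signatures';
no warnings 'experimental::signatures';

# Two positional parameters. The second defaults to the first,
# so a single argument describes a square.
sub area ( $width, $height = $width ) {
    return $width * $height;
}

say area( 3, 4 );
say area( 5 );
```

Calling area with no arguments, or with more than two, dies at run time because the signature also checks the argument count.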

Perl v5.18 adds character class set operations

Perl v5.18 added experimental character class set operations, a requirement for full Unicode support according to Unicode Technical Standard #18, which specifies what a compliant language must support and divides those requirements into three levels.

The perlunicode documentation lists each requirement and its status in Perl. Besides some regular expression anchors handling all forms of line boundaries (which might break older programs), set subtraction and intersection in character classes was the last feature Perl needed to be Level 1 compliant. » Read more…
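A brief sketch of subtraction and intersection with the (?[ ... ]) syntax, assuming Perl v5.18 or later. The pattern names are invented for the example, and the warning line is there because the feature was experimental when it arrived.

```perl
use v5.18;
no warnings 'experimental::regex_sets';

# Subtraction: lowercase ASCII letters minus the vowels.
my $consonants = qr/\A(?[ [a-z] - [aeiou] ])+\z/;

# Intersection: characters that are both Thai and digits.
my $thai_digit = qr/\A(?[ \p{Thai} & \p{Digit} ])+\z/;

my $bcd_ok  = 'bcd'       =~ $consonants ? 1 : 0;
my $bad_ok  = 'bad'       =~ $consonants ? 1 : 0;
my $thai_ok = "\x{0E51}"  =~ $thai_digit ? 1 : 0;   # THAI DIGIT ONE
my $one_ok  = '1'         =~ $thai_digit ? 1 : 0;

say "bcd: $bcd_ok, bad: $bad_ok, Thai digit: $thai_ok, ASCII 1: $one_ok";
```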

Don’t use named lexical subroutines

Perl v5.18 allows you to define named subroutines that exist only in the current lexical scope. These act (almost) just like the regular named subroutines that you already know about from Learning Perl, but also like the lexical variables that have limited effect. The problem is that the feature is almost irredeemably broken, which you’ll see at the end of this Item. » Read more…

Enforce ASCII semantics when you only want ASCII

When Perl made regexes more Unicode aware, starting in v5.6, some of the character class definitions and match modifiers changed. What you expect \d, \s, or \w to match is more expansive now (Know your character classes under different semantics). Most of us probably didn’t notice because the range of our inputs is limited. » Read more…
