miscellany – The Effective Perler

Iterate over multiple elements at the same time

This is a chapter in Perl New Features, a book from Perl School that you can buy on LeanPub or Amazon. Your support helps me to produce more content.

This feature was promoted to a stable version in v5.40.

Perl v5.36 adds experimental support that allows a foreach (or for) loop to iterate over multiple values at the same time by specifying multiple control variables. This is incredibly cool:

use v5.36;
use experimental qw(for_list);

my @animals = qw( Buster Mimi Ginger Nikki );
foreach my( $s, $t ) ( @animals ) {
	say "$s ^^^ $t";
	}

The output shows two iterations of the loop, each of which grabbed two values from the list:

Buster ^^^ Mimi
Ginger ^^^ Nikki

Add another parameter; the list now doesn’t divide evenly between the parameters, so any parameter that can’t match with a list item gets undef, just like normal list assignment:

use v5.36;
use experimental qw(for_list);

foreach my( $s, $t, $u ) ( @animals ) {
	say "$s ^^^ $t ^^^ $u";
	}

Since use v5.36 also turns on warnings, you get those “uninitialized” warnings for free when you use those undef values:

Buster ^^^ Mimi ^^^ Ginger
Nikki ^^^  ^^^
Use of uninitialized value ...
Use of uninitialized value ...
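
If you would rather substitute a placeholder than see those warnings, the defined-or operator can fill in the gaps. This is only a sketch: the triples subroutine and the '-' placeholder are names I made up for illustration, not part of the feature:

```perl
use v5.36;
use experimental qw(for_list);

# Group a list three at a time. Any slot that the list could not
# fill arrives as undef, so swap in a placeholder before joining.
sub triples ( @items ) {
	my @lines;
	foreach my( $s, $t, $u ) ( @items ) {
		push @lines, join ' ^^^ ', map { $_ // '-' } $s, $t, $u;
		}
	return @lines;
	}

say for triples( qw( Buster Mimi Ginger Nikki ) );
```

The second line of output should have the placeholder in the two empty slots instead of triggering the "uninitialized" warnings.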

Another interesting use combines this with the new builtin::indexed function, which gets you the index and value at the same time:

use v5.36;
use experimental qw(for_list builtin);
use builtin qw(indexed);

my @animals = qw( Buster Mimi Ginger Nikki );
foreach my( $i, $value ) ( indexed(@animals) ) {
	say "$i: $value";
	}

That’s a bit nicer than going through the indices to access the value in an additional statement:

foreach my $i ( 0 .. $#animals ) {
	my $value = $animals[$i];
	say "$i: $value";
	}

No placeholders (yet)

So far, this new syntax doesn’t have a way to skip values. In a normal list assignment, you discard a value coming from the right hand list with a literal undef:

my( $s, undef, $t ) = @animals;

Try that in the for list and you get a syntax error:

foreach my( $s, undef, $u ) ( @animals ) {  # ERROR!
	say "$s ^^^ $u";
	}
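
Until there is a placeholder, one workaround is to name a variable and then ignore it. This sketch assumes you want the first and third value of each group of three; first_and_third and $unused are hypothetical names:

```perl
use v5.36;
use experimental qw(for_list);

# Take three values per iteration but use only the first and third.
# $unused is an ordinary lexical that stands in for the missing
# undef placeholder.
sub first_and_third ( @items ) {
	my @pairs;
	foreach my( $s, $unused, $u ) ( @items ) {
		push @pairs, "$s ^^^ $u";
		}
	return @pairs;
	}

say for first_and_third( 'a' .. 'f' );
```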

Hash keys and values

I’m tempted to use this for hashes, although each inside a while is still probably better since it doesn’t have to build the entire input list in one go:

use v5.36;
use experimental qw(for_list);

my %animals = (
	cats => [ qw( Buster Mimi Ginger ) ],
	dogs => [ qw( Nikki ) ],
	);

foreach my( $k, $v ) ( %animals ) {
	say "$k ^^^ @$v";
	}
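
For comparison, here is a sketch of the each-inside-while version with the same data. It fetches one pair per iteration instead of flattening the whole hash first; the @lines array exists only so the output can be sorted, since hash order is not predictable:

```perl
use v5.12;
use warnings;

my %animals = (
	cats => [ qw( Buster Mimi Ginger ) ],
	dogs => [ qw( Nikki ) ],
	);

# each returns the next key-value pair, and an empty list once the
# hash is exhausted, which ends the while.
my @lines;
while( my( $k, $v ) = each %animals ) {
	push @lines, "$k ^^^ @$v";
	}

say for sort @lines;
```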

Since those hash values are array refs, it would be helpful if this feature could use the refaliasing and declared_refs features (Mix assignment and reference aliasing with declared_refs):

use v5.36;
use experimental qw(for_list);
use experimental qw(refaliasing declared_refs);

my %animals = (
	cats => [ qw( Buster Mimi Ginger ) ],
	dogs => [ qw( Nikki ) ],
	);

foreach my( $k, \@v ) ( %animals ) {
	say "$k ^^^ @v";
	}

Sadly, the parser doesn’t expect the reference operator inside that for list:

syntax error ... near ", \"
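
A workaround is to take the reference as a plain scalar and dereference it inside the block. This sketch copies the array, so unlike refaliasing, changes to @v would not affect the original data:

```perl
use v5.36;
use experimental qw(for_list);

my %animals = (
	cats => [ qw( Buster Mimi Ginger ) ],
	dogs => [ qw( Nikki ) ],
	);

my @lines;
foreach my( $k, $v ) ( %animals ) {
	my @v = $v->@*;    # a copy of the array, not an alias to it
	push @lines, "$k ^^^ @v";
	}

say for sort @lines;
```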

Doing this before v5.36

Prior to builtin multiple iteration, the best way to do the same thing was probably the List::MoreUtils module (not part of core). Its natatime function, which I wish were named n_at_a_time, returns an iterator. Each call to the iterator grabs the number of elements that you specify and returns them as a list. Since it returns a list instead of an array reference, it’s easy to use with a while:

use List::MoreUtils qw(natatime);

my @x = ('a' .. 'g');
my $iterator = natatime 3, @x;

while( my @vals = $iterator->() ) {
	print "@vals\n";
	}

Another approach uses splice. The easiest thing might be to do it destructively since that requires no index fiddling:

my @x = 'a' .. 'g';
my @temp = @x;

while( my @vals = splice @temp, 0, 3, () ) {
	print "@vals\n";
	}

Here’s an example from the splice documentation that does the same thing:

sub nary_print {
  my $n = shift;
  while (my @next_n = splice @_, 0, $n) {
	say join q{ -- }, @next_n;
  }
}

nary_print(3, qw(a b c d e f g h));
# prints:
#   a -- b -- c
#   d -- e -- f
#   g -- h

Playing with the array indices can get this done, but it comes with a lot of baggage. First, an array slice doesn’t return an empty list when you run off the end of the array, so you can’t use it as the condition in the while as in the previous examples. Since it fills in the missing elements with undef, outputting the values possibly comes with warnings. Even if you want to accept those annoyances, you still have to manage the end-of-array condition ($#x) yourself:

my @x = 'a' .. 'g';

my $start = 0;
my $n     = 3;

while( $start <= $#x ) {
	no warnings qw(uninitialized);
	my @vals = @x[$start .. $start + $n - 1];
	print "@vals\n";
	$start += $n;
	}
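
If you do go the index route, clamping the end of each slice to the last index avoids both the undef padding and the need to turn off warnings. This is a sketch under that assumption; chunks is a made-up name, and min comes from the core List::Util module:

```perl
use v5.12;
use warnings;
use List::Util qw(min);

# Return a list of array references, each holding up to $n items.
# The end index never runs past the last element, so the final
# chunk is simply shorter rather than padded with undef.
sub chunks {
	my( $n, @items ) = @_;
	my @chunks;
	for( my $start = 0; $start <= $#items; $start += $n ) {
		my $end = min( $start + $n - 1, $#items );
		push @chunks, [ @items[ $start .. $end ] ];
		}
	return @chunks;
	}

say "@$_" for chunks( 3, 'a' .. 'g' );
```

That is still more bookkeeping than the splice version, which is the point of the comparison.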

So yeah, having a multiple iterator feature built into Perl is a huge win.

Summary

The experimental for_list feature lets you take multiple elements of the list in each iteration. This doesn't yet handle many of the list assignment features that would make this as useful as people will want it to be.

From the Perl documentation

perlsyn

Insignificant whitespace in brace constructs

Perl’s coterie of brace constructs becomes a bit more lenient in v5.34. These things appear in double-quotish constructs, such as \N{CHARNAME} to specify a character by name. Patterns also count as a double-quoted construct (unless you use ' as the delimiter), so these new rules apply to brace constructs such as \k{} (for named backreferences) and the general quantifier, {n,m}.

Insignificant leading or trailing whitespace in brace constructs

Specifying characters

These constructs apply in double-quotish contexts to specify a character by its code point or name:

Construct	Description	Item
`\N{CHARNAME}`	Character name	item
`\o{177}`	Octal code point	item
`\x{ABCD}`	Hex code point

There are already loose names for \N{} that ignore whitespace (item), but this feature is a bit different. It ignores horizontal whitespace around a value (but not inside a value):

use v5.34;
use open qw(:std :utf8);

say <<~"HERE";
	Spade suit: \N{ BLACK SPADE SUIT }
	Octal:      \o{ 23140 }
	Hex:        \x{ 2660 }
	HERE

This outputs the character we expect:

$ perl5.34.0 whitespace.pl
Spade suit: ♠
Octal:      ♠
Hex:        ♠

If you add space within the value, you don't get the character you want (the \N{} will actually fail):

use v5.34;
use open qw(:std :utf8);

say <<~"HERE";
	Octal:    \o{ 231 40 }
	Hex:      \x{ 26 60 }
	HERE

Perl discards everything from the first non-digit character onward (just like its string-to-number conversions), so this is effectively:

use v5.34;
use open qw(:std :utf8);

say <<~"HERE";
	Octal:    \o{ 231 }
	Hex:      \x{ 26 }
	HERE

It's even worse. You can add extra nonsense after the code number and v5.34 will ignore it. Although these have illegal digits (along with the internal space), they still work:

use v5.34;
use open qw(:std :utf8);
use warnings;

say <<~"HERE";
	Octal:    \o{ 231 abc }
	Hex:      \x{ 26 xyz }
	HERE

With trailing tabs or spaces, the warning says that perl ignores the cruft and uses what it has read so far:

Non-octal character ' ' terminates \o early.  Resolved as "\o{231}" at ...

With leading tabs or spaces, earlier Perls give up right away and use the null character. The warning from v5.32 is this:

Non-octal character ' ' terminates \o early.  Resolved as "\o{000}" at...

Finally, the whitespace can't be vertical space or other double-quote escapes (it's just literal tabs or spaces). These don't work:

\o{\t231}
\o{
	231 }

In regular expressions, this fails before Perl interprets the pattern, where the /x would be able to handle the vertical whitespace. This would match a null byte because the string-to-number parsing stops at the first newline, returning \000:

m/\o{
	231
	}/x;

In regular expressions

And these constructs apply to regular expression features, and you don't need the /x flag to get this new, insignificant whitespace:

Construct	Description	Chapter
`\b{TYPE}`	Word boundary	Item
`\g{N}`	Numbered backreference	Item 31 (book)
`\g{NAME}`	Named backreference	Item 31 (book)
`\k{NAME}`	Named backreference	Item 31 (book)
`\p{PROPNAME}`	Unicode property name
`\P{PROPNAME}`	Unicode property name
`\x{ABCD}`	Hex code point
`{n,m}`	general quantifier

The rules for these are the same as those from the previous section. Perl ignores the tabs or spaces at the beginning or the end, but not in the middle (aside from around the , in {n,m}). For example, these all work:

use v5.34;
use open qw(:std :utf8);
use warnings;

$_ = 'aa';

my @patterns = (
	qr/(.)\g{ -1 }/,
	qr/(?<first>.)\g{ first }/,
	qr/(?<first>.)\k{ first }/,
	qr/\b{ sb }(.)/,
	qr/(\o{ 141 })\g{ -1 }/,
	qr/(\p{Letter})\g{ -1 }/,
	qr/(.)\g{ -1 }/,
	qr/(\x{ 61 })\g{ -1 }/,
	);

foreach my $pattern ( @patterns ) {
	say /$pattern/
	};

Specify octal numbers with the 0o prefix

Perl v5.34 allows you to specify octal literals with the 0o prefix, as in 0o123_456. This is consistent with the existing constructs to specify hexadecimal literal 0xddddd and binary literal 0bddddd. The builtin oct() function accepts any of these forms.

Previously, you specified octal with just a leading zero:

chmod 0644, $file;
mkdir $dir, 0755;

Now you can do that with an extra character that specifies the base:

chmod 0o644, $file;
mkdir $dir, 0o755;

This makes it consistent with 0b for binary and 0x for hexadecimal. See “Scalar value constructors” in perldata.
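
Here is a small sketch that checks that the spellings agree. It assumes a perl that is v5.34 or later, since both the 0o literal and the 0o string that oct() accepts are new in that release:

```perl
use v5.34;

# Five ways to spell the same permission bits. Each is decimal 420.
my @modes = ( 0644, 0o644, oct('644'), oct('0644'), oct('0o644') );
say join ' ', @modes;

# oct() also understands the hexadecimal and binary prefixes.
say join ' ', oct('0x1A4'), oct('0b110100100');
```

Every number in the output should be 420.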

And, remember that v5.14 added the \o{NNN} notation to specify characters by their octal number. We’re still waiting for octal floating point values (we got the hex version in v5.22), but don’t hold your breath.

Perhaps we’ll get 0d sometime so that all the bases are covered.

Undef a scalar to release its memory

When you store a large string in a scalar, perl allocates the memory to store that string and associates it with the scalar. It uses the same memory even if you assign a much shorter value to the same scalar. Use the functional form of undef to let perl reuse that memory for something else. This is important when you want to reuse the variable or the lifetime of the variable is very long.

Perl v5.28 can delete key-value slices

Perl v5.20 introduced key-value slices that worked on hashes and arrays. You could extract values by their keys or indices as well as assigning to those.

The key-value slice delete is a way to extract the keys and values you want and delete them at the same time. You can destructively
Perl v5.26 now recognizes version control conflict markers

Perl v5.26 can now detect and warn you about version control conflict markers in your code. In prior versions, the compiler would try to interpret those as code and would complain about a syntax error. Your program still fails to compile but you get a better error message. Maybe some future Perl will bifurcate the program, run both versions, and compare the results (don’t hold your breath).

In-place editing gets safer in v5.28

In-place editing gets much safer in v5.28. Before that, in rare circumstances it could lose data. You may have never noticed the problem, and even with all the times I’ve explained it in a Perl class I haven’t really thought about it. This was first reported as early as December 2002, and after we get v5.28 it won’t be a problem anymore.

Beware of the removal of when in Perl v5.28

[Although I haven’t seen an official notice besides a git commit that reverts the changes, by popular outcry these changes won’t be in v5.28. It’s not that they won’t happen but they won’t be in v5.28. People who depend on Perl should stay vigilant. My advice in the first paragraph stands—change is coming and we don’t know what it is yet.]

Perl v5.28 might do away with when—v5.27.7 already has. Don’t upgrade to v5.28 until you know you won’t be affected by this! This change doesn’t follow the normal Perl deprecation or experimental feature policy. If you are using given-when, stop doing that. If you aren’t using it, don’t start. And everyone should consider if a major change like this on such short notice is comfortable for them. It’s not a democracy but you can still let the core developers know which way you want your favorite language to go.

keys in scalar context now returns the number of keys

Starting in v5.26, a hash in scalar context evaluates to the number of keys in the hash. You might have thought that it always did that just like an array (not a list!) in scalar context evaluates to the number of items. But nope—it evaluated to a seemingly useless number called the “hash statistics”. Now it’s fixed to do what most people thought it already did. For what it’s worth, keys (or values) in scalar context already provided the count.
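
A short sketch of the difference, using a made-up %animals hash. The use v5.26 line keeps it from running on an older perl, where the first count would be the hash statistics string instead:

```perl
use v5.26;

my %animals = ( Buster => 'cat', Mimi => 'cat', Nikki => 'dog' );

my $from_hash = %animals;       # the key count in v5.26 and later
my $from_keys = keys %animals;  # the key count in any version

say "hash: $from_hash keys: $from_keys";
```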
