Perl v5.20 combines multiple my() statements

Perl v5.20 continues to clean up and optimize its internals. Now perl optimizes a series of lexical variable declarations into a single list declaration.

You can write them out as separate statements:

my( $dog );
my( $cat );
my( $bird );

v5.20 turns that into a single declaration.

my( $dog, $cat, $bird );

That doesn’t sound like a big deal, but it saves several steps, and steps translate to time no matter what they do.

Remember where you’re likely to declare lexical variables—at the start of subroutines. Call the subroutine enough times and those little bits of time add up:

sub foo {
	my( $dog );
	my( $cat );
	my( $bird );

	...
	}

With v5.20, either way you write it you get the faster form:

sub foo {
	my( $dog, $cat, $bird );
	...
	}

If I’m not initializing variables, I declare them in a list anyway, but I tend to declare variables as close to their use as I can get away with, which often means that I don’t list them at the top of the subroutine. Writing short subroutines, however, means I probably have them near the top.


In v5.20, -F implies -a implies -n

Perl was once known for its one-liners in its sysadmin heyday. People would pass around lists of these one-liners, many of which replaced complicated pipelines that glued together various unix utilities to do some impressive system maintenance.

Three switches do much of the work in those one-liners:

  • -a splits the input line on whitespace and puts the result in @F
  • -n adds while( <> ) { ... }
  • -F specifies the pattern that -a should split on

You could write a program to read a line, split on whitespace, and print the elements:

while( <> ) {
	@F = split;
	print "@F\n";
	}

As a one-liner, that program is much shorter:

% perl -a -ne 'print "@F\n"'

If you want to change the split pattern, you use -F. For instance, you have a line with fields separated by colons:

cat:bird:dog:lizard:fox

You use -F to set the colon as the split pattern:

% perl -aF: -ne 'print "@F\n"'

Prior to v5.20, you had to specify each of those switches, even though they implied each other. The -F switch only makes sense with -a, and -a needs something to put data into $_, hence -n. With v5.20, -F alone is enough:

% perl5.20 -F: -le 'print join "=", @F' input.txt

(The -l adds the newline after the print.)

If you don’t need the -F, the -a still implies the -n:

% perl5.20 -ale 'print join "=", @F' input.txt

You can see what a one-liner does by deparsing it (Use B::Deparse to see what perl thinks the code is):

% perl5.20 -MO=Deparse -F: -le 'print join "=", @F' input.txt

The code comments I added myself:

BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {     # from -n
    chomp $_;                            # from -l
    our(@F) = split(/:/, $_, 0);         # @F from -a, pattern from -F
    print join('=', @F);                 # argument to -e
}                                        # from -n
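
The same steps can be written as an ordinary program. This is only a sketch to show which switch contributes which statement; the helper name rejoin_fields is made up for the illustration, and the input is the colon-separated sample line from earlier in this item.

```perl
use strict;
use warnings;

# rejoin_fields is a hypothetical helper name, not something perl generates
sub rejoin_fields {
	my( $line ) = @_;
	chomp $line;                 # what -l does to each input line
	my @F = split /:/, $line;    # what -a does, with the pattern from -F:
	return join '=', @F;         # the expression given to -e
	}

# the sample record from earlier in this item
print rejoin_fields( "cat:bird:dog:lizard:fox\n" ), "\n";
```

Run as written, this should print cat=bird=dog=lizard=fox, which is what the one-liner prints for that input line.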

To make this new feature work reliably, you have to ensure that you are using at least v5.20. If someone tries your shortened one-liner with an earlier version, it won’t work. That’s the danger with using the new features in any version. Specifying the version negates the saved typing, though: declaring a minimum version of v5.12 or later also turns on strict, so you then have to declare @F too.

% perl -M5.20.0 -Mvars=@F -F: -le 'print join "=", @F' input.txt


Perl 5.20 introduces “Key/Value Slices”

Perl v5.20 adds the “Key/Value Slice”, which extracts multiple keys and their corresponding values from a container (hash or array). It puts the % sigil in front of a variable name with subscripts after it, which is newly legal syntax:


use v5.20;  # you don't need this for the new syntax

my %smaller_hash = %big_hash{ @keys };
my %index_hash   = %big_array[ @indices ];

As with the @ sigil, you know the variable type you’re dealing with by the indexing syntax after it. The % does not signify a hash; it denotes that you are getting the index (key) with the value.

Looking at that example, you might mistakenly think that these slices return hashes. They don’t. They return lists which have an index then a value. That’s pairwise like the list representation of a hash, but it can also repeat keys (which hashes can’t do):

use v5.20;  # you don't need this for the new syntax

my %big_hash     = qw( cat Buster dog Addy bird Poppy );

my @array = %big_hash{ qw(cat cat) };

say "@array";

The resulting list duplicates the entries for cat:

cat Buster cat Buster

This new type of slice returns a list, which is just data (i.e. not a variable). That means you can’t use the hash or array operators on the result, which would be a neat trick. You can’t take a reference to the entire result, because that’s the same as taking a reference to a list to get a list of references. The result is not an lvalue, so you can’t assign to it or modify it directly.
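
The index/value form on an array behaves the same way. Here is a minimal sketch, assuming v5.20 or later and some made-up sample data; it stores the returned list in an array first to show that it is a plain list, then in a hash.

```perl
use v5.20;
use warnings;

my @animals = qw( cat dog bird lizard );   # made-up sample data

# index/value slice: each index comes back followed by its element
my @pairs = %animals[ 1, 3 ];
say "@pairs";                               # 1 dog 3 lizard

# assigning the same list to a hash keeps one entry per index
my %by_index = %animals[ 1, 3 ];
for my $index ( sort keys %by_index ) {
	say "$index => $by_index{$index}";
	}
```

The hash only appears because of the assignment; the slice itself hands back the flat list that the first say shows.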

Before v5.20, you could do the same thing with a map:

my @array = map { $_ => $big_hash{$_} } @keys;

That’s not that bad, making this new feature less than compelling. Use it if you need v5.20 for something else, but don’t make this feature the one that forces people to upgrade.
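
The map approach covers the array form too. This sketch, with made-up data, builds the same index-to-value hash that %big_array[ @indices ] would, and it runs on perls older than v5.20:

```perl
use strict;
use warnings;

my @big_array = qw( zero one two three );   # made-up sample data
my @indices   = ( 1, 3 );

# pair each index with its element, as the index/value slice would
my %index_hash = map { $_ => $big_array[$_] } @indices;

for my $index ( sort { $a <=> $b } keys %index_hash ) {
	print "$index => $index_hash{$index}\n";
	}
```

It should print 1 => one and then 3 => three.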


Perl 5.20 optimizes return at the end of a subroutine

Want to save 10 nanoseconds? Perl v5.20 optimizes a return at the end of a subroutine so that it uses two fewer ops. During compilation, a subroutine like this one:


sub some_sub { ...; return $foo }

turns into a subroutine like this one, without the return:

sub some_sub { ...; $foo }

You can see the difference in the output from the B::Concise module (which you can use through the O frontend). Prior to v5.20, there are five steps to return the first argument:

$ perl5.18.0 -MO=Concise,baz,-exec -e 'sub baz { return $_[0] }'
main::baz:
1  <;> nextstate(main 1 -e:1) v
2  <0> pushmark s
3  <$> aelemfast(*_) s
4  <@> return K
5  <1> leavesub[1 ref] K/REFC,1
-e syntax OK

But, prior to v5.20, if you didn’t use the return keyword, there were only three steps because the PUSHMARK and the return aren’t there:

$ perl5.18.0 -MO=Concise,baz,-exec -e 'sub baz { $_[0] }'
main::baz:
1  <;> nextstate(main 1 -e:1) v
2  <$> aelemfast(*_) s
3  <1> leavesub[1 ref] K/REFC,1
-e syntax OK

You can read about PUSHMARK in the perlcall documentation. It’s a signal to perl to remember where the current stack pointer is.

With v5.20, perl optimizes the return version to have the same steps:

$ perl5.20.0 -MO=Concise,baz,-exec -e 'sub baz { return $_[0] }'
main::baz:
1  <;> nextstate(main 1 -e:1) v
2  <$> aelemfast(*_) s
3  <1> leavesub[1 ref] K/REFC,1
-e syntax OK

$ perl5.20.0 -MO=Concise,baz,-exec -e 'sub baz { $_[0] }'
main::baz:
1  <;> nextstate(main 1 -e:1) v
2  <$> aelemfast(*_) s
3  <1> leavesub[1 ref] K/REFC,1
-e syntax OK

This will be happy news to people who stick to Perl Best Practices, which recommends that you always use an explicit return. Damian’s recommendation is to denote intent, but now you don’t suffer if it’s the last statement in the subroutine.


Perl 5.20 uses its own random number generator

Prior to v5.20, perl used whatever random number generator the system provided. This meant that the same program could have statistically different results based on the quality of that function. The rand() for Windows had a max of 32,768 (15 bits), while POSIX has drand48 (48 bits). This sort of numerical un-portability has always been a problem with perl since it’s relied on the underlying libc for so much.


Not any more. It’s all internal to perl now. With v5.20 and beyond, you’ll get the same pseudorandom number generator that everyone else with v5.20 and later gets.

For the effective programmer though, this doesn’t really matter because you shouldn’t be using the pseudorandom number generator for anything important. We call it rand, but it’s not really. We should have called it fake_random, good_enough_random, or i_wont_install_a_module_so_ill_deal_with_it_random. The name in Perl comes from the name in libc (e.g. the GNU libc function list), just like many of the oddly named functions such as abs, chmod, or getgrent. From the name, we get sloppy talking about its output as “random numbers” instead of the correct “pseudorandom numbers”.
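
A quick way to see the “pseudo” part is to seed the generator twice with the same value. This is only a sketch and the seed 20 is arbitrary; it says nothing about the quality of the numbers, only that the same seed replays the same sequence.

```perl
use strict;
use warnings;

# the seed value 20 is arbitrary
srand( 20 );
my @first  = map { rand() } 1 .. 5;

srand( 20 );    # same seed, so the generator restarts the same sequence
my @second = map { rand() } 1 .. 5;

print "@first" eq "@second" ? "same sequence\n" : "different sequences\n";
```

It should report the same sequence both times, which is exactly what a truly random source would not do.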

Sinan Ünür examines how well Perl’s rand does with coin flips and concludes it comes up short (Perl 5.20.0 brings a “better” PRNG to Windows). An older presentation from the Wellington Perl mongers goes through some serious math to talk about better pseudorandom numbers. The documentation for Math::Random::Secure has more interesting details. Several other modules provide rand replacements.

There are ways to get real random numbers. Atmospheric noise, nuclear decay, and other processes are random and their measurement can supply the numbers. The random.org website, for one, can supply these, and the Net::Random module makes the connection for you.

Even though rand still isn’t random, at least everyone can use the same thing without any extra work. I like it any time perl can bring this stuff inside to make it more portable.


Perl 5.20 new features

Perl 5.20 is out and there are some nice syntax changes that make life easier for Perlers, along with some improvements that don’t require any work from you. Some of the features are experimental, so be careful that you don’t create problems by overusing them until they settle down.

You can download the Perl source from CPAN. For Windows, Strawberry Perl 5.20 is available now.



Experimental features now warn (reaching back to v5.10)

Perl 5.18 provides a new way to introduce experimental features in a program, augmenting the feature pragma that v5.10 added. This change marks certain broken v5.10 features as experimental with an eye toward possible removal from the language.

Smart matching in v5.10 led to several broken and conflated features. The given used a lexical version of $_, which broke many other common uses of that variable inside the given, which I explain in Use for() instead of given() and you can see in given/when and lexical $_ ….

Under v5.18, when you use given, when, or ~~, you get a warning, even if there is no smart match involved:

# given_warning.pl
use v5.10; # earliest occurrence of feature
for( 'Buster' ) {
	when( 1 == 1 ) { say "Hello" }
	}

These warnings might cause test suites to fail when people try to install modules on the new perl, as they do for Unicode::Tussle.

% perl5.10.1 given_warning.pl
Hello
% perl5.18.0 given_warning.pl
when is experimental at given_warning.pl line 4.
Hello

Using the diagnostics shows the sort of warning it is:

% perl5.18.0 -Mdiagnostics given_warning.pl
when is experimental at -e line 1 (#1)
    (S experimental::smartmatch) when depends on smartmatch, which is
    experimental.  Additionally, it has several special cases that may
    not be immediately obvious, and their behavior may change or
    even be removed in any future release of perl.
    See the explanation under "Experimental Details on given and when"
    in perlsyn.

Hello

To get rid of this warning, you do the same thing you do with other warnings. Take the category of the warning and turn it off with no (Item 100: Use lexical warnings to selectively turn on or off complaints):

# given_warning.pl
use v5.10; # earliest occurrence of feature
no warnings 'experimental::smartmatch';
for( 'Buster' ) {
	when( 1 == 1 ) { say "Hello" }
	}

The lexical $_ is another broken feature that’s now marked as experimental.

# lexical_.pl
use v5.10;

sub cat { my $_ }

Any use in v5.18 gives a warning:

% perl5.18.0 lexical_.pl
Use of my $_ is experimental at lexical_.pl line 4.

The category is different:

% perl5.18.0 -Mdiagnostics lexical_.pl
Use of my $_ is experimental at lexical_.pl line 4 (#1)
    (S experimental::lexical_topic) Lexical $_ is an experimental
    feature and its behavior may change or even be removed in any
    future release of perl. See the explanation under "$_" in perlvar.

That takes care of the two retro features. Perl v5.18 introduces two new experimental features, set logic in character classes (for complete Unicode Level 1 regular expression compliance), and lexical subroutines, which I’ll cover in other items.

# regex.pl
use v5.18;

print "Match" if 'foo' =~ /(?[ \p{Thai} & \p{Digit} ])/;

Without turning off the warning, perl knows about the feature and points it out:

% perl5.18.0 regex.pl
The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE  \p{Thai} & \p{Digit} ])/ at regex.pl line 4.

In this case, diagnostics is not any help:

% perl5.18.0 -Mdiagnostics regex.pl
The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[
        <-- HERE  \p{Thai} & \p{Digit} ])/ at regex.pl line 3 (#1)
The regex_sets feature is experimental in regex; marked by <-- HERE in m/(?[ <-- HERE  \p{Thai} & \p{Digit} ])/ at regex.pl line 3.

For lexical named subroutines, you have to explicitly enable the feature, and then you also have to explicitly turn off its warnings.

# lexical_sub.pl
use v5.18;
no warnings 'experimental::lexical_subs';
use feature "lexical_subs";

my sub foo { say "Hello" }

Handling older perls

In v5.18, that's all fine and good, but older versions don't understand those warnings categories and will stop your program.

% perl5.10.1 -e 'no warnings qw(experimental::smartmatch)'
Unknown warnings category 'experimental::smartmatch' at -e line 1
BEGIN failed--compilation aborted at -e line 1.

Instead of using warnings, you can use the non-core experimental module that handles that for you:

use experimental qw(smartmatch);

For versions without that warning category, nothing happens. For versions with that feature, it turns off the warning.

Summary

This table summarizes the new experimental warnings categories and the features they affect.

Category                       Features
experimental::smartmatch       given, when, ~~
experimental::lexical_topic    my $_
experimental::regex_sets       (?[ ])
experimental::lexical_subs     my sub NAME {}, our sub NAME {}

Things to remember

  • Some v5.10 features now warn under v5.18
  • Some new experimental features must be explicitly enabled
  • Even explicitly enabled features still warn
  • The experimental module is version safe


Perl 5.18 new features

Perl 5.18 is out and there are some major changes that you should know about before you upgrade. Most notably, some features from v5.10 are now marked experimental. If you use those features, you get warnings.

You can download the Perl source from CPAN. For Windows, Strawberry Perl 5.18 is available now.



The vertical tab is part of \s in Perl 5.18

Before v5.18, the vertical tab wasn’t part of the \s character class shortcut for ASCII whitespace. No one really knows why. It was curious trivia that I pointed out in Know your character classes under different semantics. Whitespace in ASCII, POSIX, and Unicode represented different sets. Perl whitespace was different from POSIX whitespace by only the exclusion of the vertical tab. Now that little oversight is fixed.

I had this program to mark which sets matched which characters. I required v5.10 because that’s the first appearance of the \h and \v shortcuts for horizontal and vertical whitespace.

use 5.010;

use charnames qw(:full);

print <<"LEGEND";
s   matches \\s, matches Perl whitespace
h   matches \\h, horizontal whitespace
v   matches \\v, vertical whitespace
p   matches [[:space:]], POSIX whitespace
all characters match Unicode whitespace, \\p{Space}

LEGEND

printf qq(%s %s %s %s  %-7s --> %s\n),
	qw( s h v p  Ordinal  Name );
print '-' x 50, "\n";

foreach my $ord ( 0 .. 0x10ffff ) {
	next unless chr($ord) =~ /\p{Space}/;
	my( $s, $h, $v, $posix ) =
		map { chr($ord) =~ m/$_/ ? 'x' : ' ' }
			( qr/\s/, qr/\h/, qr/\v/, qr/[[:space:]]/ );
	printf qq(%s %s %s %s  0x%04X  --> %s\n),
		$s, $h, $v, $posix,
		$ord, charnames::viacode($ord);
	}

Under v5.10, the top of the output showed that \s did not include the vertical tab, which the UCS names LINE TABULATION.

$ perl5.10.1 spaces
s   matches \s, matches Perl whitespace
h   matches \h, horizontal whitespace
v   matches \v, vertical whitespace
p   matches [[:space:]], POSIX whitespace
all characters match Unicode whitespace, \p{Space}

s h v p  Ordinal --> Name
--------------------------------------------------
x x   x  0x0009  --> CHARACTER TABULATION
x   x x  0x000A  --> LINE FEED
    x x  0x000B  --> LINE TABULATION
x   x x  0x000C  --> FORM FEED
x   x x  0x000D  --> CARRIAGE RETURN
x x   x  0x0020  --> SPACE

Run under v5.18, the output changes slightly to have another x in the third row.

$ perl5.18.0 spaces
s   matches \s, matches Perl whitespace
h   matches \h, horizontal whitespace
v   matches \v, vertical whitespace
p   matches [[:space:]], POSIX whitespace
all characters match Unicode whitespace, \p{Space}

s h v p  Ordinal --> Name
--------------------------------------------------
x x   x  0x0009  --> CHARACTER TABULATION
x   x x  0x000A  --> LINE FEED
x   x x  0x000B  --> LINE TABULATION
x   x x  0x000C  --> FORM FEED
x   x x  0x000D  --> CARRIAGE RETURN
x x   x  0x0020  --> SPACE

I don’t foresee this breaking anything since the vertical tab seems to be a rare character, although in ETL I liked using it as a separator since I figured no one else would be using it.
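
If you do use the vertical tab as a separator, you can avoid depending on what \s means in a particular perl. This is a sketch with a made-up record: it names the separator explicitly and, for splitting on whitespace, spells out the pre-v5.18 ASCII members of \s as a character class.

```perl
use strict;
use warnings;

# made-up record that uses the vertical tab (0x0B) as a field separator
my $record = "alpha beta\x0Bgamma delta";

# naming the separator explicitly doesn't depend on what \s means
my @fields = split /\x0B/, $record;
print scalar( @fields ), " fields\n";      # 2 fields

# the pre-v5.18 ASCII meaning of \s, spelled out as a character class
my @words = split /[ \t\n\r\f]+/, $record;
print scalar( @words ), " words\n";        # 3 words
```

Both splits give the same answer on any version. Going by the change described in this item, a plain split /\s+/ on that record would produce three pieces before v5.18 and four from v5.18 on.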


Effective Perler discounts during OSCON

I’ll be at OSCON on Tuesday, July 17, but you don’t have to find me to get up to 37% off Effective Perl Programming. That’s a slightly lower price than Amazon. To get that discount, you have to buy the book at Pearson’s booth in the exhibition hall. You’ll need to track me down on Tuesday afternoon or evening if you want me to sign your book.

If you can’t make it to OSCON, you can still get 35% off the cover price by ordering directly from the InformIT discount link or using the OSCON2012 discount code when you check out. Instead of navigating their site, you can go directly to our book.

If you’re not sure you want the book, you can look at a free sample chapter, which is also 35% off during OSCON.
