Make grep-like syntax – The Effective Perler

To create grep- or map-like syntax, you need to use Perl’s prototypes, despite whatever we told you in Understand why you probably don’t need prototypes. Perl needs the special hints that prototypes provide to parse a block as an argument to a subroutine.

First, remember the forms of grep. There’s a single expression version and a block version:

grep EXPR, @input       # with a comma
grep { ... } @input     # no comma

That block, the {...}, is an inline subroutine where the current element shows up in $_:

my @odds = grep { $_ % 2 } @input;

For either form, there’s a scalar and list return value, depending on context (Item 12. Understand context and how it affects operations):

my @array = grep ...;
my $count = grep ...;
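
As a quick illustration of those two return values, here is a small sketch. The contents of @input are made up for the example and are not from the original listing:

```perl
use v5.10;
use strict;
use warnings;

my @input = ( 1 .. 10 );               # sample data, an assumption for this sketch

my @odds  = grep { $_ % 2 } @input;    # list context: the matching elements
my $count = grep { $_ % 2 } @input;    # scalar context: how many elements matched

say "odds  = @odds";                   # odds  = 1 3 5 7 9
say "count = $count";                  # count = 5
```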

You can make your own subroutines that work just like grep. The prototype character & tells perl to expect a subroutine reference. However, the bare-block form works only when & is the first character in the prototype; in any later position the caller has to supply sub { ... } or \&name. To try it, define a subroutine that takes a single argument, a code reference:

sub run_it (&) {
	my $sub = shift;
	$sub->();
	}

You can call that subroutine in several ways. You can use a bare block, an anonymous subroutine with the sub keyword, or a reference to a named subroutine:

use v5.10;

sub named { say "I have a name" }

my $result = run_it { say "Hello!" };
   $result = run_it sub { say "I have a keyword!" };
   $result = run_it \&named;

However, perl is not smart enough to recognize other forms. It won’t like a scalar variable that might have a code reference later, and it can’t take a bareword that is the name of a defined subroutine (like sort will). These are compile-time errors:

use v5.10;

sub named { say "I have a name" }
my $code_ref = \&named;

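my $result = run_it $code_ref;   # error: a scalar variable, even one holding a code ref
   $result = run_it named;       # error: a bareword subroutine name
   $result = run_it &named;      # error: a call to named, not a reference to it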
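
If you already have the code reference in a scalar, there are ways around the restriction. The following is a minimal sketch of two of them, not the only options: wrap the variable in a block, or call the subroutine with a leading & and parentheses, which makes perl ignore the prototype. Here named returns a string instead of printing so the results are easy to compare:

```perl
use v5.10;
use strict;
use warnings;

sub run_it (&) {
    my $sub = shift;
    return $sub->();
}

sub named { return "I have a name" }
my $code_ref = \&named;

# Wrap the variable in a block so the parser sees the form it expects.
my $from_block = run_it { $code_ref->() };

# A leading & with parentheses bypasses prototype checking altogether.
my $from_ampersand = &run_it( $code_ref );

say $from_block;
say $from_ampersand;
```

The second form gives up everything the prototype was doing for you, so it only makes sense when you know the argument really is a code reference.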

Handling grep’s second argument

Now you can take a code reference as an argument. The next part of the grep syntax is the input. You could use the @ character to denote a list of arguments (not an array argument), and that appears to work:

use v5.10;
use warnings;

sub do_with_array (&@) { 
	my( $sub, @args ) = @_;
	my @output;
	
	foreach my $elem ( @args ) {
		local $_ = $elem;
		push @output, $sub->();
		}
	return @output;
}

sub other_cats { qw(Ellie Ginger) }
 
my @cats = qw(Buster Mimi Roscoe);

@result = do_with_array { say $_ } @cats;
@result = do_with_array { say $_ } qw(Buster Mimi Roscoe);
@result = do_with_array { say $_ } 1 .. 10;
@result = do_with_array { say $_ } other_cats();

Like grep, which aliases $_ to the original data, your subroutine argument can change the original data if you loop over @_ directly instead of copying it (Item 114. Know when arrays are modified in a loop):

use v5.10;
use warnings;

sub do_with_array (&@) {
   my $sub = shift;
   my @output;
   local $_;

   foreach ( @_ ) {
      push @output, $sub->();
   }
   return @output;
}

my @original   = qw(1 2 3);
my @new        = do_with_array { $_ += 2 } @original;
say "new = @new";             # 3 4 5
say "original = @original";  # 3 4 5
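
To make the difference between the two versions concrete, this sketch puts them side by side. The names apply_to_copies and apply_to_aliases are hypothetical, chosen only so both variants can live in one file:

```perl
use v5.10;
use strict;
use warnings;

# Copies the arguments first, so the caller's data is safe.
sub apply_to_copies (&@) {
    my( $sub, @args ) = @_;
    my @output;
    foreach my $elem ( @args ) {
        local $_ = $elem;
        push @output, $sub->();
    }
    return @output;
}

# Loops over @_ directly, so $_ is an alias to the caller's data.
sub apply_to_aliases (&@) {
    my $sub = shift;
    my @output;
    foreach ( @_ ) {
        push @output, $sub->();
    }
    return @output;
}

my @first  = ( 1, 2, 3 );
my @second = ( 1, 2, 3 );

my @from_copies  = apply_to_copies  { $_ += 2 } @first;
my @from_aliases = apply_to_aliases { $_ += 2 } @second;

say "first  = @first";     # first  = 1 2 3 (untouched)
say "second = @second";    # second = 3 4 5 (modified in place)
```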

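You might try the \@ prototype, but that limits you in other ways. Now perl expects an argument that starts with @, which in practice means a named array (an explicit dereference such as @$ref also qualifies), and your subroutine receives a reference to it. You cannot pass an array reference directly, a range, a literal list, or the return values from a subroutine call. It’s an array or a compile-time error. That’s no good.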
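
Here is a minimal sketch of what the \@ version looks like, assuming a hypothetical name do_with_named_array. The calls that perl rejects at compile time are left commented out so the file still runs:

```perl
use v5.10;
use strict;
use warnings;

# With \@ in the prototype, the subroutine receives an array reference.
sub do_with_named_array (&\@) {
    my( $sub, $array ) = @_;
    my @output;
    foreach my $elem ( @$array ) {
        local $_ = $elem;
        push @output, $sub->();
    }
    return @output;
}

my @cats     = qw(Buster Mimi Roscoe);
my $cats_ref = \@cats;

my @from_named = do_with_named_array { uc $_ } @cats;        # named array
my @from_deref = do_with_named_array { uc $_ } @$cats_ref;   # argument starts with @

# These do not compile with the \@ prototype:
# do_with_named_array { uc $_ } qw(Buster Mimi Roscoe);
# do_with_named_array { uc $_ } 1 .. 10;
# do_with_named_array { uc $_ } $cats_ref;

say "@from_named";    # BUSTER MIMI ROSCOE
```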

Likewise, you might use the + prototype introduced in Perl 5.14. This allows you to use an array or an array reference argument. Perl doesn’t complain if you use a range or a subroutine call, but it also doesn’t do the right thing:

use v5.10;
use warnings;

sub do_with_array (&+) { 
	my( $sub, $array ) = @_;
	my @output;
	
	foreach my $elem ( @$array ) {
		local $_ = $elem;
		push @output, $sub->();
		}
	return @output;
}

sub other_cats { qw(Ellie Ginger) }

my @cats = qw(Buster Mimi Roscoe);

@result = do_with_array { say $_ } @cats;
@result = do_with_array { say $_ } [ 'a' .. 'g' ];
@result = do_with_array { say $_ } 1 .. 10;
@result = do_with_array { say $_ } other_cats();

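The named array and the array reference work just fine. The other two calls misbehave because + forces scalar context on any argument that is not a literal array or hash. The range turns into the scalar flip-flop operator, which compares its constant operands to the input line number in $. and triggers the warning. The subroutine call does happen, but in scalar context it returns a single non-reference value, so dereferencing it as an array yields nothing (strict is not enabled here):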

Buster
Mimi
Roscoe
a
b
c
d
e
f
g
Use of uninitialized value $. in range (or flip) at run_it.pl line 21.
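
One way to see what happens to the last two calls is to ask the argument which context it runs in. This sketch relies on the perlsub description of +, which imposes scalar context on anything that is not a literal array or hash. The names context_of, with_plus, and with_at are hypothetical:

```perl
use v5.14;     # the + prototype needs Perl 5.14 or later
use warnings;

# Report the context this subroutine was called in.
sub context_of {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

sub with_plus (+) { return $_[0] }   # hand back the first argument as received
sub with_at   (@) { return $_[0] }

my @cats = qw(Buster Mimi Roscoe);

my $plus_call  = with_plus( context_of() );   # the argument runs in scalar context
my $at_call    = with_at( context_of() );     # the argument runs in list context
my $plus_array = ref with_plus( @cats );      # a literal array arrives as a reference

say "+ with a call:  $plus_call";     # scalar
say "@ with a call:  $at_call";       # list
say "+ with \@cats:  $plus_array";    # ARRAY
```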

Not only that, but the + prototype character also allows named hashes and hash references:

use v5.10;
use warnings;

sub do_with_array (&+) { 
	my( $sub, $array ) = @_;
	my @output;
	
	foreach my $elem ( @$array ) {
		local $_ = $elem;
		push @output, $sub->();
		}
	return @output;
}

my %cats = qw(Buster Mimi Roscoe Ellie);

@result = do_with_array { say $_ } %cats;
@result = do_with_array { say $_ } { 'a' => 'b' };

You don’t get an error until runtime when you try the array dereference:

Not an ARRAY reference at run_it.pl line 8.

You can check these things at runtime, though. This reminds you, as we said in Understand why you probably don’t need prototypes, that prototypes probably don’t do what you think. You end up doing a lot of the work that most people think prototypes do for you:

use v5.10;
use warnings;
use Carp;

sub do_with_array (&+) { 
	my( $sub, $array ) = @_;
	croak "do_with_array takes an array argument"
		unless ref $array eq ref [];

	my @output;
	
	foreach my $elem ( @$array ) {
		local $_ = $elem;
		push @output, $sub->();
		}

	return @output;
}

my @cats = qw(Buster Mimi Roscoe Ellie);
my %cats = map { $_, 1 } @cats;

@result = do_with_array { say $_ } @cats;
@result = do_with_array { say $_ } %cats;
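
To watch the check fire without ending the program, you can wrap the call in an eval block. This is a sketch built on the subroutine above. The reference test is spelled ref($array) eq 'ARRAY', which is equivalent to the original comparison, and the hash contents are made up:

```perl
use v5.14;
use warnings;
use Carp;

sub do_with_array (&+) {
    my( $sub, $array ) = @_;
    croak "do_with_array takes an array argument"
        unless ref($array) eq 'ARRAY';

    my @output;
    foreach my $elem ( @$array ) {
        local $_ = $elem;
        push @output, $sub->();
    }
    return @output;
}

my @cats = qw(Buster Mimi Roscoe Ellie);
my %cats = ( Buster => 1, Mimi => 1 );

# The named array passes the check.
my @lengths = do_with_array { length $_ } @cats;

# The hash compiles under +, but the check croaks at runtime; eval catches it.
my @from_hash = eval { do_with_array { $_ } %cats };
my $error     = $@;

say "lengths = @lengths";    # lengths = 6 4 6 5
print "error   = $error" if $error;
```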

Handling context

We explained context in Item 12. Understand context and how it affects operations, and if you want to emulate grep you have to handle each one. In list context grep returns a list, in scalar context it returns a count, and in void context its return value is discarded. You really only need to handle the void case yourself. Here, you simply return without doing anything, which also skips any side effects in the block:

use v5.10;
use warnings;
use Carp;

sub do_with_array (&+) {
	return unless defined wantarray;
	my( $sub, $array ) = @_;
	croak "do_with_array takes an array argument"
		unless ref $array eq ref [];

	my @output;
	
	foreach my $elem ( @$array ) {
		local $_ = $elem;
		push @output, $sub->();
		}

	return @output;
}
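
The following sketch calls that subroutine in all three contexts and counts how often the block runs. The $calls counter is an addition for the demonstration, and the croak check is left out to keep the example short:

```perl
use v5.14;
use warnings;

sub do_with_array (&+) {
    return unless defined wantarray;     # void context: skip all the work
    my( $sub, $array ) = @_;
    my @output;
    foreach my $elem ( @$array ) {
        local $_ = $elem;
        push @output, $sub->();
    }
    return @output;
}

my @cats  = qw(Buster Mimi Roscoe);
my $calls = 0;

do_with_array { $calls++ } @cats;                   # void context
my $after_void = $calls;                            # the block never ran

my $count = do_with_array { $calls++; $_ } @cats;   # scalar context: a count
my @list  = do_with_array { $calls++; $_ } @cats;   # list context: the elements

say "after void: $after_void";    # after void: 0
say "count:      $count";         # count:      3
say "list:       @list";          # list:       Buster Mimi Roscoe
```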

The list and scalar contexts come from returning an array. When you return a named array, you get the same results as assigning an array. In list context you get the list elements, and in scalar context you get the count. If you want to return something different, such as a list instead of a named array, you have to do more work:

	return wantarray ? qw( a b c ) : 3;
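
As a sketch of why the extra work matters, compare three hypothetical subroutines that all produce the same three letters. Only the way they return them differs:

```perl
use v5.10;
use strict;
use warnings;

sub letters_from_list  { return qw( a b c ) }                     # literal list
sub letters_from_array { my @l = qw( a b c ); return @l }         # named array
sub letters_checked    { return wantarray ? qw( a b c ) : 3 }     # explicit check

my $from_list    = letters_from_list();     # last element of the list
my $from_array   = letters_from_array();    # number of elements
my $from_checked = letters_checked();       # the value you chose for scalar context
my @all          = letters_checked();       # the full list

say "list in scalar context:  $from_list";      # c
say "array in scalar context: $from_array";     # 3
say "wantarray version:       $from_checked";   # 3
say "wantarray in list:       @all";            # a b c
```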

So, once again, prototypes half solve the problem, but leave you with more work to do.

Things to remember

Use the & prototype character to specify a code reference argument
If the code reference argument is the first argument, you can leave off the sub keyword
The reference has to be a block of code or a reference to a named subroutine, and specifically not a scalar variable