Respect the global state of the flip-flop operator

Perl’s flip-flop operator, .., (otherwise known as the range operator in scalar context) is a simple way to choose a window on some data. It returns false until its lefthand side is true. Once the lefthand side is true, the flip-flop operator returns true until its righthand side is true. It still returns true for the item that makes the righthand side true, and after that it goes back to returning false. That is, the lefthand side turns it on and the righthand side turns it off.
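Here is a minimal sketch of that on/off behavior using an in-memory list instead of a file. The window subroutine and the sample strings are made up for illustration; only the .. operator itself comes from Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: collect the items that the flip-flop lets through.
sub window {
	my @lines = @_;
	my @shown;

	foreach my $line ( @lines ) {
		# The left match turns the operator on, the right match turns it off.
		# The item that turns it off is still included.
		push @shown, $line if $line =~ /START/ .. $line =~ /END/;
		}

	return @shown;
	}

my @shown = window( 'before', 'START', 'inside', 'END', 'after' );
print "$_\n" foreach @shown;
```

With this input, the sketch should print START, inside, and END, skipping the items on either side of the window.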

Start with a simple file that has START and END markers:

# input.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this
Or this

You need to extract the lines between those two markers:

# flip-flop
while( <> ) {
	print if /START/ .. /END/; # each line already ends with a newline
	}

The output shows just the stuff between those markers, including the markers themselves:

% perl flip-flop input.txt
START
Show this
And this
Also this
END

What if you make the file a bit more complicated so there is an extra matching window? Once the flip-flop operator goes back to false, it can turn to true once its lefthand side matches again. Here’s a file with two windows:

# input2.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this
Or this
START
Show this again
And this again
Also this again
END
But ignore this

Now you get both windows of output:

% perl flip-flop input2.txt
START
Show this
And this
Also this
END
START
Show this again
And this again
Also this again
END

That’s fine, but it gets a bit more complicated when you use the same flip-flop more than once without knowing its state. Modify the flip-flop program so it goes through each file separately instead of combining all the files into the ARGV filehandle:

foreach my $file ( @ARGV ) {
	open my $fh, '<', $file or die "Could not find $file\n";
	while( <$fh> ) {
		print if /START/ .. /END/; # each line already ends with a newline
		}
	}

To watch it work (or not work, coming later), split the input2.txt file into two separate files, each of which has its own window you want to extract:

# input2a.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this
# input2b.txt
Or this
START
Show this again
And this again
Also this again
END
But ignore this

The output isn’t surprising and it looks the same as it did with the previous program:

% perl flip-flop input2a.txt input2b.txt
START
Show this
And this
Also this
END
START
Show this again
And this again
Also this again
END

However, it’s at this point that some people get confused. The flip-flop operator doesn’t care about which file you are looking at, what happened in the last file, and so on. To see it “break”, change input2a.txt so it doesn’t have the END marker:

# input2a.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
Don't show this

Since input2a.txt doesn’t complete the window as you intended, the flip-flop, maintaining its state, is still true when it starts the second file:

% perl flip-flop input2a.txt input2b.txt
START
Show this
And this
Also this
Don't show this
# input2b.txt
Or this
START
Show this again
And this again
Also this again
END

The flip-flop maintains its global state. It doesn’t care about starting new loops, new iterations, or anything else. You might think that you could hide that in a subroutine, but it’s not even safe there. Every flip-flop operator that perl compiles has its own state, and perl compiles a subroutine only once:

foreach my $file ( @ARGV ) {
	open my $fh, '<', $file or die "Could not find $file\n";
	extract( $fh );
	}

sub extract {
	my( $fh ) = shift;
	
	while( <$fh> ) {
		print if /START/ .. /END/; # this is the same .. on every call
		}
	}	

The output doesn’t change! The flip-flop doesn’t care that it’s in a subroutine. Every call runs the same compiled operator, so it’s the same flip-flop as before.
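You can watch the state leak between calls without any files at all. This is a sketch with a hypothetical extract that takes a list of strings rather than a filehandle; the data is invented for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical list-based version of extract().
sub extract {
	my @lines = @_;
	my @shown;

	foreach ( @lines ) {
		push @shown, $_ if /START/ .. /END/; # one operator, one state
		}

	return @shown;
	}

# The first call opens a window and never closes it.
my @first  = extract( 'START', 'a' );

# The second call starts with the flip-flop still turned on.
my @second = extract( 'b', 'START', 'c', 'END', 'd' );

print "first:  @first\n";
print "second: @second\n";
```

The second call should include b, which comes before its own START marker, because the operator was still on from the first call.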

So, if every flip-flop operator that perl compiles has its own state, and you want a flip-flop operator with a new state, you just need to compile a new flip-flop for each iteration. That’s simple enough, kinda. This program won’t work because make_extractor returns a reference to the same subroutine each time. The anonymous subroutine doesn’t depend on anything outside itself, so perl compiles it once and reuses it:

foreach my $file ( @ARGV ) {
	open my $fh, '<', $file or die "Could not find $file\n";
	make_extractor()->($fh);
	}

sub make_extractor {
	sub { # only compiled once
		my( $fh ) = shift;
	
		while( <$fh> ) {
			print if /START/ .. /END/;
			}
		};
	}	

You can verify this by dumping the return value of make_extractor:

# dump-subs.pl
use Devel::Peek;

my @subs = map { make_extractor() } 1 .. 3;

print Dump( $_ ) foreach @subs;

sub make_extractor {
	sub { # only compiled once
		my( $fh ) = shift;
	
		while( <$fh> ) {
			print if /START/ .. /END/;
			}
		};
	}	

You get the same subroutine each time, which means you get the same flip-flop each time:

% perl dump-subs.pl
SV = RV(0x80f66c) at 0x80f660
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) at ...
SV = RV(0x80f6fc) at 0x80f6f0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) ...
SV = RV(0x8030bc) at 0x8030b0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) ...
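If you don’t want to read Devel::Peek output, comparing reference addresses is a lighter check. This sketch assumes Scalar::Util’s refaddr, which comes with perl; make_plain and distinct_subs are hypothetical names:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Same shape as make_extractor above: the inner subroutine uses no
# lexical variables from the enclosing scope.
sub make_plain {
	return sub {
		my @shown;
		foreach ( @_ ) {
			push @shown, $_ if /START/ .. /END/;
			}
		return @shown;
		};
	}

# Count how many different subroutines the references point to.
sub distinct_subs {
	my %seen;
	$seen{ refaddr( $_ ) }++ foreach @_;
	return scalar keys %seen;
	}

my @subs = map { make_plain() } 1 .. 3;
printf "%d references, %d distinct subroutine(s)\n",
	scalar @subs, distinct_subs( @subs );
```

On a perl that behaves like the Devel::Peek dump above, this should report three references to a single subroutine.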

You have to make each subroutine different somehow. The trick is to use a closure, which is a subroutine that references a lexical variable from an enclosing scope and keeps using it after that scope has finished. In this case, you can enlist a state variable to keep track of how many flip-flop operators you make, and since each new anonymous subroutine needs to capture $count, perl can’t reuse previous definitions. You force it to make a new subroutine:

# flip-flop
use 5.010;

foreach my $file ( @ARGV ) {
	open my $fh, '<', $file or die "Could not find $file\n";
	make_extractor()->($fh);
	}

sub make_extractor {
	state $count = 0;
	$count++;

	sub {
		my( $fh ) = shift;
	
		while( <$fh> ) {
			print "$count: $_" if /START/ .. /END/;
			}
		};
	}	

Now each file gets its own flip-flop. You can see where the first file ends (and is missing its END marker) and where the second file begins:

% perl flip-flop input2a.txt input2b.txt
1: START
1: Show this
1: And this
1: Also this
1: Don't show this
2: START
2: Show this again
2: And this again
2: Also this again
2: END
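The same idea works on in-memory data, which makes the independent state easy to check. This is a sketch rather than the program above: it captures an ordinary my variable instead of a state counter, and the labels and sample strings are invented:

```perl
use strict;
use warnings;

# Hypothetical list-based extractor factory.
sub make_extractor {
	my $label = shift; # captured below, so each call makes a new closure

	return sub {
		my @shown;
		foreach ( @_ ) {
			push @shown, "$label: $_" if /START/ .. /END/;
			}
		return @shown;
		};
	}

# The first extractor never sees an END marker.
my @first  = make_extractor( 1 )->( 'START', 'a' );

# The second extractor has its own flip-flop, which starts out off.
my @second = make_extractor( 2 )->( 'b', 'START', 'c', 'END', 'd' );

print "$_\n" foreach @first, @second;
```

Here the second extractor should skip b, since its flip-flop knows nothing about the window that the first one left open.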

For more information about flip-flops, see perlop’s entry for Range Operators.

Things to remember

  • Every flip-flop maintains a global state
  • Flip-flops are not scoped
  • Create a new flip-flop by wrapping it in a closure

4 Comments.

  1. The glob operator (spelled variously glob("*.c") or <*.c>) is another place where perl stores “hidden” global state in the operator when it’s compiled. At http://stackoverflow.com/questions/2633447/why-doesnt-perl-file-glob-work-outside-of-a-loop-in-scalar-context/2634012#2634012 I used a similar trick of creating closures to encapsulate the state of a glob operator, so that it can be used outside of a single while loop. This is slightly useless (since you can just glob to a list instead) but at the same time it’s pretty cool.

  2. Hi Hobbs, I visited the link you provided (in fact I voted up your comment!). There you have given two examples, the second one uses a closure, to refer to a lexical variable, but the first one just returns an anonymous subroutine, so that should have the same problem as the flip-flop operator here does with anonymous subroutines.

  3. Would you please explain how “Flip-flops are not scoped”?
    Does that mean you can’t do:

    if ( $line =~ /START/ .. /END/ ) {
      ....
    }
    
    • In short, it means each range you create maintains its global state and side effects. These don’t disappear when the scope you are in finishes, like a my variable would, for instance.
