Know the phases of a Perl program’s execution

There are two major phases in the execution of a program run by Perl, which you sometimes see as “compile time” and “run time”, or sometimes now, “compile phase” and “run phase”. In the broadest of strokes, perl compiles code in the compile phase, and when it’s completely done with that, it moves on to the run phase, where it executes the code that it completely compiled.

That’s really too simple of a model, though, even if it is a good starting point. perl can also run code in the compile phase and compile code in the run phase. Along with that, perl has various interstitial phases where you can do more work. Each of these phases and sub-phases has a special subroutine and you can define each of them as many times as you like. Each also executes in a particular order relative to other definitions of the same subroutine. These are documented in perlmod. People sometimes call these scheduled blocks:

Subroutine  Description                                                                     Execution order
BEGIN       Compile its code and run it immediately                                         FIFO
CHECK       Run right after the main compile phase and before the main run phase            LIFO
UNITCHECK   Run right after perl compiles each compilation unit, which is probably a file   LIFO
INIT        Run right before the main run phase starts                                      FIFO
END         Compile the code and run it at program end                                      LIFO

Some of these execute in the order that you define them, or first in, first out (FIFO), while others execute in reverse order of definition, or last in, first out (LIFO). In general, the subroutines that happen right away or at the start of a phase, such as BEGIN and INIT, are FIFO. The ones that happen at the end of a phase, such as CHECK, UNITCHECK, and END, are LIFO.

Looking at it from start to finish, for a single file the subroutines run in this order: BEGIN, UNITCHECK, CHECK, INIT, and, after the main run phase, END.
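
To see the ordering for yourself, here is a minimal sketch, not anything taken from perlmod, that defines every kind of block twice and deliberately out of order. The record helper and the @seen array are names invented for this illustration; each block merely notes its own label:

```perl
use strict;
use warnings;

our @seen;    # hypothetical log of labels, in the order the blocks ran

# Hypothetical helper: remember a label and print it.
sub record { push @seen, $_[0]; print "$_[0]\n" }

record('main run phase');

# Defined in roughly the reverse of the order in which they run.
END       { record('END 1') }
END       { record('END 2') }
INIT      { record('INIT 1') }
INIT      { record('INIT 2') }
CHECK     { record('CHECK 1') }
CHECK     { record('CHECK 2') }
UNITCHECK { record('UNITCHECK 1') }
UNITCHECK { record('UNITCHECK 2') }
BEGIN     { record('BEGIN 1') }
BEGIN     { record('BEGIN 2') }
```

Run as a script, this should print BEGIN 1, BEGIN 2, UNITCHECK 2, UNITCHECK 1, CHECK 2, CHECK 1, INIT 1, INIT 2, main run phase, END 2, and END 1, one per line. The FIFO blocks keep their definition order and the LIFO blocks reverse it.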

Initial designs

So far, perl can’t save the state of its compilation, so you can’t merely compile a Perl program and run it later, perhaps on another machine, like you can do with Java, Python, or Ruby. It’s a glaring omission, but also a technical limitation based on the other features you get from Perl.

The CHECK block gave Perl’s compiler toolkit a place to stop and perhaps examine or save its state after it had compiled everything in its main phase. However, this still wouldn’t include modules that you require (that is, load and compile at run time) or code that you load with a string eval. As such, in general, you can never pre-compile every Perl program, which is almost the same as saying it’s a broken feature.
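
Here is a rough sketch of why CHECK comes too early for that code. It assumes Perl 5.14 or later for the ${^GLOBAL_PHASE} variable, and the @phases array is a name made up for the example. A BEGIN block inside a string eval compiles and runs only when the eval executes, by which point the file’s CHECK blocks have already finished:

```perl
use 5.014;    # ${^GLOBAL_PHASE} needs Perl 5.14 or later
use warnings;

our @phases;  # hypothetical log of what ran in which phase

BEGIN { push @phases, "file BEGIN in ${^GLOBAL_PHASE}" }
CHECK { push @phases, "file CHECK in ${^GLOBAL_PHASE}" }

# perl does not compile this string until the eval executes at run time.
eval q{
    BEGIN { push @phases, "eval BEGIN in ${^GLOBAL_PHASE}" }
    1;
} or die $@;

say for @phases;
```

The file’s own BEGIN should report START and its CHECK should report CHECK, while the BEGIN inside the eval reports RUN, so nothing that stopped at CHECK could have seen that code.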

The UNITCHECK subroutine was designed to do the same thing, but to save the compilation of each file. There’s a little-known feature of Perl, probably deservedly so: the .pmc file, which is a pre-compiled module file. perl will look for the .pmc version before it tries to load the .pm version. That means that perl could save itself the work of compiling that module, which might be substantial. This feature hasn’t worked out so well, just yet.

Inadvertent execution

Since some of these subroutines run at compile time, you might inadvertently run code when you don’t think anything will happen, such as under the -c switch for a syntax check. This program is innocent and doesn’t run any code:

$ perl -c -e 'print qq|Hello compiler!\n|';
-e syntax OK

These one-liners, however, have the subroutines that execute during the compile phase, so each compiles its code and runs it even though it never enters the main run phase:

$ perl -c -e 'BEGIN { print qq|Hello compiler!\n| }';
Hello compiler!
-e syntax OK
$ perl -c -e 'CHECK { print qq|Hello compiler!\n| }';
Hello compiler!
-e syntax OK
$ perl -c -e 'UNITCHECK { print qq|Hello compiler!\n| }';
Hello compiler!
-e syntax OK

This means that someone might be able to provide you a module or library that, maliciously or otherwise, runs code that you might not want to run.

This is not true for the INIT or END blocks, since they belong to the run phase, which perl never enters under -c:

$ perl -c -e 'END { print qq|Hello compiler!\n| }';
-e syntax OK
$ perl -c -e 'INIT { print qq|Hello compiler!\n| }';
-e syntax OK

This also matters for files that you load. Create a small module that has each of these subroutines:

package Phases;
use 5.012;

BEGIN     { say 'Ran BEGIN' }
UNITCHECK { say 'Ran UNITCHECK' }
CHECK     { say 'Ran CHECK' }
INIT      { say 'Ran INIT'  }
END       { say 'Ran END'   }

1;

By merely loading this module, you run some code. This simple module is in the current working directory:

$ perl5.14.1 -c -E 'use Phases;'
Ran BEGIN
Ran UNITCHECK
Ran CHECK
-e syntax OK

This also happens if you load the module with the -M switch:

$ perl5.14.1 -c -MPhases -E '1'
Ran BEGIN
Ran UNITCHECK
Ran CHECK
-e syntax OK

Here you explicitly used the -c switch, but you might also have this problem with your editor if it does syntax checks for you behind the scenes. This means that someone might be able to trick you into running code that you never know about. This hasn’t been a huge problem for Perlers, but it is possible.

Black magic

If you want to change these subroutines dynamically, you can use the Devel::Hook module to insert them through code rather than having to type them out literally. Since these scheduled blocks are really just arrays of code references, you might want to change around those arrays. Here’s the example from the module’s documentation:

use Devel::Hook ();

INIT {
	print "INIT #2\n";
	}

BEGIN {
	Devel::Hook->push_INIT_hook( sub { print "INIT #3 (hook)\n" } );
	Devel::Hook->unshift_INIT_hook( sub { print "INIT #1 (hook)\n" } );
	}

print "RUNTIME\n";

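The INIT block in the source registers itself as soon as perl compiles it. The BEGIN block then pushes one hook behind it and unshifts another in front of it, so the program should print INIT #1 (hook), INIT #2, INIT #3 (hook), and then RUNTIME.

It provides methods to add or remove some of the code references, whether from the front or back of the array. So far, it doesn’t supply a way to splice those internal arrays.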

Things to remember

  • BEGIN blocks compile and run immediately, in FIFO order.
  • UNITCHECK blocks run at the end of the file’s compilation, in LIFO order.
  • CHECK blocks run at the end of the main compilation, in LIFO order.
  • INIT blocks run at the beginning of the run phase, in FIFO order.
  • END blocks run at the end of the run phase, in LIFO order.

4 Comments.

  1. It’d be great to have a tutorial on how these phases are actually *useful* – hint hint :D

  2. I put in the UNITCHECK and it turns out to be less useful than expected, as when compiling a module you’re not just compiling the Perl code you’re also running it, the module isn’t compiled until the final 1; sings. I tried to make use of UNITCHECK to automatically namespace clean and immutablise Moose classes, but you need to do this later than UNITCHECK. I sometimes think Perl needs a UNITEND for this, that both the current unit, and any child units can insert code blocks into.
