Understand the Test Anything Protocol

The Test Anything Protocol, or just TAP, is the formalization of Perl 5’s test structure from the Test::Harness module. Either Andreas König or Tim Bunce created the module, but they don’t remember which of them did what, or when. The Changes file for Test-Harness starts in earnest in 2006, around the time people started working on the next generation of Perl’s testing backend, although the module existed for several years before that. Now TAP is semi-formalized (an IETF RFC is in the works) and has its own website at testanything.org.

A little history

In the beginning, there was Larry Wall’s TEST program that came with Perl 1.0. It was a simple program that ran Perl test scripts and inspected the output, looking for various markers to tell if things worked. If it saw an ok at the beginning of a line, it figured that the test had passed.

Perl 5 changed things. It introduced the idea of a module, and by extension it made Perl distributions much more important. These needed tests too, and eventually either Andreas or Tim created the Test::Harness module to interpret the output of the test scripts. People would create the output by hand, so in old Perl distributions you’ll see test scripts that handle their own output. You’ll often see this code in test.pl:

BEGIN { $test = 1; $| = 1; print "1..3\n"; }
END {print "not ok 1\n" unless $loaded;}
$loaded = 1;

print "ok ", $test++, "\n" if ...;

print "not " if ...;
print "ok ", $test++, "\n";

Many people, including me, cribbed that code from other distributions when they started. Business-ISBN-1.56 was my first distribution with tests. This script handles all of the output details itself, and if everything worked, you’ll get this output:

$ make test
PERL_DL_NONLAZY=1 /Users/brian/bin/perl "-Iblib/lib" "-Iblib/arch" test.pl
1..14
ok 1
ok 2
ok 3
ok 4
...
ok 14

If something didn’t work, one of those lines would start with not. It’s a simple format.

This is the basis of TAP, although it wasn’t called that at the time. There’s a line like 1..14 that tells the test suite to expect 14 tests. That’s the plan, and it comes at the top, before the rest of the output. Your test script might die in the middle, never getting to test 14, but also never outputting any lines that start with not ok. Your test script should only pass if it runs all of the tests. This feature is enhanced in later TAP versions.
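To see that protection concretely, here’s a toy sketch of my own (not from any distribution) of a script that plans five tests but stops after the second. Building the TAP in a variable just makes the example easy to inspect; a real test script prints as it goes:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $tap = "1..5\n";            # the plan: expect 5 tests
$tap .= "ok $_\n" for 1 .. 2;  # only tests 1 and 2 ever run
# ... imagine a die() right here, before tests 3 through 5 ...
print $tap;
```

Even though no line starts with not ok, the harness sees only two of the five planned tests and fails the run.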

There needs to be something to interpret this output though, and that’s where Test::Harness comes in. Instead of inspecting the output with your eyes, you should be able to use a Perl program to do that for you: extract, report, and all of that.
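A TAP consumer needs surprisingly little machinery. Here’s a minimal sketch of one (my own toy, not how Test::Harness actually works): it reads TAP lines, counts passes and failures, and checks the count against the plan:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# summarize(@tap_lines) - a toy TAP consumer that turns raw TAP
# lines into a one-line report
sub summarize {
	my @lines = @_;
	my( $planned, $ran, $failed ) = ( undef, 0, 0 );

	foreach ( @lines ) {
		if(    /\A1\.\.(\d+)/ ) { $planned = $1 }      # the plan
		elsif( /\Anot ok\b/ )   { $ran++; $failed++ }  # a failure
		elsif( /\Aok\b/ )       { $ran++ }             # a pass
		# everything else (diagnostics, noise) is ignored
		}

	return "No plan!"                      unless defined $planned;
	return "Failed $failed/$ran tests"     if $failed;
	return "Planned $planned but ran $ran" if $ran != $planned;
	return "All $ran tests successful";
	}

print summarize( "1..2\n", "ok 1\n", "not ok 2\n" ), "\n";
```

Run a test script and pipe its output through something like this, and you have the heart of make test.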

Eventually there was a module, Test, which offered the convenience function ok() to handle the counting and output for you. This is part of the synopsis for the module:

use Test;

# use a BEGIN block so we print our plan before MyModule is loaded
BEGIN { plan tests => 10, todo => [3,4] }

ok(0); # failure
ok(1); # success

ok(0); # ok, expected failure (see todo list, above)
ok(1); # surprise success!

ok(0,1);             # failure: '0' ne '1'
ok('broke','fixed'); # failure: 'broke' ne 'fixed'
ok('fixed','fixed'); # success: 'fixed' eq 'fixed'
ok('fixed',qr/x/);   # success: 'fixed' =~ qr/x/

ok(sub { 1+1 }, 2);  # success: '2' eq '2'
ok(sub { 1+1 }, 3);  # failure: '2' ne '3'

This made the test output much more interesting. The ok() outputs helpful diagnostic messages on failure to show what it was expecting and what it got:

1..10 todo 3 4;
# Running under perl version 5.014001 for darwin
# Current time local: Mon Aug  8 21:17:09 2011
# Current time GMT:   Tue Aug  9 02:17:09 2011
# Using Test.pm version 1.25_02
not ok 1
# Failed test 1 in test2.pl at line 6
#  test2.pl line 6 is: ok(0); # failure
ok 2
not ok 3
# Failed test 3 in test2.pl at line 9 *TODO*
#  test2.pl line 9 is: ok(0); # ok, expected failure (see todo list, above)
ok 4 # (test2.pl at line 10 TODO?!)
not ok 5
# Test 5 got: "0" (test2.pl at line 12)
#   Expected: "1"
#  test2.pl line 12 is: ok(0,1);             # failure: '0' ne '1'
not ok 6
# Test 6 got: "broke" (test2.pl at line 13)
#   Expected: "fixed"
#  test2.pl line 13 is: ok('broke','fixed'); # failure: 'broke' ne 'fixed'
ok 7
ok 8
ok 9
not ok 10
# Test 10 got: "2" (test2.pl at line 18)
#    Expected: "3"
#  test2.pl line 18 is: ok(sub { 1+1 }, 3);  # failure: '2' ne '3'

Notice the todo feature, which lets you mark tests that you know are going to fail, perhaps because you want to delay the work to make them pass; the harness doesn’t count them as real failures. This feature gets extended later, and the enhanced version is near the end of this Item.

Test evolved into Test::Simple, first released in 2001, when Schwern wanted something that made it easy to write tests. That eventually turned into Test::More, which comes in the same distribution and is now the basis for many test programs.

Now, TAP is documented in Test::Harness::TAP in the Test-Harness distribution.

Seeing the TAP yourself

Most people never see the TAP output. A TAP consumer, such as Test::Harness, intercepts the TAP output when you run make test or Build test. If you look at the output from one of those commands, you’ll probably see a line like this:

$ make test
...
PERL_DL_NONLAZY=1 /usr/local/perls/perl-5.10.1/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t

That test_harness() call takes the file list, t/*.t, runs those files, intercepts the output, and makes a summary, like this output from the YAML distribution:

$ make test
...
PERL_DL_NONLAZY=1 /Users/brian/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'inc', 'blib/lib', 'blib/arch')" t/*.t
t/2-scalars.t ........ ok   
t/basic-tests.t ...... ok   
t/bugs-emailed.t ..... ok     
...   
All tests successful.
Files=34, Tests=452,  7 wallclock secs ( 0.20 usr  0.13 sys +  4.98 cusr  0.55 csys =  5.86 CPU)
Result: PASS

In that output, Test::Harness summarizes the output from each test file and hides the rest.

If you want to see the output that Test::Harness hides from you, you can run the tests in verbose mode.

If the distribution uses Makefile.PL, you can add TEST_VERBOSE=1 to the command line. Here’s a run from the same YAML distribution. You can see that each ok has a test number, which allows the harness to summarize which tests failed:

$ make test TEST_VERBOSE=1
PERL_DL_NONLAZY=1 /Users/brian/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(1, 'inc', 'blib/lib', 'blib/arch')" t/*.t
t/2-scalars.t ........ 
1..9
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
ok 9
ok
t/basic-tests.t ...... 
1..4
ok 1 - A simple map
ok 2 - Common String Types
ok 3 - Multiple documents
ok 4 - Comments
ok
t/bugs-emailed.t ..... 
1..25
ok 1 - Date: Tue, 03 Jan 2006 18:04:56 (n->y)
ok 2 - Date: Tue, 03 Jan 2006 18:04:56 (y->n)
...

If the distribution uses Build.PL, you can use the --verbose switch:

$ Build --verbose test
t/00-xml-entities-data.t .. 
1..8
ok 1 - use XML::Entities::Data;
ok 2 - Are any entity sets defined?
ok 3 - Do names() contain real function names?
ok 4 - Does all() return a hashref?
ok 5 - Does caching work?
ok 6 - Is every set a subset of all?
ok 7 - Does char2entity return a hashref?
ok 8 - Does char2entity return a reverse mapping of all?
ok
t/01-xml-entities.t ....... 
1..9
ok 1 - use XML::Entities;
ok 2 - Single entity in ASCII string
ok 3 - Single entity in UNICODE string

You can also run the files yourself without the harness. You can use the blib module to automatically add the right build directories to @INC:

$ perl -Mblib t/some_test.t

If you want to run tests like this, you need to ensure that you rebuild everything in case you changed something. You might do it as part of the same command line, choosing the right one for the distribution’s build system:

$ make; perl -Mblib t/some_test.t
$ Build; perl -Mblib t/some_test.t

You can also use the -v switch to prove (Item 87. Use prove for flexible test runs.). You can run all of the test scripts:

$ make; prove -b -v

Or you can run individual test scripts:

$ make; prove -b -v t/some_test.t
$ Build; prove -b -v t/some_test.t

Basic features

There are many features to TAP, but I’ll highlight just the most commonly used ones by using Test::More to create the output. You don’t need to run these scripts in a distribution. The programs are just here to create the right TAP output.

The plan

The plan tells the test harness how many tests to expect. You can tell Test::More how many tests you expect to run with the tests key in the import list. This script uses the Test::More functions pass() and fail():

# test.t
use Test::More tests => 2;

pass();
fail();

In the output, you see the plan, 1..2, as the first line of output. Test::More keeps track of the test numbers for you. The pass() outputs ok with the next test number. The fail() does the same with not ok:

$ perl test.t
1..2
ok 1
not ok 2
#   Failed test at test.t line 5.
# Looks like you failed 1 test of 2.

You don’t have to put the plan at the top, though. Later versions of TAP allow the plan to be at the end. Since the plan tells the harness how many tests to expect, a plan at the end works just as well: the harness still knows the script made it all the way through. If it doesn’t see a plan at the beginning or the end, it knows that something went wrong.

# test.t
use Test::More;

pass();
fail();

done_testing();

You might want to do this if you don’t know ahead of time how many tests you’ll run. You can use Test::More’s done_testing() at the end to signal that you made it all the way through. Now the plan is at the end:

$ perl test.t
ok 1
not ok 2
#   Failed test at test.t line 5.
1..2
# Looks like you failed 1 test of 2.

The done_testing() showed up in Test::More 0.88 and replaces the previous way to do this. Before that, you could specify no_plan in the import list to get the same effect:

# test.t
use Test::More 'no_plan';

pass();
fail();

Labels

TAP test lines can have labels, which make it easier for you to look at the output and know which line belongs to which test. In many of the Test::More functions, you can add an optional argument to specify the label:

# test.t
use Test::More 'no_plan';

pass('This should pass');
fail('This should fail');

The label goes after the test number and a hyphen:

$ perl test.t
ok 1 - This should pass
not ok 2 - This should fail
#   Failed test 'This should fail'
#   at test.t line 5.
1..2
# Looks like you failed 1 test of 2.

You should come up with a consistent policy for your labels. For instance, as in this example, the label says what should have happened if everything worked.

Diagnostics

TAP lines that start with a # are diagnostic lines. You can output anything you like as diagnostics to see extra information. You can output these yourself with Test::More’s diag:

use Test::More;

diag( 'Running pass now' );
pass( 'This should pass' );

diag( 'Running fail now' );
fail( 'This should fail' );

done_testing();

Your diagnostic lines show up on standard error, which TAP consumers ignore, instead of standard output, but you’ll usually see the two streams interleaved:

$ perl test.t
# Running pass now
ok 1 - This should pass
# Running fail now
not ok 2 - This should fail
#   Failed test at test.t line 7.
1..2
# Looks like you failed 1 test of 2.

Some people add diagnostic lines to their test scripts so they can see what you are running:

use Test::More;

diag( "This is $0 run by $^X $^V" );

diag( 'Running pass now' );
pass( 'This should pass' );

diag( 'Running fail now' );
fail( 'This should fail' );

done_testing();

Now your test output tells you what the person running the test used:

$ perl test.t
# This is test.t run by perl v5.14.1
# Running pass now
ok 1 - This should pass
# Running fail now
not ok 2 - This should fail
#   Failed test at test.t line 9.
1..2
# Looks like you failed 1 test of 2.

These diagnostic lines are also the basis of the more advanced features in Test::More:

use Test::More;

diag( "This is $0 run by $^X $^V" );

my $string = 'Buster Bean';
like( $string, qr/Mimi/, 'Matched the cat name' );

done_testing();

The like() passes if its first argument matches the pattern in its second argument. If it doesn’t match, it shows you what it got and what it expected:

$ perl test.t
# This is test.t run by perl v5.14.1
not ok 1 - Matched the cat name
#   Failed test 'Matched the cat name'
#   at test.t line 6.
#                   'Buster Bean'
#     doesn't match '(?^:Mimi)'
1..1
# Looks like you failed 1 test of 1.

Skipping tests

Sometimes you don’t want to run some tests, so you can skip them. You wrap those tests in a block that you label with SKIP. Inside that block, you call Test::More’s skip. You give skip a label and the number of tests that you won’t run:

# test.t
use Test::More;

pass('This should pass');

SKIP: {
	skip( 'This does not work yet', 2 );
	fail( 'This should fail' );
	pass( 'This should pass too' );
	}
	
done_testing();

Perl doesn’t even run those tests. The skip outputs enough ok lines to cover the number of tests you won’t run, using the label you gave it, then short-circuits the rest of the block:

$ perl test.t
ok 1 - This should pass
ok 2 # skip This does not work yet
ok 3 # skip This does not work yet
1..3

You might want to skip tests if some feature is missing from the platform, the person doesn’t have an optional module, and so on. If you don’t support a feature before Perl 5.10, you can skip those tests when someone runs your tests with an earlier perl:

# test.t
use Test::More;

pass('This should pass');

SKIP: {
	skip( 'This does not work before Perl 5.10', 2 ) if $] < 5.010;
	fail( 'This should fail' );
	pass( 'This should pass too' );
	}
	
done_testing();

When you run it with Perl 5.10 or later, the tests run:

$ perl5.14.1 test.t
ok 1 - This should pass
not ok 2 - This should fail
#   Failed test 'This should fail'
#   at test.t line 8.
ok 3 - This should pass too
1..3
# Looks like you failed 1 test of 3.

If you use an earlier perl, you skip the tests:

$ perl5.8.9 test.t
ok 1 - This should pass
ok 2 # skip This does not work before Perl 5.10
ok 3 # skip This does not work before Perl 5.10
1..3

You can also skip all of the tests in a file. The plan has a skip_all option:

# test.t
use Test::More;

plan skip_all => 'The time is odd' if time % 2;

pass('This should pass');

SKIP: {
	skip( 'This does not work before Perl 5.10', 2 ) if $] < 5.010;
	fail( 'This should fail' );
	pass( 'This should pass too' );
	}

done_testing();

On an odd second, none of the tests run, and there's a special plan that notes that you skipped all tests:

$ perl test.t
1..0 # SKIP The time is odd

A second later, the tests run because the script makes it past the skip_all:

$ perl test.t
ok 1 - This should pass
not ok 2 - This should fail
#   Failed test 'This should fail'
#   at test.t line 10.
ok 3 - This should pass too
1..3
# Looks like you failed 1 test of 3.

To do tests

Sometimes tests don't work. You may be in the middle of reworking some features that you haven't completely fixed, for instance. You don't want to change the tests, but you don't want them to really fail either, because you're ignoring them for a bit. You can mark them as "TO DO". The syntax looks like that for SKIP, but in this case, you still run the tests:

# test.t
use Test::More;

pass('This should pass');

TODO: {
	local $TODO = 'Oops, I did it again';
	fail( 'This should fail' );
	pass( 'This should pass too' );
	}

done_testing();

The TAP output has a directive after the label that marks it as "TODO". A test harness won't count those as hard failures (although it will warn if a "TODO" test passes):

$ perl test.t
ok 1 - This should pass
not ok 2 - This should fail # TODO Oops, I did it again
#   Failed (TODO) test 'This should fail'
#   at test.t line 8.
ok 3 - This should pass too # TODO Oops, I did it again
1..3

Things to remember

  • TAP is the Test Anything Protocol, a simple test report format
  • Label your tests so you can look at the raw output and identify tests
  • Use the Test::More module's functions to output the proper things