Make deep copies

When you want to make a completely disconnected copy of a hash or an array, it’s not enough to merely assign it to a new variable name, at least in the general case:

my @copy = @original;

New Perl programmers pick up this bad habit because they deal only with flat arrays, where every value is a simple scalar, so they never see the problem:

my @original = qw( Buster Ginger Mimi Ella );

my @copy     = @original;

If you know that you have only a flat array, that’s not a problem. However, the experienced programmer knows that “you know” is often the source of misguided assumptions that blow up later.

At the Learning Perl level, we conveniently skirt the issue because we don’t talk about references until Intermediate Perl. That’s right: blame me.

Consider the case where one of the array elements is a reference. As long as you don’t change the values, you don’t see the problem:

my $buster      = { 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	};
my @original = ( $buster, qw( Ginger Mimi Ella ) );

my @copy     = @original;


printf "In \@original, the first cat's name is %s\n", $original[0]->{name};
printf "In \@copy, the first cat's name is %s\n", $copy[0]->{name};

The output doesn’t show anything out of place:

In @original, the first cat's name is Buster
In @copy, the first cat's name is Buster

Now change one of the values in @copy:

my $buster      = { 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	};
my @original = ( $buster, qw( Ginger Mimi Ella ) );

my @copy     = @original;

$copy[0]->{name} = 'Roscoe';

printf "In \@original, the first cat's name is %s\n", $original[0]->{name};
printf "In \@copy, the first cat's name is %s\n", $copy[0]->{name};

The output now shows both arrays were affected even though you only wanted to change @copy:

In @original, the first cat's name is Roscoe
In @copy, the first cat's name is Roscoe
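The array assignment copies each element's value, and the value of a reference is only a pointer to the underlying data. Here is a minimal sketch, using the same data, that checks this directly. Comparing two references with == tells you whether they point to the same thing; the $same_hash variable is only for illustration:

```perl
use strict;
use warnings;

my $buster = {
	name   => 'Buster',
	colors => [ qw(black white) ],
	};
my @original = ( $buster, qw( Ginger Mimi Ella ) );
my @copy     = @original;

# Two references are numerically equal when they point to the same data
my $same_hash = $original[0] == $copy[0];
printf "Same hash? %s\n", $same_hash ? 'yes' : 'no';

# The plain strings were copied by value, so they stay independent
$copy[1] = 'Roscoe';
printf "In \@original, the second cat's name is %s\n", $original[1];
```

The plain strings behave the way you expected all along. Only the elements that are references are shared between the two arrays.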

Instead of an assignment, which is the source of your problem, make a deep copy. There are several ways that you can do this. The dclone function (for deep clone) from Storable (part of the Standard Library) can do it for you:

use Storable qw(dclone);

my $buster      = { 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	};
my @original = ( $buster, qw( Ginger Mimi Ella ) );

my $copy     = dclone \@original;

$copy->[0]{name} = 'Roscoe';

printf "In \@original, the first cat's name is %s\n", $original[0]->{name};
printf "In \@copy, the first cat's name is %s\n", $copy->[0]{name};

Now the output shows that the two arrays remained distinct:

In @original, the first cat's name is Buster
In @copy, the first cat's name is Roscoe
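To see what a deep copy involves, here is a rough sketch of a hand-rolled version. The deep_copy subroutine is a hypothetical helper, not part of any module, and it makes some loud assumptions: it handles only plain scalars, array references, and hash references, and it does not cope with objects, code references, or circular structures. Like dclone, it takes a reference and returns a reference:

```perl
use strict;
use warnings;

# deep_copy is a hypothetical helper for illustration only
sub deep_copy {
	my( $data ) = @_;
	my $type = ref $data;

	# A plain scalar has no reference type, so return it by value
	return $data unless $type;

	# Copy each array element recursively into a new anonymous array
	return [ map { deep_copy( $_ ) } @$data ] if $type eq 'ARRAY';

	# Copy each hash value recursively into a new hash
	if( $type eq 'HASH' ) {
		my %copy;
		$copy{$_} = deep_copy( $data->{$_} ) for keys %$data;
		return \%copy;
		}

	# Objects, code references, and everything else are out of scope
	die "deep_copy does not handle $type references\n";
	}

my $buster   = { name => 'Buster', colors => [ qw(black white) ] };
my @original = ( $buster, qw( Ginger Mimi Ella ) );
my $copy     = deep_copy( \@original );

# Change the copy at two different levels of nesting
$copy->[0]{name} = 'Roscoe';
push @{ $copy->[0]{colors} }, 'orange';

printf "In \@original, the first cat's name is %s\n", $original[0]{name};
printf "In \@original, the first cat has %d colors\n",
	scalar @{ $original[0]{colors} };
```

For anything beyond simple data like this, dclone is the safer choice.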

You had to make one subtle change to your program, however. dclone takes a reference and returns a reference, so you didn’t copy @original to another array variable. That’s not very nice. You could copy the de-referenced value:

my @copy     = @{ dclone \@original };

That’s not very pretty either. You could hide that in a subroutine, but the easiest thing is to just give over to references and use them everywhere from the start (but that’s another Item for another time):

use Storable qw(dclone);

my $buster      = { 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	};
my $original = [ $buster, qw( Ginger Mimi Ella ) ];

my $copy     = dclone $original;

$copy->[0]{name} = 'Roscoe';

printf "In \$original, the first cat's name is %s\n", $original->[0]{name};
printf "In \$copy, the first cat's name is %s\n", $copy->[0]{name};
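If you would rather keep named arrays, the idea of hiding the dereference in a subroutine might look like the following sketch. The name clone_list is made up for this example; it wraps dclone and hands back a list instead of a reference:

```perl
use strict;
use warnings;
use Storable qw(dclone);

# clone_list is a hypothetical wrapper: it deep copies its arguments
# and returns them as a list, so call it in list context
sub clone_list { @{ dclone( [ @_ ] ) } }

my $buster   = { name => 'Buster', colors => [ qw(black white) ] };
my @original = ( $buster, qw( Ginger Mimi Ella ) );

my @copy     = clone_list( @original );

$copy[0]{name} = 'Roscoe';

printf "In \@original, the first cat's name is %s\n", $original[0]{name};
printf "In \@copy, the first cat's name is %s\n", $copy[0]{name};
```

The anonymous array around @_ gives dclone the single reference it expects, and the subroutine dereferences the result so the caller never sees it.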

Set aside the issue of named variables to consider a more common case, where the reference element in your array is an object. Here’s a simple class that implements a Cat:

BEGIN {
package Cat;

sub new {
	my( $class, %hash ) = @_;
	bless \%hash, $class;
	}
	
sub set_name { $_[0]->{name} = $_[1] }
sub get_name { $_[0]->{name} }
}

Converting your previous problematic program to use Cat, you have:

my $buster      = Cat->new( 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	);
my $original = [ $buster, qw( Ginger Mimi Ella ) ];

my $copy     = [ @$original ];

$copy->[0]->set_name( 'Roscoe' );

printf "In \$original, the first cat's name is %s\n", $original->[0]->get_name;
printf "In \$copy, the first cat's name is %s\n", $copy->[0]->get_name;

Again, the output shows that you changed “both” objects:

In $original, the first cat's name is Roscoe
In $copy, the first cat's name is Roscoe

Using dclone solves the problem:

use Storable qw(dclone);

my $buster      = Cat->new( 
	name => 'Buster', 
	colors => [ qw(black white) ] 
	);
my $original = [ $buster, qw( Ginger Mimi Ella ) ];

my $copy     = dclone $original;

$copy->[0]->set_name( 'Roscoe' );

printf "In \$original, the first cat's name is %s\n", $original->[0]->get_name;
printf "In \$copy, the first cat's name is %s\n", $copy->[0]->get_name;

Now you have two separate objects, each with distinct values:

In $original, the first cat's name is Buster
In $copy, the first cat's name is Roscoe

There’s an additional problem here, though. Simply cloning an object might not be the right thing. What if the object has database connections, open filehandles, or other bits that should be properly re-initialized? What if the object is supposed to be a singleton, so a clone leaves you with a second instance that should never exist? You can add a clone method to your class to handle that, and we’ll cover that in a later Item.
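Until then, here is one rough sketch of what such a method could look like for Cat. This is only an assumption about one possible design, not the approach the later Item takes: the clone method is hypothetical, it deep copies the object's plain data with Storable::dclone, and it passes that data back through new so the class decides how a fresh object gets set up:

```perl
use strict;
use warnings;
use Storable ();

BEGIN {
package Cat;

sub new {
	my( $class, %hash ) = @_;
	bless \%hash, $class;
	}

sub set_name { $_[0]->{name} = $_[1] }
sub get_name { $_[0]->{name} }

# Hypothetical clone method: deep copy an unblessed copy of the data,
# then rebuild the object through the constructor
sub clone {
	my( $self ) = @_;
	my $data = Storable::dclone( { %$self } );
	return ref( $self )->new( %$data );
	}
}

my $buster = Cat->new(
	name   => 'Buster',
	colors => [ qw(black white) ]
	);

my $roscoe = $buster->clone;
$roscoe->set_name( 'Roscoe' );

printf "The original cat's name is %s\n", $buster->get_name;
printf "The clone's name is %s\n", $roscoe->get_name;
```

Because the copy goes through new, any re-initialization you later add to the constructor applies to clones too. A singleton class could instead have its clone method return the object itself.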
