[Israel.pm] What is the best way to compare huge arrays?

manora manora at netvision.net.il
Sun Apr 17 04:27:20 PDT 2005


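My question is which of the two styles below is more Perlish, more
memory efficient, and faster when comparing two huge arrays to find the
elements common to both (called "union" in my code, although strictly
speaking it is the intersection) and the elements unique to each array.
Here is my sample code; better solutions are very welcome.
Thanks, Arik Manor.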

use strict;
use warnings FATAL => 'all';

my @arr1 = qw(aa bb cc);
my @arr2 = qw(aa cc dd ee);

# style 1
my ($union, $uniq1, $uniq2) = compare_arrays_1( \@arr1, \@arr2);
print "style_1:\nunion=@$union\nuniq1=@$uniq1\nuniq2=@$uniq2\n";

# style 2
my (@union, @uniq1, @uniq2) = ();
compare_arrays_2( \@arr1, \@arr2, \@union, \@uniq1, \@uniq2);
print "\nstyle_2:\nunion=@union\nuniq1=@uniq1\nuniq2=@uniq2\n";

sub compare_arrays_1{
  my ($arr1, $arr2) = @_;
  my (%union, %uniq1, %uniq2) = ();
  foreach my $cell(@$arr1) { $uniq1{$cell}++ }
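  foreach my $cell(@$arr2) {
    # also test %union, so a common value repeated in @$arr2
    # is not misfiled as unique to @$arr2 after the delete below
    if ( $uniq1{$cell} || $union{$cell} ) {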
      $union{$cell}++;
      delete $uniq1{$cell};
    } else {
      $uniq2{$cell}++;
    }
  }
  my @union = keys %union;
  my @uniq1 = keys %uniq1;
  my @uniq2 = keys %uniq2;
  return ( \@union, \@uniq1, \@uniq2);
}

sub compare_arrays_2{
  my ($arr1, $arr2, $union, $uniq1, $uniq2) = @_;
  my (%union, %uniq1, %uniq2) = ();
  foreach my $cell(@$arr1) { $uniq1{$cell}++ }
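  foreach my $cell(@$arr2) {
    # also test %union, so a common value repeated in @$arr2
    # is not misfiled as unique to @$arr2 after the delete below
    if ( $uniq1{$cell} || $union{$cell} ) {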
      $union{$cell}++;
      delete $uniq1{$cell};
    } else {
      $uniq2{$cell}++;
    }
  }
  @$union = keys %union;
  @$uniq1 = keys %uniq1;
  @$uniq2 = keys %uniq2;
}
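
For comparison, here is a rough sketch of a third variant that keeps a
single hash and tags each value with a bit mask (1 = seen in the first
array, 2 = seen in the second). The sub name compare_arrays_flags and
the names @common, @only1 and @only2 are made up for this illustration
and are not part of the code above. I have not benchmarked it against
either style, so treat it as something to measure rather than as a
known improvement. It does not care about duplicates in either array.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
# Tags every distinct value with a bit mask:
#   1 = seen in the first array, 2 = seen in the second, 3 = seen in both.
sub compare_arrays_flags {
    my ($arr1, $arr2) = @_;
    my %seen;
    $seen{$_} = ( $seen{$_} || 0 ) | 1 for @$arr1;
    $seen{$_} = ( $seen{$_} || 0 ) | 2 for @$arr2;

    # Sort the distinct values into three lists according to their mask.
    my (@common, @only1, @only2);
    while ( my ($value, $mask) = each %seen ) {
        if    ( $mask == 3 ) { push @common, $value }
        elsif ( $mask == 1 ) { push @only1,  $value }
        else                 { push @only2,  $value }
    }
    return ( \@common, \@only1, \@only2 );
}

my @arr1 = qw(aa bb cc);
my @arr2 = qw(aa cc dd ee aa);   # note the repeated "aa"

my ($common, $only1, $only2) = compare_arrays_flags( \@arr1, \@arr2 );

# Hash order is not defined, so sort before printing.
print "common=@{[ sort @$common ]}\n";   # common=aa cc
print "only1=@{[ sort @$only1 ]}\n";     # only1=bb
print "only2=@{[ sort @$only2 ]}\n";     # only2=dd ee
```

Whether the results are returned as references (style 1) or written
into arrays supplied by the caller (style 2) is independent of this
change; the same sub body could be adapted to either interface.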



