[Israel.pm] count chars

Amir E. Aharoni amir.aharoni at gmail.com
Mon Sep 3 05:08:16 PDT 2007


On 03/09/07, Yossi Itzkovich <Yossi.Itzkovich at ecitele.com> wrote:
> I don't have a better solution, but I don't think it saves anything in
> run time: not speed nor memory (assuming @matches goes out of scope very
> soon).
> To be able to return the scalar of a list, the list still needs to be
> built.

Of course, but maybe there's a way to do it without a list at all? Not
even an anonymous list ... just a counter.
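
A minimal sketch of two list-free ways to count, separate from the benchmark further down and not measured here. It assumes a single literal character is being counted; the sample string and the variable names are made up for illustration. tr/// with an empty replacement list returns the number of matching characters without changing the string, and a scalar-context m//g in a while loop needs only a counter.

```perl
use strict;
use warnings;

# Illustrative sample string (an assumption, much shorter than the benchmark's).
my $string = "Abracadabra" x 1000;

# tr/// with an empty replacement list and no /d modifier leaves the
# string unchanged and returns how many characters it matched.
my $tr_count = ($string =~ tr/a//);

# In scalar context, m//g finds one match per evaluation and resumes
# from the previous position, so no list of matches is collected.
my $loop_count = 0;
$loop_count++ while $string =~ /a/g;

print "tr count:   $tr_count\n";
print "loop count: $loop_count\n";
```

Note that tr/// only counts individual characters, while the while loop works for any pattern.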

This got me curious, so I decided to measure it.

I'm not an expert in Perl benchmarking, but I ran a very crude
benchmark of my own (see the code at the end).

First I build a very long string. Then I count the occurrences of a
character using m//g ("array counting") and then using s///g ("subst
counting"). I measured the time using the `time' builtin function. To
measure memory I simply watched Windows Task Manager. (Can anyone
recommend a Perlish way to measure a program's memory usage?)
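
The builtin time returns whole seconds, so short runs are hard to compare with it. A rough sketch of the same two measurements with sub-second resolution, using the core Time::HiRes module, could look like this. The timed() helper, the shorter string, and the separate copy for each method are assumptions made for illustration, not what the benchmark at the end does.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # time() now returns fractional seconds

# Hypothetical helper: run a code ref once and return its result
# together with the elapsed wall-clock time in seconds.
sub timed {
    my ($code) = @_;
    my $start  = time();
    my $result = $code->();
    return ($result, time() - $start);
}

# Shorter than the benchmark's string, so the sketch runs quickly.
my $string = "Abracadabra" x 100_000;

# Each method gets its own copy, so neither run touches the other's string.
my $array_input = $string;
my $subst_input = $string;

my ($array_count, $array_secs) = timed(sub {
    my @matches = ($array_input =~ /a/g);   # builds the full list of matches
    return scalar @matches;
});

my ($subst_count, $subst_secs) = timed(sub {
    # s///g returns the number of substitutions it made
    return scalar($subst_input =~ s/(a)/$1/g);
});

printf "array: %d matches in %.3f seconds\n", $array_count, $array_secs;
printf "subst: %d matches in %.3f seconds\n", $subst_count, $subst_secs;
```

This only addresses timing resolution; it does not measure memory.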

The results are quite interesting.

When I run array first and then subst, the array count takes 27
seconds and the subst count takes 21 seconds.

When I run subst first and then array, the subst count takes 45
seconds and the array count takes only 4 seconds!

My wild guess is that it happens because the pattern matching is
optimized after the first run and that different regex operations
produce different optimizations. Can anyone provide a more precise
explanation?

In both cases it seems that array counting uses much more memory.

Here's the code. I am in a bit of a hurry, and I hope that it doesn't
have stupid bugs:

use strict;
use warnings;

my $string = "Abracadabra";

my $time_begin_string = time;

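# Double the string's length on each pass: 11 chars * 2**20 is about
# 11.5 million chars. ($string++ yields the old value and applies
# Perl's magic string increment to $string.)
for (1 .. 20) {
    $string .= $string++;
}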

my $time_end_string = time - $time_begin_string;
print "time to build string: $time_end_string\n";

###########################

sleep 5;

print "starting array\n";

my $time_begin_array = time;

my @matches = ($string =~ /a/g);

my $array_matches = scalar @matches;
print "matches with array: $array_matches\n";

my $time_end_array = time - $time_begin_array;
print "time to search with array: $time_end_array\n";

###########################

sleep 5;

print "starting subst\n";

my $time_begin_subst = time;

my $subst_matches = ($string =~ s/(a)/$1/g);

print "matches with subst: $subst_matches\n";

my $time_end_subst = time - $time_begin_subst;
print "time to search with subst: $time_end_subst\n";


