[Israel.pm] why does upper-casing cost more than lower-casing?

Omer Zak w1 at zak.co.il
Thu Feb 23 18:18:24 PST 2012


Maybe because there are 2 characters to change to lowercase and 25
characters to change to uppercase?

Thinking in terms of a machine code implementation:
If the string is Unicode, then it is not feasible to look up every
character in a single lookup table that returns either the same
character unchanged or its lc/uc version.
Instead, one must test whether the character falls in a range and, if
so, translate it to its lc/uc version using a small lookup table and/or
arithmetic.
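
To make the two strategies concrete, here is a minimal sketch in Perl.
The helper names uc_by_table and uc_by_range are hypothetical, both
handle only ASCII letters, and neither is meant to describe how perl
implements uc() internally.

```perl
use strict;
use warnings;

# Strategy 1: one table entry per possible character value.
# Only feasible when every character fits in a single byte (0..255).
my @UC_TABLE = (0 .. 255);
$UC_TABLE[$_] = $_ - 0x20 for ord('a') .. ord('z');

sub uc_by_table {
    my ($s) = @_;
    # Assumes every character of $s has a code point below 256.
    return join '', map { chr $UC_TABLE[ ord $_ ] } split //, $s;
}

# Strategy 2: test whether the character is in a range, then use
# arithmetic. Works for any code point, at the cost of a comparison
# per character.
sub uc_by_range {
    my ($s) = @_;
    return join '', map {
        my $c = ord $_;
        ( $c >= 0x61 && $c <= 0x7A ) ? chr( $c - 0x20 ) : $_;
    } split //, $s;
}

print uc_by_range('Phone: abdsx'), "\n";    # prints "PHONE: ABDSX"
```

A real Unicode implementation needs many such ranges plus special
cases, which is why the per-character cost can differ between lc and
uc.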

You may want to rerun the benchmark with an ISO-8859-1 string, for
which a 256-entry lookup table can be used unconditionally.
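
One way to set up that comparison, as a sketch only: keep one copy of
the string in perl's native 8-bit representation and upgrade a second
copy to the internal UTF-8 representation with utf8::upgrade. The
variable names and the 1-CPU-second run time per case are my own
choices, and whether perl really takes different code paths for the two
copies is exactly what the numbers would have to show.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same test string as in the quoted benchmark (as it appears in the archive).
my $bytes = 'Phone: 054-8765434 Email: nabu at lknhy4564.com 054-3232 888-22222 abdsx';

# Same characters, but stored internally as UTF-8.
my $wide = $bytes;
utf8::upgrade($wide);

# A negative count means "run each case for at least that many CPU seconds".
cmpthese(
    -1,
    {   lc_bytes => sub { my $r = lc $bytes },
        uc_bytes => sub { my $r = uc $bytes },
        lc_wide  => sub { my $r = lc $wide },
        uc_wide  => sub { my $r = uc $wide },
    }
);
```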

Another test is to rerun the benchmark using a string with a different
uppercase/lowercase ratio.
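
A possible way to vary the ratio, sketched under the assumption that
the three extreme cases (original mix, all lowercase, all uppercase)
are enough to show a trend. The hash names %input and %tests are
hypothetical.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $orig = 'Phone: 054-8765434 Email: nabu at lknhy4564.com 054-3232 888-22222 abdsx';

# Three inputs that differ only in the case of their letters.
my %input = (
    mixed     => $orig,
    all_lower => lc $orig,
    all_upper => uc $orig,
);

# Build one lc and one uc test per input.
my %tests;
for my $name ( sort keys %input ) {
    my $s = $input{$name};
    $tests{"lc_$name"} = sub { my $r = lc $s };
    $tests{"uc_$name"} = sub { my $r = uc $s };
}

# Run each case for at least 1 CPU second and print the comparison table.
cmpthese( -1, \%tests );
```

If the cost depends on how many characters actually change, then
uc_all_upper and lc_all_lower should be the cheapest cases.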

--- Omer


On Fri, 2012-02-24 at 03:51 +0200, Nathan Abu wrote:
> Hi,
> 
> 
> I've been wondering why uc takes almost twice as long as lc.
> 
> 
> code:
> 
> 
> #!/usr/bin/perl
> use warnings;
> use strict;
> use Benchmark qw(cmpthese timethese :hireswallclock);
> my $string  = 'Phone: 054-8765434 Email: nabu at lknhy4564.com 054-3232 888-22222 abdsx';
> 
> 
> cmpthese(
>     timethese(
>         50_000_000,
>         {   to_lower => sub { lc($string); },
>             to_upper => sub { uc($string); }
>         }
>     )
> )
> 
> 
> which results in:
> Benchmark: timing 50000000 iterations of to_lower, to_upper...
>   to_lower: 8.45244 wallclock secs ( 8.45 usr + -0.00 sys =  8.45 CPU) @ 5917159.76/s (n=50000000)
>   to_upper: 15.321 wallclock secs (15.31 usr +  0.00 sys = 15.31 CPU) @ 3265839.32/s (n=50000000)
>               Rate to_upper to_lower
> to_upper 3265839/s       --     -45%
> to_lower 5917160/s      81%       --

-- 
May the holy trinity of  $_, @_ and %_ be hallowed.
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html


