[Israel.pm] Re: optimising memory usage

Yosef Meller mellerf at netvision.net.il
Mon Jan 5 12:55:45 PST 2004


On Mon, 2004-01-05 at 14:54, Shlomo Yona wrote:
> On Mon, 5 Jan 2004, Yosef Meller wrote:
> 
> > Is it important for all the input files to be concatenated? If not, you
> > can just fetch the file names and then process each at a time.
> > If it is, you can still process each at a time and then sum the results
> > from all files into one (however then sequences at ends of files will
> > not be concatenated to starts of files).
> 
> That is actually good advice.
> I did that (see the other emails I sent on this thread last
> night), and of course, it made a great improvement.
> 

Yeah, I missed that 'cause it took me a while to write the packing
script.
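
For reference, here is a minimal sketch of the per-file approach quoted
above. It assumes whitespace-separated words and UTF-8 input files; the
count_ngrams() helper and the choice of bigrams are made up for
illustration and are not the code from the thread. The window is reset
for every file, so sequences at the end of one file are not joined to
the start of the next.

```perl
use strict;
use warnings;

# Hypothetical helper: count word n-grams in one file and add them
# to a shared hash of totals.
sub count_ngrams {
    my ($path, $n, $totals) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open $path: $!";
    my @window;                        # fresh for every file
    while (my $line = <$fh>) {
        for my $word (split ' ', $line) {
            push @window, $word;
            shift @window if @window > $n;
            $totals->{"@window"}++ if @window == $n;
        }
    }
    close $fh;
    return;
}

# Process the files named on the command line one at a time, so only
# the totals hash stays in memory between files.
my %totals;
count_ngrams($_, 2, \%totals) for @ARGV;

binmode STDOUT, ':encoding(UTF-8)';
print "$_\t$totals{$_}\n" for sort keys %totals;
```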

> 
> > Inspired by the code of Acme::Bleach, I found a way to reduce the size
> > of your hash keys by packing them with single bits instead of space
> > characters. 
> 
> Ahhh! Wait!
> pack() and unpack() don't work well (please correct me if I'm
> wrong) with UTF8 encoding.
> 

See the 2nd paragraph prior to the script in my post...
Also, pack can be used to convert any string to byte-strings.
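
To illustrate the byte-string point, here is a small sketch (not the
script from my post). It assumes the Encode module is used to turn a
Hebrew character string into UTF-8 bytes first; after that, unpack()
and pack() with the 'b*' template work on plain bytes and round-trip
cleanly. The sample key is only an example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Sample key: the Hebrew word "shalom", written with escapes so the
# source file itself stays plain ASCII.
my $key   = "\x{05E9}\x{05DC}\x{05D5}\x{05DD}";   # 4 characters
my $bytes = encode('UTF-8', $key);                # 8 bytes

# Expand the bytes into a string of '0'/'1' digits, 8 per byte,
# then pack the digits back into bytes and decode them again.
my $bits     = unpack('b*', $bytes);
my $restored = decode('UTF-8', pack('b*', $bits));

print length($key), " characters, ", length($bytes), " bytes, ",
      length($bits), " bits\n";
print $restored eq $key ? "round trip ok\n" : "round trip failed\n";
```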

> Another problem -- I'm using keys representing strings which
> are in many many cases longer than 8 characters. So...
> thanks, but I need something more robust. Anyway, your idea
> is nice and useful, had I been working on English text, for
> example. As I'm handling Hebrew, and as I'm doing statistics
> on N-Grams which make up keys larger than 8 characters/bytes
> I need some other strategy.
> 

This time it's the 1st paragraph prior to the script.

> Thanks again.

No, thank you. It was really a fun thing to do. I never thought a module
in the Acme:: namespace could turn out to be useful.

Yosef Meller.



