[Israel.pm] Re: optimising memory usage
mellerf at netvision.net.il
Mon Jan 5 12:55:45 PST 2004
On Mon, 2004-01-05 at 14:54, Shlomo Yona wrote:
> On Mon, 5 Jan 2004, Yosef Meller wrote:
> > Is it important for all the input files to be concatenated? If not, you
> > can just fetch the file names and then process each at a time.
> > If it is, you can still process each at a time and then sum the results
> > from all files into one (however then sequences at ends of files will
> > not be concatenated to starts of files).
> That is actually good advice.
> I did that (see the other emails I sent on this thread last
> night), and of course, it made a great improvement.
Yeah, I missed that 'cause it took me a while to write the packing
> > Inspired by the code of Acme::Bleach, I found a way to reduce the size
> > of your hash keys by packing them with single bits instead of space
> > characters.
> Ahhh! Wait!
> pack() and unpack() don't work well (please correct me if I'm
> wrong) with UTF8 encoding.
See the 2nd paragraph prior to the script in my post...
Also, pack can be used to convert any string to byte-strings.
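
To illustrate the byte-string point, here is a minimal sketch. It uses the
core Encode module rather than pack itself to get the bytes, and the sample
key is made up for the example, so treat it as an assumption about how the
keys look rather than as the script from my earlier post. Once the key is a
UTF-8 byte string, pack and unpack only ever see plain octets:

```perl
use strict;
use warnings;
use utf8;                      # the string literal below is written in UTF-8
use Encode qw(encode decode);

# Hypothetical key: a Hebrew word held as a Perl character string.
my $key = "שלום";

# Encode to a UTF-8 byte string so pack/unpack operate on plain octets.
my $bytes = encode('UTF-8', $key);

# Round-trip through a bit string, as a bit-level packer would.
my $bits     = unpack('B*', $bytes);
my $restored = pack('B*', $bits);

# Decode back to characters and check that nothing was lost.
my $back = decode('UTF-8', $restored);
print length($key), " characters, ", length($bytes), " bytes\n";
print $back eq $key ? "round trip ok\n" : "round trip failed\n";
```

Each Hebrew letter takes two bytes in UTF-8, so the four-character key
becomes eight bytes, and decoding after unpacking gives the original back.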
> Another problem -- I'm using keys representing strings which
> are in many, many cases longer than 8 characters. So...
> thanks, but I need something more robust. Anyway, your idea
> is nice and useful, had I been working on English text, for
> example. As I'm handling Hebrew, and as I'm doing statistics
> on N-Grams which make up keys larger than 8 characters/bytes
> I need some other strategy.
This time it's the 1st paragraph prior to the script.
> Thanks again.
No, thank you. It was really a fun thing to do. I never thought a module
in the Acme:: namespace could turn out to be useful.
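
For the archive, the file-at-a-time idea from the top of this message could
be sketched roughly as below. This is not Shlomo's actual code: the
subroutine name, the choice of word bigrams, and reading the file names from
@ARGV are all assumptions made for the example. It reads one line at a time
and sums the counts from every file into one hash, so N-grams that span a
line or file boundary are not counted.

```perl
use strict;
use warnings;

# Hypothetical helper: count word N-grams in one file and add them to
# the shared totals. Only one line is held in memory at a time.
sub add_ngram_counts {
    my ($path, $n, $totals) = @_;
    open my $fh, '<:encoding(UTF-8)', $path
        or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        my @words = split ' ', $line;
        # The range is empty when the line has fewer than $n words.
        for my $i (0 .. $#words - $n + 1) {
            $totals->{ join ' ', @words[$i .. $i + $n - 1] }++;
        }
    }
    close $fh;
    return;
}

# Process each file named on the command line in turn, summing into %totals.
my %totals;
add_ngram_counts($_, 2, \%totals) for @ARGV;

binmode STDOUT, ':encoding(UTF-8)';
printf "%s\t%d\n", $_, $totals{$_} for sort keys %totals;
```

The hash of totals is still the part that grows, which is why shrinking the
keys matters even after the input is handled one file at a time.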