[Israel.pm] Memory problem - Loading big files

Omer Zak w1 at zak.co.il
Thu Jun 19 04:33:45 PDT 2008


I would say that the problem is with the algorithm, which requires Assaf
to load the entire file into memory.  What happens if Assaf's business
grows and he has to deal with a 2.5GB file?

I suggest that the algorithm be reviewed; maybe another algorithm that
accomplishes the same objective can be identified.  After all, computer
scientists have been dealing with space and time tradeoffs for decades.
Could Assaf post his algorithm, so that we can review it and give useful
advice about reducing its memory requirements?

If the data can be processed sequentially, then process the file line by
line.
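
A minimal sketch of what I mean, assuming (only for illustration) that
the task is to sum one numeric field of every record.  The sub name
sum_field and the choice of field are hypothetical; Assaf's real
processing would replace the body of the loop.

```perl
use strict;
use warnings;

# Sum one tab-separated field over the whole file.
# Only the current line is held in memory, so memory use does not
# grow with the size of the file.
sub sum_field {
    my ($path, $index) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $total = 0;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        # Skip records where the requested field is missing or empty.
        next unless defined $fields[$index] && length $fields[$index];
        $total += $fields[$index];
    }
    close $fh;
    return $total;
}
```

It would be called as sum_field('probes.txt', 2) to sum the third field;
the file name is made up.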

If not, consider loading its contents into a DB and using DB facilities
to access random parts of the file.
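
A rough sketch of the DB route, assuming the DBI and DBD::SQLite modules
are installed (neither is part of core Perl).  The table name probes and
the column names f1..f8 are hypothetical; the eight columns follow the
eight tab-separated fields Assaf described.

```perl
use strict;
use warnings;
use DBI;

# Load a tab-separated file into an SQLite table, one line at a time,
# and return the open database handle for later queries.
sub load_into_db {
    my ($path, $dbfile) = @_;
    my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile", '', '',
                           { RaiseError => 1, AutoCommit => 0 });
    my @cols = map { "f$_" } 1 .. 8;
    $dbh->do('CREATE TABLE IF NOT EXISTS probes ('
             . join(', ', map { "$_ TEXT" } @cols) . ')');
    my $sth = $dbh->prepare('INSERT INTO probes VALUES ('
                            . join(', ', ('?') x 8) . ')');
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;
        $#fields = 7;             # pad or truncate to exactly eight fields
        $sth->execute(@fields);   # padded (undef) fields are stored as NULL
    }
    close $fh;
    $dbh->commit;                 # one transaction for the whole load
    return $dbh;
}
```

Afterwards a single record can be fetched without holding the rest in
memory, for example with
$dbh->selectrow_array('SELECT f2 FROM probes WHERE f1 = ?', undef, $key).
An index on the lookup column would be needed for this to be fast on a
big table.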

If you MUST load the entire file into memory, then Perl is not the tool
for you.  Use C/C++ functions for this purpose.  If you still want to
script in Perl, then build a shared library (DLL) for managing the big
file and use SWIG or a similar tool to build an interface to it from
Perl.

                                                --- Omer
P.S.: Assaf, if your algorithm is not for public knowledge and you would
like consulting on tuning it, contact me privately.

On Thu, 2008-06-19 at 14:14 +0300, Yossi Itzkovich wrote:
>  Hi,
> 
> The problem is not in the file.  Your @probes array has as many items as there are lines in the file.
> For each line you also have an additional array...
> 
> Yossi
> 
> -----Original Message-----
> From: perl-bounces at perl.org.il [mailto:perl-bounces at perl.org.il] On Behalf Of Assaf Gordon
> Sent: Tuesday, June 17, 2008 2:36 AM
> To: perl at perl.org.il
> Subject: [Israel.pm] Memory problem - Loading big files
> 
> Hello all,
> 
> I'm having problems loading big files into memory - maybe you could help
> me solve them.
> 
> My data file is a big (~250MB) text file, with eight tab-separated
> fields. I want to load the entire file into a list.

-- 
My Commodore 64 is suffering from slowness and insufficiency of memory;
and its display device is grievously short of pixels.  Can anyone help?
My own blog is at http://www.zak.co.il/tddpirate/

My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html



