[Israel.pm] Memory problem - Loading big files

Gaal Yahas gaal at forum2.org
Thu Jun 19 12:54:13 PDT 2008


There'll always be problem sizes at which C lets you do things Perl
won't.

Regarding the memory cost of Perl objects, I'd suggest looking at how
other dynamic languages fare when they load the data naively into
their native structures. The results may be interesting.

For your specific problem, under the performance requirements you
gave: assuming your records are roughly the same size and the keys
are distributed more or less uniformly, you need very few disk seeks.
Seek to a rough estimate of your next binary-search target, scan
forward to the next record separator, and go from there. I think
Tie::File may in fact let you do this with little effort. It looks
like less than a thousand seeks would cover the whole task, which
won't take very long. Of course, you never actually said the
distributions look like that, so this may turn out very badly :)
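A minimal sketch of that seek-based search, assuming the file is
sorted lexically on its first tab-separated field and keys compare as
strings (the filename and key below are made up):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Binary search over a sorted text file without loading it.
    # Assumes lines are sorted on the first tab-separated field.
    sub find_record {
        my ($file, $target) = @_;
        open my $fh, '<', $file or die "open $file: $!";
        my ($lo, $hi) = (0, -s $fh);

        # Narrow the window by probing the first full line after the
        # midpoint.
        while ($hi - $lo > 4096) {
            my $mid = int(($lo + $hi) / 2);
            seek $fh, $mid, 0 or die "seek: $!";
            <$fh>;                      # discard the partial line we landed in
            my $line = <$fh>;
            last unless defined $line;  # near EOF; fall through to the scan
            my ($key) = split /\t/, $line, 2;
            if ($key lt $target) { $lo = $mid } else { $hi = $mid }
        }

        # Scan the small remaining window sequentially.
        seek $fh, $lo, 0 or die "seek: $!";
        <$fh> if $lo > 0;               # resync to the next line start
        while (defined(my $line = <$fh>)) {
            my ($key) = split /\t/, $line, 2;
            if ($key eq $target) { chomp $line; return $line }
            last if $key gt $target;    # sorted, so we've gone past it
        }
        return;                         # not found
    }

    my $rec = find_record('data.txt', 'some_key');
    print defined $rec ? "$rec\n" : "not found\n";

One caveat on Tie::File, if I remember right: it presents the file as
an array of lines, which makes the search itself trivial to write, but
it discovers line offsets by reading through the file, so the first
probes may cost close to a sequential read of the whole 250MB. Raw
seek() avoids that.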

On Thu, Jun 19, 2008 at 10:44 PM, Shmuel Fomberg <semuelf at 012.net.il> wrote:
> Assaf Gordon wrote:
>
>> I'm having problems loading big files into memory - maybe you could help
>> me solve them.
>>
>> My data file is a big (~250MB) text file, with eight tab-separated
>> fields. I want to load the entire file into a list.
>
> Another idea: instead of loading the entire file, you can load "only"
> the list of file positions of every line (using tell()). When you
> search the file, you just seek to that location (seek()), read the
> line, and be off with it.
>
> Shmuel.
> _______________________________________________
> Perl mailing list
> Perl at perl.org.il
> http://perl.org.il/mailman/listinfo/perl
>
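For what it's worth, Shmuel's tell()/seek() index might look something
like this (a sketch; the filename is made up, and the index costs one
sequential pass plus one integer per line in memory):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # One pass to record where each line starts; after that, any line
    # can be fetched with a single seek.
    open my $fh, '<', 'data.txt' or die "open: $!";

    my @offset = (0);                    # line 0 starts at byte 0
    push @offset, tell $fh while <$fh>;
    pop @offset;                         # the last tell() is EOF, not a line

    # Random access by line number, without holding the data in memory.
    sub get_line {
        my ($n) = @_;
        seek $fh, $offset[$n], 0 or die "seek: $!";
        my $line = <$fh>;
        chomp $line;
        return $line;
    }

    print get_line(42), "\n";            # the 43rd line, say

If even the integer array is too heavy, the offsets can be packed into
a single string with pack("N*", @offset) and pulled out with unpack,
which brings the index down to four bytes per line.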



-- 
Gaal Yahas <gaal at forum2.org>
http://gaal.livejournal.com/
