[Israel.pm] Memory problem - Loading big files

Gabor Szabo szabgab at gmail.com
Thu Jun 19 04:45:50 PDT 2008


On Tue, Jun 17, 2008 at 2:35 AM, Assaf Gordon <gordon at cshl.edu> wrote:
> Hello all,
>
> I'm having problems loading big files into memory - maybe you could help
> me solve them.
>
> My data file is a big (~250MB) text file, with eight tab-separated
> fields. I want to load the entire file into a list.
>
> I've narrowed down the code into this:
> -------------
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Data::Dumper;
> use Devel::Size qw (size total_size);
>
> my @probes;
> while (<>) {
>        my @fields = split(/\s+/);
>        push @probes, \@fields;
> }
>
> print "size = ", size(\@probes),"\n";
> print "total size= ", total_size(\@probes),"\n";
> print "data size = ", total_size(\@probes)- size(\@probes),"\n";
> print Dumper(\@probes),"\n";
> ------------
> (Can't get any simpler than that, right?)
>
> But when I run the program, the perl process consumes 2.5GB of memory,
> prints "out of memory" and stops.
>
> I know that perl isn't the most efficient memory consumer, but surely
> there's a way to do it...
>
> If you care to test it yourselves, here's a simple script that creates a
> dummy text file, similar to my own data file:
> -----
> #!/usr/bin/perl
> foreach (1..2100000) { print join("\t", "LONG-TEXT-FIELD", 11111,
> 222222, 3333333, 44444444, 5555555, 6666666,
> "VERY-VERY-VERY-VERY-VERY-VERY-VERY-VERY-VERY-LONG-TEXT-FIELD" ),"\n" ; }
> -----

Perl variables do consume a lot of memory.
I ran a script similar to yours (just without creating the external file).
For me it used only 800 MB of memory with perl 5.8.8 on Ubuntu.
So I wonder if your actual file contains more lines (even if they are shorter)
or more fields or if you are using a different version of perl
or a different version of Devel::Size that might show different numbers.
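
To compare environments, a small sketch like the following could print both versions. It assumes nothing beyond core Perl; the sub name version_report is only illustrative, and Devel::Size is loaded inside an eval so the script still runs where the module is missing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Report the interpreter version and, if it is installed,
# the version of Devel::Size. (version_report is a made-up name.)
sub version_report {
    my @lines = (sprintf "perl version: %vd", $^V);
    if (eval { require Devel::Size; 1 }) {
        push @lines, "Devel::Size version: " . Devel::Size->VERSION;
    }
    else {
        push @lines, "Devel::Size is not installed";
    }
    return @lines;
}

print "$_\n" for version_report();
```

Running it on both machines and comparing the two lines of output should show whether the environments differ.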

In any case I don't think I would ever want to load such a file into memory.
What are you trying to achieve? Maybe you can do it line by line?
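
A minimal sketch of what line-by-line processing could look like. The actual task is not stated in this thread, so the example assumes a hypothetical one: count the records and sum the second field. The sub name summarize and the choice of field are illustrative only. Only one line and its fields are held in memory at any time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical task: count the records and sum the second
# tab-separated field, reading one line at a time.
sub summarize {
    my ($fh) = @_;
    my ($count, $sum) = (0, 0);
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line;
        next unless @fields >= 2;    # skip empty or short lines
        $count++;
        $sum += $fields[1];          # assumes field 2 is numeric
    }
    return ($count, $sum);
}

# Process the file named on the command line, if one was given.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my ($count, $sum) = summarize($fh);
    close $fh;
    print "records = $count, sum of field 2 = $sum\n";
}
```

Memory use stays roughly constant regardless of file size, because @fields is reused on every iteration instead of being pushed onto a growing list.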


Gabor

-- 
Gabor Szabo http://szabgab.com/blog.html
Perl Training in Israel http://www.pti.co.il/
Test Automation Tips http://szabgab.com/test_automation_tips.html


