[Israel.pm] making a csv file from xml data

Shlomi Fish shlomif at shlomifish.org
Sat Oct 27 04:09:51 PDT 2012


Hi Moshe,

Here are some comments on your code.

On Sat, 27 Oct 2012 12:59:12 +0200
moshe nahmias <moshegrey at ubuntu.com> wrote:

> Hi,
> 
> I need to convert data from xml to csv format, the data is in Hebrew.
> I tried to do it with extracting the data by my own code and it worked, but
> when i try to write it to a file every variable gets a new line, even when
> chomping the data right before printing it to a file.
>

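Maybe the lines end with \r\n? chomp only removes the trailing "\n" (the
value of $/), so in that case a "\r" would be left behind. You can use a
debugger (perl -d, etc.) to see what actually happens. See: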

http://perl-begin.org/topics/debugging/
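
If the input does turn out to have \r\n line endings, here is a minimal
sketch of the difference between chomp and an explicit substitution. The
sample string is made up for illustration, and I am only assuming that CRLF
is the cause:

```perl
use strict;
use warnings;

# Hypothetical sample line with a Windows-style (CRLF) line ending.
my $line = "Israel\r\n";

# chomp removes only the trailing "\n" (the default value of $/).
chomp $line;
my $after_chomp = $line;    # still ends with "\r"

# An explicit substitution removes either "\r\n" or a bare "\n".
( my $clean = "Israel\r\n" ) =~ s/\r?\n\z//;

printf "after chomp: %d chars\n", length $after_chomp;
printf "after s///:  %d chars\n", length $clean;
```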
 
> Here is the code:
> 
> use warnings;
> use strict;
> use LWP::UserAgent;
> use utf8;
> 
> open my $source, "<", "/home/moshe/perl/work/moreshet.xml" or die "can't
> read file 'moreshet': $! ";
> open my $file, ">", "/home/moshe/perl/work/file.txt" or die "can't write
> file 'file.txt': $! ";

You can use autodie here instead of writing "or die ..." after each open.
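
A rough sketch of what that could look like. The temporary file stands in
for your real paths, and the "does-not-exist" name is just there to trigger
a failure:

```perl
use strict;
use warnings;
use autodie;    # open, close, etc. now throw an exception on failure
use File::Temp qw(tempfile);

# A temporary file stands in for the real input/output paths.
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh;

# No "or die ..." needed: autodie reports failures by itself.
open my $out, '>', $path;
print {$out} "country,topic\n";
close $out;

open my $in, '<', $path;
my $first_line = <$in>;
close $in;

# A failing open throws an exception, which eval can catch.
my $ok = eval { open my $missing, '<', "${path}.does-not-exist"; 1 };
my $error = $ok ? '' : "$@";
print "open failed as expected: $error\n" if !$ok;
```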

> my $line;
> 
> my ( @links, $country, $topic, $description, $lang, $type, $mordesc,
> $level, $note, @items );
> 
> my $i = 1;
> #my $ua = LWP::UserAgent->new;
> #$ua->timeout(10);
> #$ua->env_proxy;
> 
> my @header = ( 'country', 'topic', 'description', 'language', 'type', 'sub
> description', 'level', 'note', 'item' );
> 
> while ( my $entry = <$source> ) {
>     chomp $entry;
>     if ( $entry =~ /<record_country>/ ) {
>         $entry =~ s/\s+\<record_country\>//;
>         $entry =~ s/\<\/record_country\>//;
>         $country = "$entry";
>     }
>     if ( $entry =~ /<record_topic>/ ) {
>         $entry =~ s/\s+\<record_topic\>//;
>         $entry =~ s/\<\/record_topic\>//;
>         $topic = "$entry";
>     }

Don't parse XML using regular expressions. See:

http://perl-begin.org/uses/xml/
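
For instance, here is a minimal sketch using XML::LibXML. It assumes the
module is installed from CPAN (it is not a core module), and the enclosing
<records> and <record> elements are my guess at the file's structure, since
I have only seen the record_country and record_topic tags:

```perl
use strict;
use warnings;

# Assumption: XML::LibXML is installed from CPAN; it is not a core module.
my $have_libxml = eval { require XML::LibXML; 1 };

# Hypothetical sample; the <records>/<record> wrappers are guessed.
my $xml = <<'XML';
<records>
  <record>
    <record_country>Israel</record_country>
    <record_topic>History</record_topic>
  </record>
</records>
XML

my @rows;
if ($have_libxml) {
    my $doc = XML::LibXML->load_xml( string => $xml );

    # One row per <record>, with the fields looked up by element name.
    for my $record ( $doc->findnodes('//record') ) {
        push @rows, [
            $record->findvalue('./record_country'),
            $record->findvalue('./record_topic'),
        ];
    }
}

print join( ',', @$_ ), "\n" for @rows;
```

For real CSV output a module such as Text::CSV would take care of quoting;
the join above is only good enough for this sample.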

Furthermore, in general, you can use a different delimiter for the s///
so that "/" no longer needs escaping, and "<" and ">" never need to be
escaped in a regex. So you get:

$entry =~ s{</record_topic>}{};

Or

$entry =~ s#</record_topic>##;
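
A quick sketch showing that the three spellings do the same thing (the
sample line is made up):

```perl
use strict;
use warnings;

# Hypothetical sample line in the shape of the input.
my $entry = '    <record_topic>History</record_topic>';

# Original style: "/" delimiter, so the "/" in the tag must be escaped.
( my $escaped = $entry ) =~ s/\<\/record_topic\>//;

# Alternative delimiters: nothing needs escaping.
( my $braces = $entry ) =~ s{</record_topic>}{};
( my $hashes = $entry ) =~ s#</record_topic>##;

print "same result\n" if $escaped eq $braces && $braces eq $hashes;
```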

Also, there is no need to write "$entry" if $entry is already a string;
plain $entry will do.

Regards,

	Shlomi Fish


-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Rethinking CPAN - http://shlom.in/rethinking-cpan

XSLT is the worst thing since non‐sliced bread.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

