[Israel.pm] making a csv file from xml data
Shlomi Fish
shlomif at shlomifish.org
Sat Oct 27 04:09:51 PDT 2012
Hi Moshe,
here are some comments on your code.
On Sat, 27 Oct 2012 12:59:12 +0200
moshe nahmias <moshegrey at ubuntu.com> wrote:
> Hi,
>
> I need to convert data from xml to csv format, the data is in Hebrew.
> I tried to do it with extracting the data by my own code and it worked, but
> when i try to write it to a file every variable gets a new line, even when
> chomping the data right before printing it to a file.
>
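Maybe the lines end with \r\n? By default chomp removes only the trailing
"\n", so on a Unix-like system a "\r" would stay in the data. You can use a
debugger (perl -d, etc.) to see what actually happens. See: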
http://perl-begin.org/topics/debugging/
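If you want a quick check without stepping through the debugger, something
along these lines may help. It is only a sketch: the sample string and the
strip_eol helper are made up for illustration, and I am assuming the input
has Windows-style line endings, which I have not verified.

```perl
use strict;
use warnings;
use Data::Dumper;

# Print control characters as escapes (\r, \n) instead of raw bytes.
$Data::Dumper::Useqq = 1;
$Data::Dumper::Terse = 1;

# Hypothetical helper: remove any trailing CR/LF characters.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;
    return $line;
}

# Made-up sample line with a CRLF ending.
my $entry = "Israel\r\n";

chomp( my $chomped = $entry );
print Dumper($chomped);               # the "\r" is still there
print Dumper( strip_eol($entry) );    # both characters are gone
```

If the first dump shows a trailing \r for your real data, stripping it as in
strip_eol before printing should be enough.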
> Here is the code:
>
> use warnings;
> use strict;
> use LWP::UserAgent;
> use utf8;
>
> open my $source, "<", "/home/moshe/perl/work/moreshet.xml" or die "can't
> read file 'moreshet': $! ";
> open my $file, ">", "/home/moshe/perl/work/file.txt" or die "can't write
> file 'file.txt': $! ";
You can use autodie here instead of writing "or die" after every open.
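A minimal sketch of what that could look like. The scratch directory stands
in for your real paths, and the :encoding(UTF-8) layers are my assumption
(since the data is in Hebrew), not something taken from your code:

```perl
use strict;
use warnings;
use autodie;                       # open and close now throw on failure
use File::Temp qw(tempdir);

# Hypothetical scratch directory standing in for the real file locations.
my $dir  = tempdir( CLEANUP => 1 );
my $path = "$dir/file.txt";

# No "or die" needed: autodie raises an exception with the reason.
open my $out, '>:encoding(UTF-8)', $path;
print {$out} "country,topic\n";
close $out;

open my $in, '<:encoding(UTF-8)', $path;
my $first = <$in>;
close $in;

print $first;
```

Note that autodie does not cover print, so a failed write usually shows up
when close is called.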
> my $line;
>
> my ( @links, $country, $topic, $description, $lang, $type, $mordesc,
> $level, $note, @items );
>
> my $i = 1;
> #my $ua = LWP::UserAgent->new;
> #$ua->timeout(10);
> #$ua->env_proxy;
>
> my @header = ( 'country', 'topic', 'description', 'language', 'type', 'sub
> description', 'level', 'note', 'item' );
>
> while ( my $entry = <$source> ) {
> chomp $entry;
> if ( $entry =~ /<record_country>/ ) {
> $entry =~ s/\s+\<record_country\>//;
> $entry =~ s/\<\/record_country\>//;
> $country = "$entry";
> }
> if ( $entry =~ /<record_topic>/ ) {
> $entry =~ s/\s+\<record_topic\>//;
> $entry =~ s/\<\/record_topic\>//;
> $topic = "$entry";
> }
Don't parse XML using regular expressions; use an XML parser module. See:
http://perl-begin.org/uses/xml/
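As a rough illustration, here is how the same extraction might look with
XML::LibXML from CPAN. I have not seen your file, so the <records> and
<record> wrapper elements and the sample values are assumptions; only
record_country and record_topic come from your code.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up document; the <records>/<record> wrappers are assumed.
my $xml = <<'XML';
<records>
  <record>
    <record_country>Israel</record_country>
    <record_topic>History &amp; culture</record_topic>
  </record>
  <record>
    <record_country>France</record_country>
    <record_topic>Art</record_topic>
  </record>
</records>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Collect one row per <record>; the parser handles entities and whitespace.
my @rows;
for my $record ( $doc->findnodes('//record') ) {
    push @rows, [
        $record->findvalue('./record_country'),
        $record->findvalue('./record_topic'),
    ];
}

print join( ' | ', @$_ ), "\n" for @rows;
```

For the real file you would pass location => $path to load_xml instead of a
string, and a CSV module would be safer than join for writing the output.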
Furthermore, in general, you can use a different delimiter for the s///,
which saves you from escaping the "/", and "<" and ">" do not need to be
escaped in a regex anyway. So you get:
$entry =~ s{</record_topic>}{};
Or
$entry =~ s#</record_topic>##;
Also, there is no need to write "$entry" if $entry is already a string;
plain $entry will do.
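Putting the two points together, a small self-contained sketch (the sample
line is made up, and this is only to show the syntax, not a recommendation
to keep parsing the XML this way):

```perl
use strict;
use warnings;

# Made-up input line in the shape the quoted code expects.
my $entry = '   <record_topic>History</record_topic>';

# Braces as delimiters: neither "<" nor "/" needs a backslash.
$entry =~ s{\s+<record_topic>}{};
$entry =~ s{</record_topic>}{};

# Plain assignment; no quotes around $entry.
my $topic = $entry;

print "$topic\n";
```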
Regards,
Shlomi Fish
--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
Rethinking CPAN - http://shlom.in/rethinking-cpan
XSLT is the worst thing since non‐sliced bread.
Please reply to list if it's a mailing list post - http://shlom.in/reply .