[Israel.pm] making a csv file from xml data

moshe nahmias moshegrey at ubuntu.com
Sat Oct 27 03:59:12 PDT 2012


Hi,

I need to convert data from XML to CSV format; the data is in Hebrew.
I extracted the data with my own code and that part worked, but
when I write it to a file every variable ends up on its own line, even though
I chomp the data right before printing it.

Here is the code:

use warnings;
use strict;
use LWP::UserAgent;
use utf8;

open my $source, "<", "/home/moshe/perl/work/moreshet.xml"
    or die "can't read file 'moreshet': $! ";
open my $file, ">", "/home/moshe/perl/work/file.txt"
    or die "can't write file 'file.txt': $! ";
my $line;

my ( @links, $country, $topic, $description, $lang, $type, $mordesc,
$level, $note, @items );

my $i = 1;
#my $ua = LWP::UserAgent->new;
#$ua->timeout(10);
#$ua->env_proxy;

my @header = ( 'country', 'topic', 'description', 'language', 'type',
    'sub description', 'level', 'note', 'item' );

while ( my $entry = <$source> ) {
    chomp $entry;
    if ( $entry =~ /<record_country>/ ) {
        $entry =~ s/\s+\<record_country\>//;
        $entry =~ s/\<\/record_country\>//;
        $country = "$entry";
    }
    if ( $entry =~ /<record_topic>/ ) {
        $entry =~ s/\s+\<record_topic\>//;
        $entry =~ s/\<\/record_topic\>//;
        $topic = "$entry";
    }
    if ( $entry =~ /<description>/ ) {
        $entry =~ s/\s+\<description\>//;
        $entry =~ s/\<\/description\>//;
        $description = "$entry";
    }
    if ( $entry =~ /<lang id="/ ) {
        $entry =~ s/\s+<lang id="\w\w">//;
        $entry =~ s/\<\/lang\>//;
        $lang = "$entry";
    }
    if ( $entry =~ /<type id="/ ) {
        $entry =~ s/\s+<type id="\w+">//;
        $entry =~ s/\<\/type\>//;
        $type = "$entry";
    }
    if ( $entry =~ /<sub_description>/ ) {
        $entry =~ s/\s+\<sub_description>//;
        $entry =~ s/\<\/sub_description\>//;
        $mordesc = "$entry";
    }
    if ( $entry =~ /<level>/ ) {
        $entry =~ s/\s+\<level>//;
        $entry =~ s/\<\/level\>//;
        $level = "$entry";
    }
    if ( $entry =~ /<note>/ ) {
        $entry =~ s/\s+\<note>//;
        $entry =~ s/\<\/note\>//;
        $note = "$entry";
    }
    if ( $entry =~ /\<item/ ) {
        $entry =~ s/\s+\<item id\=\"\d?\d\" source\=\"//;
        $entry =~ s/\<\/item\>//;
        my @item = split /\"\>/, $entry;
#        my $response = $ua->get("$entry");
        push @links, $item[0];  # or whatever
        push @items, $item[1];
    }
    if ( $entry =~ /\<\/record/ ) {
        chomp $country;
        chomp $topic;
        chomp $description;
        chomp $lang;
        chomp $type;
        chomp $mordesc;
        chomp $level;
        print $file
            "$country,$topic,$description,$lang,$type,$mordesc,$level,@items,@links\n";
        print "$country,$topic,$description,$lang,$type,$mordesc,$level,",
@items, ",", @links, "\n";
        @links = ();    # empty the arrays for the next record
        @items = ();
    }
}

close $file;

Do you have any idea why this is not working?

Moshe