[Perl] parsing XML with hebrew

Gil Klein gklein at barak-online.net
Mon Mar 17 01:42:36 PST 2003


 'not well-formed (invalid token)'
 This error has a number of possible causes. Here are some common ones:

Unquoted attributes
All attribute values must always be quoted in XML. For example, this would
be well formed:
<item name="widget"></item>

while this would not:

<item name=widget></item>
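
If you want to see for yourself which of the two forms the parser accepts, you can feed both to XML::Parser (the module named in the error message) and trap the exception. This is only a minimal sketch: the helper name xml_error is made up for illustration, and it assumes XML::Parser is installed.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns '' if $xml is well formed,
# otherwise the error message raised by the parser.
sub xml_error {
    my ($xml) = @_;
    my $parser = XML::Parser->new;
    # parse() dies on the first well-formedness error
    my $ok = eval { $parser->parse($xml); 1 };
    return $ok ? '' : $@;
}

for my $doc ('<item name="widget"></item>', '<item name=widget></item>') {
    my $err = xml_error($doc);
    print $err ? "BAD:  $doc\n" : "GOOD: $doc\n";
}
```

The message returned for a rejected document includes the line, column and byte offset, which is usually enough to find the offending attribute.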

Bad encoding declaration
An incorrect or missing encoding declaration can also cause this error.
Without a declaration the parser assumes UTF-8, so if your data is in another
encoding (say, ISO-8859-1) you must include an encoding declaration. For example:

<?xml version='1.0' encoding='iso-8859-1'?>
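
The effect of the declaration can be sketched roughly as follows. The first two documents carry the same ISO-8859-1 bytes, with and without the declaration. The third shows an alternative, which is to convert the data to UTF-8 before it goes into the XML. I am assuming here that the Hebrew text is ISO-8859-8; it could just as well be another encoding, in which case the name passed to decode() has to change. The helper parses_ok and the sample strings are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use XML::Parser;

# Hypothetical helper: true if XML::Parser accepts the document.
sub parses_ok {
    my ($xml) = @_;
    return eval { XML::Parser->new->parse($xml); 1 } ? 1 : 0;
}

# "cafe" with an e-acute as a single ISO-8859-1 byte (0xE9),
# which is not a valid UTF-8 sequence here.
my $latin1_body = "<item>caf\xE9</item>";
my $undeclared  = $latin1_body;
my $declared    = "<?xml version='1.0' encoding='iso-8859-1'?>\n" . $latin1_body;

# Assumption: the Hebrew field is ISO-8859-8 (these bytes spell "shalom").
# Decode to Perl characters, then encode as UTF-8, the parser's default.
my $hebrew_bytes = "\xF9\xEC\xE5\xED";
my $hebrew_utf8  = '<item>'
                 . encode('UTF-8', decode('iso-8859-8', $hebrew_bytes))
                 . '</item>';

printf "%-22s %s\n", 'undeclared latin-1:', parses_ok($undeclared)  ? 'ok' : 'failed';
printf "%-22s %s\n", 'declared latin-1:',   parses_ok($declared)    ? 'ok' : 'failed';
printf "%-22s %s\n", 'hebrew as utf-8:',    parses_ok($hebrew_utf8) ? 'ok' : 'failed';
```

Converting to UTF-8 avoids depending on which encodings the installed parser knows about, at the cost of one extra step when the files are first read.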

----- Original Message -----
From: "Gabor Szabo" <gabor at tracert.com>
To: <perl at perl.org.il>
Sent: Monday, March 17, 2003 11:09 AM
Subject: [Perl] parsing XML with hebrew

> I have several comma-delimited files that I transformed to XML
> using XML::Simple. All seemed to work well: XML::Simple could read back
> the data and I could work with it, until one of the comma-delimited
> files contained Hebrew text in one of the fields.
> It seems that the conversion to XML (using XML::Simple) still worked
> well, but when I tried to read it again using XML::Simple it failed with
> the following:
> not well-formed (invalid token) at line 327, column 37, byte 11286 at
> /usr/local/lib/perl5/site_perl/5.8.0/i686-linux/XML/Parser.pm line 185
> What shall I do now?
> Gabor
> _______________________________________________
> Perl mailing list
> Perl at perl.org.il
> http://www.perl.org.il/mailman/listinfo/perl
> YAPC::Israel::2003
> http://www.perl.org.il/YAPC/2003/
