[Israel.pm] utf-8 hebrew
Gabor Szabo
gabor at perl.org.il
Sun May 2 04:29:39 PDT 2004
Hey Unicode wizards, here is a question for you:
I got two strings:
"\x{5db}\x{5dc}\x{5d1}";
"\x{d7}\x{9b}\x{d7}\x{9c}\x{d7}\x{91}";
The first I got by reading a file using utf8, and
the second I got from a browser via a CGI script.
They are both supposed to be the same word.
I got the above representation by Dumping their variables
using Data::Dumper.
When I tried to compare them (using regex and index)
they seemed to be different.
When I applied
use Encode;
$x = decode("utf-8", STRING);
to the second string, it became equal to the first string,
so I thought maybe the second is not really utf-8.
But when I checked whether the originals are utf8
using utf8::is_utf8(STRING), both were reported to be utf8.
But then again it was at night...
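
Here is a minimal sketch of what I am seeing. It assumes the CGI value
arrived as raw UTF-8 octets that were later upgraded internally, which
I have not verified; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Three Hebrew characters, as read from the file.
my $chars = "\x{5db}\x{5dc}\x{5d1}";

# The same word as six UTF-8 octets, as received from the browser.
my $octets = "\xd7\x9b\xd7\x9c\xd7\x91";

# Assumption: something upgraded the octet string internally. This sets
# the internal UTF8 flag but still leaves six separate characters.
utf8::upgrade($octets);

print "chars length:  ", length($chars),  "\n";   # 3
print "octets length: ", length($octets), "\n";   # 6
print "both flagged\n"
    if utf8::is_utf8($chars) && utf8::is_utf8($octets);
print "different before decode\n" if $chars ne $octets;

# Decoding turns the six octets into the three characters they encode.
my $decoded = decode("UTF-8", $octets);
print "same after decode\n" if $chars eq $decoded;
```

If that assumption holds, utf8::is_utf8 only reports how Perl stores
the string internally, not whether it has already been decoded.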
So can someone explain to me why I got different representations,
and what these two representations are?
Thanks,
Gabor