[Israel.pm] utf-8 hebrew
omerz at actcom.co.il
Sun May 2 02:43:32 PDT 2004
On Sun, 2 May 2004, Gabor Szabo wrote:
> Hey Unicode wizards, here is a question for you:
> I got two strings:
> The first I got by reading a file using utf8 and
> The second I got from a browser via a CGI script.
> They are both supposed to be the same word.
> I got the above representation by Dumping their variables
> using Data::Dumper.
The first string is in UCS-2 encoding (each character is encoded in 16
bits; Unicode characters beyond U+FFFF would be encoded using surrogate
pairs, which strictly speaking makes the encoding UTF-16 [this is only my
guess, as there is no such character in the example]).
The second string is in UTF-8 encoding. But somehow you are using 16 bits
to represent each character?
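
The mismatch described above can be reproduced with a small sketch. It is
written in Python purely for illustration (the thread is about Perl, where
Encode::decode plays the corresponding role). The sample word is made up,
since the original strings are not shown in this message, and the names
from_file, from_browser and repair are hypothetical. The sketch assumes the
CGI value arrived as raw UTF-8 bytes that were never decoded, so each byte
ended up stored as its own character.

```python
# Hypothetical sample word: the original strings are not shown in the message.
word = "\u05e9\u05dc\u05d5\u05dd"  # four Hebrew letters

# Reading a file through a UTF-8 decoding layer yields one code point per letter.
from_file = word

# Assumption: the browser sent the same word as raw UTF-8 bytes.
raw_bytes = word.encode("utf-8")

# If those bytes are never decoded, each byte is treated as a separate character.
# Latin-1 maps every byte value 1:1 to a code point, which models that situation.
from_browser = raw_bytes.decode("latin-1")


def repair(mis_decoded: str) -> str:
    """Turn the byte-per-character string back into bytes, then decode as UTF-8."""
    return mis_decoded.encode("latin-1").decode("utf-8")


print(len(from_file), len(from_browser))   # character counts of the two strings
print(from_file == from_browser)           # the raw comparison fails
print(from_file == repair(from_browser))   # after decoding, the strings match
```

Under that assumption, decoding the CGI input as UTF-8 before comparing the
two strings should make them equal.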
My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html