[Israel.pm] utf-8 hebrew

Shlomo Yona shlomo at cs.haifa.ac.il
Sun May 2 00:40:51 PDT 2004

On Sun, 2 May 2004, Gabor Szabo wrote:

> Hey Unicode wizards here is a question to you:
> I got two strings:
> "\x{5db}\x{5dc}\x{5d1}";

The first looks like Hebrew: the word "dog".

> "\x{d7}\x{9b}\x{d7}\x{9c}\x{d7}\x{91}";

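These do not look like Hebrew Unicode characters. They look
like the UTF-8 bytes of the same word (U+05DB is encoded as
0xD7 0x9B, and so on), with each byte taken as a character.

I suspect that somewhere along the way you are treating the
Unicode string as a plain string of bytes, and so you later
end up with the second (bad?) representation.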
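
A minimal sketch of how the two strings relate, assuming a Perl
with the core Encode module (5.8 or later). The variable names
$chars and $bytes are only for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars = "\x{5db}\x{5dc}\x{5d1}";                 # three Hebrew characters
my $bytes = "\x{d7}\x{9b}\x{d7}\x{9c}\x{d7}\x{91}";  # six byte-sized values

# Encoding the character string as UTF-8 yields the byte string ...
print "encode matches\n" if encode('UTF-8', $chars) eq $bytes;

# ... and decoding the bytes as UTF-8 recovers the characters.
print "decode matches\n" if decode('UTF-8', $bytes) eq $chars;

printf "%d characters vs. %d bytes\n", length($chars), length($bytes);
```

If both checks print, the second string is simply the first one
after an encode step that was never undone.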

Can you force your code, at every stage, to treat the
strings as UTF-8 strings?
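
One way this might look, as a sketch only: decode on the way in,
work on characters, encode on the way out. The $raw value here is
a stand-in for whatever bytes your program actually receives, and
I am assuming Perl 5.8 or later with the Encode module:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: raw UTF-8 bytes as they might arrive from a file or socket.
my $raw = "\x{d7}\x{9b}\x{d7}\x{9c}\x{d7}\x{91}";

# On the way in: decode, so Perl sees characters rather than bytes.
my $text = decode('UTF-8', $raw);

# In the middle: string functions now count characters.
my $count = length($text);

# On the way out: let an I/O layer encode the characters back to UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "$text has $count characters\n";
```

For data read from a file, an ':encoding(UTF-8)' layer in the
open() mode can do the decoding step for you.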

Shlomo Yona
shlomo at cs.haifa.ac.il

